nvflare.edge.tools.et_fed_buff_recipe module
- class ETFedBuffRecipe(job_name: str, device_model: DeviceModel, input_shape, output_shape, model_manager_config: ModelManagerConfig, device_manager_config: DeviceManagerConfig, initial_ckpt: str | None = None, evaluator_config: EvaluatorConfig | None = None, simulation_config: SimulationConfig | None = None, device_training_params: Dict | None = None, custom_source_root: str | None = None, device_wait_timeout: float | None = None)
Bases: EdgeFedBuffRecipe
Edge Training FedBuff Recipe for embedded/edge device training.
This recipe extends EdgeFedBuffRecipe for edge devices by using a DeviceModel wrapper.
- Parameters:
job_name – Name of the federated learning job.
device_model – DeviceModel wrapping the PyTorch model for edge devices.
input_shape – Input shape for the model.
output_shape – Output shape for the model.
model_manager_config – Configuration for the model manager.
device_manager_config – Configuration for the device manager.
initial_ckpt – Absolute path to a pre-trained checkpoint file (.pt, .pth). This is a server-side path, so the file does not need to exist locally.
evaluator_config – Configuration for the global evaluator (optional).
simulation_config – Configuration for simulated device settings (optional).
device_training_params – Training parameters for the device (optional).
custom_source_root – Path to custom source code (optional).
device_wait_timeout – Timeout in seconds for waiting for sufficient devices to join before stopping the job. None means wait indefinitely. WARNING: when device_reuse=False with a finite device pool, leaving this as None can cause the job to hang indefinitely once the pool is exhausted. In that case, set an explicit timeout (e.g., 300.0 seconds). Default: None
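Two of the parameters above carry constraints that can be checked before a recipe is constructed: initial_ckpt must be an absolute path ending in .pt or .pth, and device_wait_timeout should be finite when devices are not reused and the pool is finite. The following is a minimal sketch of such pre-flight checks. The helper names check_initial_ckpt and choose_device_wait_timeout, the finite_pool flag, and the assumption that the server uses POSIX-style paths are all hypothetical and not part of NVFlare; the 300.0 second fallback is taken from the example in the parameter description.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical helpers for illustration only; they are not part of NVFlare.


def check_initial_ckpt(path: Optional[str]) -> Optional[str]:
    """Check the documented constraints on initial_ckpt.

    The filesystem is never touched, because the path refers to the
    server and the file does not need to exist locally.
    """
    if path is None:
        return None
    # Assumption: the server uses POSIX-style paths.
    p = PurePosixPath(path)
    if not p.is_absolute():
        raise ValueError(f"initial_ckpt must be an absolute path, got {path!r}")
    if p.suffix not in (".pt", ".pth"):
        raise ValueError(f"initial_ckpt should end in .pt or .pth, got {path!r}")
    return path


def choose_device_wait_timeout(
    device_reuse: bool,
    finite_pool: bool,
    requested: Optional[float] = None,
    fallback: float = 300.0,
) -> Optional[float]:
    """Pick a value for device_wait_timeout.

    An explicit request always wins. Otherwise a finite fallback is used
    when devices are not reused and the pool is finite, since waiting
    indefinitely could hang the job once the pool is exhausted.
    """
    if requested is not None:
        if requested <= 0:
            raise ValueError("device_wait_timeout must be positive")
        return requested
    if not device_reuse and finite_pool:
        return fallback
    return None  # wait indefinitely


if __name__ == "__main__":
    print(check_initial_ckpt("/models/init.pt"))
    print(choose_device_wait_timeout(device_reuse=False, finite_pool=True))
```

The returned values can then be passed as the initial_ckpt and device_wait_timeout arguments of ETFedBuffRecipe. Where device_reuse is configured is not stated in this section, so the sketch takes it as a plain boolean supplied by the caller.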
This is the base class of a recipe. Recipes are implemented by jobs. A concrete recipe must provide the job that implements it.
- Parameters:
job – the job that implements the recipe.