nvflare.app_common.workflows.base_fedavg module

class BaseFedAvg(*args, **kwargs)

Bases: ExperimentalClass

The base controller for FedAvg Workflow. Note: This class is based on the experimental ModelController.

Implements [FederatedAveraging](https://arxiv.org/abs/1602.05629). The model persistor (persistor_id) loads the initial global model, which is sent to a list of clients. Each client sends its updated weights after local training, and these are aggregated. The global model is then updated from the aggregated result. The model persistor also saves the model after training.

Provides default implementations for the following routines:
  • def sample_clients(self, min_clients)

  • def aggregate(self, results: List[FLModel], aggregate_fn=None) -> FLModel

  • def update_model(self, aggr_result)

The run routine needs to be implemented by the derived class:

  • def run(self)
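How the routines listed above fit together in a run loop can be sketched in plain Python. This is a toy stand-in and not the NVFlare API: `ToyFedAvg`, `train_on_client`, the client names, and the dict-based weights are all hypothetical, and client training is simulated locally instead of being sent as a task.

```python
from typing import Dict, List

Weights = Dict[str, float]


class ToyFedAvg:
    """Hypothetical stand-in controller; it does not derive from BaseFedAvg."""

    def __init__(self, clients: List[str], num_rounds: int = 5):
        self.clients = clients
        self.num_rounds = num_rounds
        self.model: Weights = {"w": 0.0}  # current global model

    def sample_clients(self, min_clients: int) -> List[str]:
        # Return the first min_clients available clients.
        return self.clients[:min_clients]

    def train_on_client(self, client: str, model: Weights) -> Weights:
        # Simulated local training (assumption): each client shifts w by a fixed step.
        step = 1.0 if client == "site-1" else 3.0
        return {"w": model["w"] + step}

    def aggregate(self, results: List[Weights]) -> Weights:
        # Unweighted mean of every parameter across the client results.
        return {k: sum(r[k] for r in results) / len(results) for k in results[0]}

    def update_model(self, aggr_result: Weights) -> None:
        # Replace the global model with the aggregated result; returns None.
        self.model = aggr_result

    def run(self) -> None:
        # One round: sample clients, collect results, aggregate, update.
        for _ in range(self.num_rounds):
            clients = self.sample_clients(len(self.clients))
            results = [self.train_on_client(c, self.model) for c in clients]
            self.update_model(self.aggregate(results))


controller = ToyFedAvg(clients=["site-1", "site-2"], num_rounds=3)
controller.run()
```

In a real derived class, `run` would call the inherited routines and the controller's own facilities for sending the model to clients; only the ordering of the steps is illustrated here.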

FLModel-based controller.

Parameters:
  • min_clients (int, optional) – The minimum number of client responses required before the workflow starts to wait for wait_time_after_min_received. Note that the workflow moves forward once all available clients have responded, regardless of this value. Defaults to 1000.

  • num_rounds (int, optional) – The total number of training rounds. Defaults to 5.

  • persistor_id (str, optional) – ID of the persistor component. Defaults to “persistor”.

  • ignore_result_error (bool, optional) – Whether this controller can proceed if a client result has errors. Defaults to False.

  • allow_empty_global_weights (bool, optional) – Whether to allow empty global weights. Some pipelines have empty global weights in the first round, so that clients start training from scratch without any global information. Defaults to False.

  • task_check_period (float, optional) – Interval for checking the status of tasks. Defaults to 0.5.

  • persist_every_n_rounds (int, optional) – Persist the global model every n rounds. If n is 0, the model is not persisted. Defaults to 1.
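One plausible reading of the persist_every_n_rounds rule is sketched below. The helper `should_persist` is hypothetical, and the zero-based round index is an assumption; the actual controller may count rounds differently or apply extra conditions, such as the save after training mentioned above.

```python
def should_persist(current_round: int, persist_every_n_rounds: int) -> bool:
    """Decide whether to persist after a round (current_round is zero-based, an assumption)."""
    if persist_every_n_rounds == 0:
        # n == 0 disables persistence.
        return False
    # Persist after every n-th completed round.
    return (current_round + 1) % persist_every_n_rounds == 0


# Rounds (zero-based) after which a 5-round job would persist, for several n.
every_round = [r for r in range(5) if should_persist(r, 1)]
every_second = [r for r in range(5) if should_persist(r, 2)]
never = [r for r in range(5) if should_persist(r, 0)]
```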

aggregate(results: List[FLModel], aggregate_fn=None) → FLModel

Called by the run routine to aggregate the training results of clients.

Parameters:
  • results – a list of FLModel containing training results of the clients.

  • aggregate_fn – a function that turns the list of FLModel into one resulting (aggregated) FLModel.

Returns: aggregated FLModel.
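As an illustration of what an aggregate_fn might look like, the sketch below computes a sample-weighted average, the weighting used in the FederatedAveraging paper. `ToyModel` is a hypothetical stand-in for FLModel, and the `num_samples` metadata key is an assumption, so check the FLModel fields and metadata keys in your NVFlare version before adapting it.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyModel:
    """Hypothetical stand-in for FLModel: parameters plus metadata."""
    params: Dict[str, float]
    meta: Dict[str, float] = field(default_factory=dict)


def weighted_average(results: List[ToyModel]) -> ToyModel:
    """Turn a list of client results into one sample-weighted model."""
    if not results:
        raise ValueError("no client results to aggregate")
    # Clients without a sample count get weight 1 (assumption).
    total = sum(r.meta.get("num_samples", 1.0) for r in results)
    averaged = {
        name: sum(r.params[name] * r.meta.get("num_samples", 1.0) for r in results) / total
        for name in results[0].params
    }
    return ToyModel(params=averaged, meta={"num_samples": total})


results = [
    ToyModel(params={"w": 1.0}, meta={"num_samples": 10}),
    ToyModel(params={"w": 4.0}, meta={"num_samples": 30}),
]
aggregated = weighted_average(results)
```

The client with 30 samples pulls the result three times as strongly as the client with 10, which is the main difference from an unweighted mean.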

sample_clients(min_clients)

Called by the run routine to get a list of available clients.

Parameters:
  • min_clients – number of clients to return.

Returns: list of clients.

update_model(aggr_result)

Called by the run routine to update the current global model (self.model) given the aggregated result.

Parameters:
  • aggr_result – aggregated FLModel.

Returns: None.