nvflare.app_opt.sklearn package
Module contents
- class KMeansFedAvgRecipe(*, name: str = 'kmeans_fedavg', min_clients: int, num_rounds: int = 5, n_clusters: int = 3, train_script: str, train_args: str | Dict[str, str] = '', launch_external_process: bool = False, command: str = 'python3 -u')
Bases: Recipe
A recipe for Federated K-Means Clustering with Scikit-learn.
This recipe implements federated K-Means clustering using a mini-batch aggregation strategy. The aggregation follows the scheme defined in MiniBatchKMeans where each client’s results are treated as a mini-batch for updating global centers.
The recipe configures:
- A federated job with initial n_clusters parameter
- Scatter-and-gather controller for coordinating training rounds
- Custom KMeansAssembler for mini-batch center aggregation
- CollectAndAssembleAggregator for combining client updates
- Script runners for client-side training execution
Training Process:
- Round 0: Each client generates initial centers using k-means++. The server collects all initial centers and performs one round of k-means to generate the initial global centers.
- Subsequent rounds: Each client trains a local MiniBatchKMeans model starting from the global centers. The server aggregates center and count information to update the global centers using the mini-batch update rule.
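The server-side update in the subsequent rounds can be sketched as a count-weighted running mean per cluster. This is a minimal illustration of the idea only, not the actual KMeansAssembler code: the function name `aggregate_centers` and the `(centers, counts)` tuple layout are assumptions made for this example.

```python
import numpy as np


def aggregate_centers(global_centers, global_counts, client_results):
    """Fold each client's (centers, counts) pair into the global centers.

    Hypothetical helper: every client is treated as one mini-batch, and each
    cluster center becomes the count-weighted mean of all contributions,
    including the history accumulated in earlier rounds.
    """
    global_centers = np.asarray(global_centers, dtype=float)
    new_counts = np.asarray(global_counts, dtype=float).copy()
    weighted_sum = global_centers * new_counts[:, None]

    for centers, counts in client_results:
        centers = np.asarray(centers, dtype=float)
        counts = np.asarray(counts, dtype=float)
        weighted_sum += centers * counts[:, None]
        new_counts += counts

    # Clusters that have never received a sample keep their previous center.
    safe_counts = np.maximum(new_counts, 1.0)[:, None]
    updated = np.where(new_counts[:, None] > 0, weighted_sum / safe_counts, global_centers)
    return updated, new_counts


# Two clusters in 2-D; the server has already seen 2 samples for cluster 0.
centers, counts = aggregate_centers(
    global_centers=[[0.0, 0.0], [10.0, 10.0]],
    global_counts=[2, 0],
    client_results=[
        ([[2.0, 2.0], [8.0, 8.0]], [2, 4]),
        ([[4.0, 4.0], [12.0, 12.0]], [4, 4]),
    ],
)
print(centers)
print(counts)
```

Carrying the counts forward between rounds is what lets later updates be weighted against the accumulated history rather than overwriting it.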
- Parameters:
name – Name of the federated learning job. Defaults to “kmeans_fedavg”.
min_clients – Minimum number of clients required to start a training round.
num_rounds – Number of federated training rounds to execute. Defaults to 5.
n_clusters – Number of clusters for K-Means. Defaults to 3.
train_script – Path to the training script that will be executed on each client.
train_args – Command line arguments to pass to the training script. Can be:
  - str: Same arguments for all clients (uses job.to_clients)
  - dict[str, str]: Per-client arguments mapping site names to args (uses job.to per site)
launch_external_process – Whether to launch the script in an external process. Defaults to False.
command – Command used to run the script when launch_external_process=True (prepended to the script path). Defaults to “python3 -u”.
Example
```python
from nvflare.app_opt.sklearn import KMeansFedAvgRecipe
from nvflare.recipe import SimEnv

recipe = KMeansFedAvgRecipe(
    name="kmeans_iris",
    min_clients=3,
    num_rounds=5,
    n_clusters=3,
    train_script="src/kmeans_fl.py",
    train_args="--data_path /tmp/data/iris.csv --train_start 0 --train_end 50",
)

# Run the job in a local simulation with three clients.
env = SimEnv(num_clients=3)
run = recipe.execute(env)
print("Result:", run.get_result())
```
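The dict form of train_args gives each site its own arguments, for example a disjoint slice of a shared data file. The sketch below only builds such a mapping. The helper `per_site_args` is hypothetical, and the site names `site-1` to `site-3` are an assumption, so substitute the client names used in your environment.

```python
def per_site_args(data_path, rows_per_site, site_names):
    """Build a {site_name: args} mapping with a disjoint row range per site.

    Hypothetical helper; the flag names mirror the example arguments above.
    """
    args = {}
    for i, site in enumerate(site_names):
        start = i * rows_per_site
        end = start + rows_per_site
        args[site] = f"--data_path {data_path} --train_start {start} --train_end {end}"
    return args


# Assumed site names; they must match the clients that take part in the job.
train_args = per_site_args("/tmp/data/iris.csv", 50, ["site-1", "site-2", "site-3"])
for site, site_args in train_args.items():
    print(site, "->", site_args)
```

The resulting dictionary can then be passed as `train_args=train_args` in place of the single string.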
Note
This recipe uses a custom KMeansAssembler that implements the mini-batch K-Means aggregation logic. The assembler maintains historical center and count information across rounds for proper weighted averaging.
This is the base class of a recipe. Recipes are implemented by jobs. A concrete recipe must provide the job that implements it.
- Parameters:
job – the job that implements the recipe.
- class SVMFedAvgRecipe(*, name: str = 'svm_fedavg', min_clients: int, kernel: str = 'rbf', train_script: str, train_args: str | Dict[str, str] = '', backend: str = 'sklearn', launch_external_process: bool = False, command: str = 'python3 -u')
Bases: Recipe
A recipe for Federated SVM with Scikit-learn.
This recipe implements federated SVM training using support vector aggregation. Unlike iterative algorithms, SVM training requires only one training round, followed by a validation round:
- Round 0: Each client trains a local SVM and sends its support vectors.
- The server aggregates all support vectors and trains a global SVM.
- Round 1: Clients validate using the global support vectors.
The recipe configures:
- A federated job with kernel parameter
- Scatter-and-gather controller (2 rounds)
- Custom SVMAssembler for support vector aggregation
- CollectAndAssembleAggregator for combining client updates
- Script runners for client-side training execution
Training Process:
- Round 0 (Training): Each client trains a local SVM on its data and extracts support vectors. The server collects all support vectors, trains a global SVM, and extracts the global support vectors.
- Round 1 (Validation): Each client validates using the global support vectors.
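The training round described above can be sketched with plain scikit-learn. This is an illustrative approximation under stated assumptions, not the SVMAssembler implementation: the function names, the toy data, and the choice to send labels alongside the support vectors are all hypothetical.

```python
import numpy as np
from sklearn.svm import SVC


def local_support_vectors(X, y, kernel="rbf"):
    """Client side: fit a local SVM and keep only its support vectors."""
    clf = SVC(kernel=kernel).fit(X, y)
    idx = clf.support_  # indices of the support vectors within X
    return X[idx], y[idx]


def aggregate_support_vectors(client_results, kernel="rbf"):
    """Server side: pool the clients' support vectors and fit a global SVM."""
    X = np.vstack([sv for sv, _ in client_results])
    y = np.concatenate([labels for _, labels in client_results])
    global_svm = SVC(kernel=kernel).fit(X, y)
    idx = global_svm.support_
    return global_svm, X[idx], y[idx]


rng = np.random.default_rng(0)


def make_client_data(n):
    """Toy data: two well-separated Gaussian blobs, one per class."""
    X = np.vstack([rng.normal(-3.0, 1.0, (n, 2)), rng.normal(3.0, 1.0, (n, 2))])
    y = np.array([0] * n + [1] * n)
    return X, y


clients = [make_client_data(50) for _ in range(2)]
results = [local_support_vectors(X, y) for X, y in clients]
global_svm, sv_X, sv_y = aggregate_support_vectors(results)
print(len(sv_X), "global support vectors")
```

In the validation round, each client would score its local data against a model built from the global support vectors. Here `global_svm.predict` stands in for that step.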
- Parameters:
name – Name of the federated learning job. Defaults to “svm_fedavg”.
min_clients – Minimum number of clients required to start a training round.
kernel – Kernel type for SVM. Options: ‘linear’, ‘poly’, ‘rbf’, ‘sigmoid’. Defaults to ‘rbf’.
train_script – Path to the training script that will be executed on each client.
train_args – Command line arguments to pass to the training script. Can be:
  - str: Same arguments for all clients (uses job.to_clients)
  - dict[str, str]: Per-client arguments mapping site names to args (uses job.to per site)
backend – Backend library to use (‘sklearn’ or ‘cuml’). Defaults to ‘sklearn’.
launch_external_process – Whether to launch the script in an external process. Defaults to False.
command – Command used to run the script when launch_external_process=True (prepended to the script path). Defaults to “python3 -u”.
Example
```python
from nvflare.app_opt.sklearn import SVMFedAvgRecipe
from nvflare.recipe import SimEnv

recipe = SVMFedAvgRecipe(
    name="svm_cancer",
    min_clients=3,
    kernel="rbf",
    train_script="client.py",
    train_args="--data_path /tmp/data/cancer.csv --train_start 0 --train_end 100",
)

# Run the job in a local simulation with three clients.
env = SimEnv(num_clients=3)
run = recipe.execute(env)
print("Result:", run.get_result())
```
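On the client side, the string given in train_args arrives as ordinary command-line arguments of the training script. A minimal sketch of how a script such as client.py might parse them is shown below. The parser is hypothetical: the flag names are taken from the example, but the types and defaults are assumptions.

```python
import argparse
import shlex


def parse_train_args(argv=None):
    """Parse the flags used in the example train_args string."""
    parser = argparse.ArgumentParser(description="Hypothetical client training script")
    parser.add_argument("--data_path", type=str, required=True, help="CSV file to load")
    parser.add_argument("--train_start", type=int, default=0, help="first training row (assumed default)")
    parser.add_argument("--train_end", type=int, required=True, help="row after the last training row")
    return parser.parse_args(argv)


# Simulate the command line built from the train_args string.
args = parse_train_args(
    shlex.split("--data_path /tmp/data/cancer.csv --train_start 0 --train_end 100")
)
print(args.data_path, args.train_start, args.train_end)
```

Inside the real script, calling `parse_train_args()` with no argument reads `sys.argv` as usual.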
Note
This recipe uses a custom SVMAssembler that implements support vector aggregation. The training only requires one round since SVM is not an iterative algorithm in the federated setting. A second round is included for validation purposes.
- class SklearnFedAvgRecipe(*, name: str = 'sklearn_fedavg', min_clients: int, num_rounds: int = 2, model_params: dict | None = None, train_script: str, train_args: str | Dict[str, str] = '', aggregator: Aggregator | None = None, aggregator_data_kind: DataKind = DataKind.WEIGHTS, launch_external_process: bool = False, command: str = 'python3 -u')
Bases: Recipe
A recipe for implementing Federated Averaging (FedAvg) with Scikit-learn.
This recipe sets up a complete federated learning workflow with a scatter-and-gather communication pattern designed specifically for scikit-learn models.
The recipe configures:
- A federated job with initial parameters
- Scatter-and-gather controller for coordinating training rounds
- Weighted aggregator for combining client model updates (or custom aggregator)
- Script runners for client-side training execution
- Parameters:
name – Name of the federated learning job. Defaults to “sklearn_fedavg”.
min_clients – Minimum number of clients required to start a training round.
num_rounds – Number of federated training rounds to execute. Defaults to 2.
model_params – Model hyperparameters as a dictionary. For SGDClassifier, can include: n_classes, learning_rate, eta0, loss, penalty, fit_intercept, etc. Can also include initial weights if needed.
train_script – Path to the training script that will be executed on each client.
train_args – Command line arguments to pass to the training script. Can be:
  - str: Same arguments for all clients (uses job.to_clients)
  - dict[str, str]: Per-client arguments mapping site names to args (uses job.to per site)
aggregator – Custom aggregator for combining client updates. If None, uses InTimeAccumulateWeightedAggregator with aggregator_data_kind.
aggregator_data_kind – Data kind to use for the aggregator. Defaults to DataKind.WEIGHTS.
launch_external_process – Whether to launch the script in an external process. Defaults to False.
command – Command used to run the script when launch_external_process=True (prepended to the script path). Defaults to “python3 -u”.
Example
```python
from nvflare.app_opt.sklearn import SklearnFedAvgRecipe
from nvflare.recipe import SimEnv

recipe = SklearnFedAvgRecipe(
    name="sklearn_linear",
    min_clients=5,
    num_rounds=50,
    model_params={
        "n_classes": 2,
        "learning_rate": "constant",
        "eta0": 1e-4,
        "loss": "log_loss",
        "penalty": "l2",
        "fit_intercept": 1,
    },
    train_script="client.py",
    train_args="--data_path /tmp/data/HIGGS.csv",
)

# Run the job in a local simulation with five clients.
env = SimEnv(num_clients=5)
run = recipe.execute(env)
print("Result:", run.get_result())
```
Note
By default, this recipe implements the standard FedAvg algorithm where model updates are aggregated using weighted averaging based on the number of training samples provided by each client.
If you want to use a custom aggregator, you can pass it in the aggregator parameter. The custom aggregator must be a subclass of the Aggregator class.
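The default weighted averaging can be illustrated with a few lines of NumPy. This is a sketch of the arithmetic only, not the InTimeAccumulateWeightedAggregator implementation: the function name `fedavg` and the weight keys `coef` and `intercept` are assumptions chosen to resemble a linear scikit-learn model.

```python
import numpy as np


def fedavg(client_updates):
    """Average client weights, weighting each client by its sample count.

    client_updates: list of (weights_dict, n_samples) pairs, where every
    weights_dict has the same keys and array shapes.
    """
    total = sum(n for _, n in client_updates)
    if total <= 0:
        raise ValueError("total number of training samples must be positive")
    keys = client_updates[0][0].keys()
    return {
        key: sum(np.asarray(weights[key], dtype=float) * n for weights, n in client_updates) / total
        for key in keys
    }


# Two clients; the second holds three times as much data as the first.
updates = [
    ({"coef": [1.0, 2.0], "intercept": [0.0]}, 100),
    ({"coef": [3.0, 6.0], "intercept": [1.0]}, 300),
]
global_weights = fedavg(updates)
print(global_weights)
```

Because the second client contributes 300 of the 400 samples, the averaged weights sit three quarters of the way toward its values.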