Available Recipes
NVFlare provides a variety of pre-built recipes for common federated learning algorithms and workflows. Recipes are high-level, declarative APIs that simplify job configuration and execution.
Common Recipe Parameters
Most training recipes accept the following model-related parameters:
model
The model to use for federated training. Accepts:
Class instance: e.g., MyModel(). Convenient and Pythonic.
Dict config: e.g., {"class_path": "module.MyModel", "args": {"param": value}}. Better for large models.
Note
Class instances are converted to configuration files before job submission. For large models, use dict config to avoid unnecessary instantiation overhead. For TensorFlow/Keras, class instances should be user-defined subclassed models (for example, tf.keras.Model or tf.keras.Sequential subclasses).
initial_ckpt
Absolute path to a pre-trained checkpoint file. The file does not need to exist locally, but it must exist on the server when the model is loaded during job execution.
PyTorch: Requires model for the architecture (the checkpoint contains weights only).
TensorFlow/Keras: Can use initial_ckpt alone (Keras saves the full model). If model is provided, use a subclassed Keras class instance or dict config.
enable_tensor_disk_offload (PyTorch FedAvg recipes)
Controls where streamed PyTorch tensors are materialized during server-side aggregation.
False (default): materialize in memory.
True: materialize to temporary safetensors files and consume them through lazy refs to reduce peak memory.
Warning
Temporary files use the server process temp directory (TMPDIR, or an OS default such as /tmp). The server IT setup must point this to a writable, disk-backed mount. In containers or Kubernetes, /tmp may be RAM-backed, which prevents memory offload benefits. See Starting Federated Learning Servers.
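The warning above can be checked before enabling disk offload. The following is a minimal sketch using only the Python standard library, not an NVFlare utility; the helper names describe_temp_dir and is_ram_backed are hypothetical, and the mount-table check assumes a Linux host with /proc/mounts.

```python
import os
import tempfile


def describe_temp_dir():
    """Return the temp directory Python resolves and whether it is writable."""
    tmp_dir = tempfile.gettempdir()  # honors TMPDIR, then falls back to OS defaults
    writable = os.access(tmp_dir, os.W_OK)
    return tmp_dir, writable


def is_ram_backed(path, mounts_file="/proc/mounts"):
    """Best-effort Linux check: is the closest enclosing mount tmpfs or ramfs?

    Returns None when the mount table cannot be read (for example on macOS or Windows).
    """
    try:
        with open(mounts_file) as f:
            entries = [line.split() for line in f]
    except OSError:
        return None
    real = os.path.realpath(path)
    best = None  # (mount_point, fs_type) of the longest matching mount point
    for fields in entries:
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if real == mount_point or real.startswith(prefix):
            if best is None or len(mount_point) > len(best[0]):
                best = (mount_point, fs_type)
    if best is None:
        return None
    return best[1] in ("tmpfs", "ramfs")


tmp_dir, writable = describe_temp_dir()
ram_backed = is_ram_backed(tmp_dir)
```

If ram_backed is True, pointing TMPDIR at a disk-backed mount before starting the server process is the kind of setup change the warning describes.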
See NVFlare Job Recipe for detailed explanations of these options.
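To illustrate why the dict form of model avoids instantiation overhead, here is a minimal sketch of how a {"class_path": ..., "args": ...} config could be resolved into an object at load time. This is an illustration only: instantiate_from_config is a hypothetical helper, not NVFlare's loader, and collections.Counter stands in for a real model class.

```python
import importlib


def instantiate_from_config(config):
    """Build an object from {"class_path": "pkg.mod.Class", "args": {...}}."""
    module_name, _, class_name = config["class_path"].rpartition(".")
    module = importlib.import_module(module_name)
    cls = getattr(module, class_name)
    return cls(**config.get("args", {}))


# The config is plain data: nothing is imported or constructed until it is resolved.
config = {"class_path": "collections.Counter", "args": {"a": 2, "b": 1}}
obj = instantiate_from_config(config)
```

Because the dict carries only a path and keyword arguments, the submitting machine never has to allocate the model's parameters.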
Federated Averaging (FedAvg)
FedAvg is the most fundamental federated learning algorithm: it aggregates model updates from multiple clients by computing a weighted average.
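The weighted average can be sketched in a few lines of plain Python. This is a conceptual illustration rather than NVFlare's aggregator; weighting each client by its number of training samples is an assumption, and weighted_average is a hypothetical name.

```python
def weighted_average(client_updates):
    """Average weight vectors, weighting each client by its sample count.

    client_updates: list of (weights, num_samples) pairs, where weights is a
    list of floats of the same length for every client.
    """
    total = sum(n for _, n in client_updates)
    if total <= 0:
        raise ValueError("total number of samples must be positive")
    dim = len(client_updates[0][0])
    averaged = [0.0] * dim
    for weights, n in client_updates:
        for i, w in enumerate(weights):
            averaged[i] += w * n / total
    return averaged


# Two clients: the second holds three times as much data as the first.
updates = [([1.0, 2.0], 100), ([3.0, 6.0], 300)]
global_weights = weighted_average(updates)
```

With these inputs the result sits three quarters of the way toward the larger client's weights.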
PyTorch FedAvg
from nvflare.app_opt.pt.recipes import FedAvgRecipe
from nvflare.recipe import SimEnv
recipe = FedAvgRecipe(
name="fedavg-pt",
min_clients=2,
num_rounds=5,
model=MyModel(),
train_script="client.py",
)
env = SimEnv(num_clients=2)
run = recipe.execute(env)
TensorFlow FedAvg
from nvflare.app_opt.tf.recipes import FedAvgRecipe
from nvflare.recipe import SimEnv
recipe = FedAvgRecipe(
name="fedavg-tf",
min_clients=2,
num_rounds=5,
model=MyTFModel(),
train_script="client.py",
)
env = SimEnv(num_clients=2)
run = recipe.execute(env)
NumPy FedAvg
For framework-agnostic or NumPy-based models.
from nvflare.app_common.np.recipes import NumpyFedAvgRecipe
from nvflare.recipe import SimEnv
recipe = NumpyFedAvgRecipe(
name="fedavg-numpy",
min_clients=2,
num_rounds=5,
model=[0.0, 0.0, 0.0],
train_script="client.py",
)
env = SimEnv(num_clients=2)
run = recipe.execute(env)
Sklearn FedAvg
For scikit-learn based models.
from nvflare.app_opt.sklearn.recipes import SklearnFedAvgRecipe
from nvflare.recipe import SimEnv
recipe = SklearnFedAvgRecipe(
name="fedavg-sklearn",
min_clients=2,
num_rounds=5,
train_script="client.py",
)
env = SimEnv(num_clients=2)
run = recipe.execute(env)
FedAvg with Homomorphic Encryption
FedAvg with secure aggregation using homomorphic encryption.
from nvflare.app_opt.pt.recipes import FedAvgRecipeWithHE
from nvflare.recipe import ProdEnv
recipe = FedAvgRecipeWithHE(
name="fedavg-he",
min_clients=2,
num_rounds=5,
model=MyModel(),
train_script="client.py",
)
env = ProdEnv(
startup_kit_location="/path/to/startup_kit/admin@nvidia.com",
username="admin@nvidia.com",
)
run = recipe.execute(env)
Note
FedAvgRecipeWithHE requires provisioned startup kits with homomorphic encryption context files.
Use ProdEnv or PocEnv with HE provisioning; SimEnv is not supported.
FedProx
FedProx is FedAvg with a proximal term added to the client loss function to handle data heterogeneity. It uses the standard FedAvgRecipe with the FedProx loss helper on the client side.
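The proximal term can be written out directly. The sketch below uses the standard FedProx form, (mu / 2) times the squared L2 distance between the local and global weights, on plain Python lists; the function names are hypothetical and this is not the library's loss helper.

```python
def proximal_term(local_weights, global_weights, mu):
    """Return (mu / 2) * ||w_local - w_global||^2."""
    squared_distance = sum((w - g) ** 2 for w, g in zip(local_weights, global_weights))
    return 0.5 * mu * squared_distance


def fedprox_objective(task_loss, local_weights, global_weights, mu):
    """Task loss plus the proximal penalty that keeps the client near the global model."""
    return task_loss + proximal_term(local_weights, global_weights, mu)


penalty = proximal_term([1.0, 2.0], [0.0, 0.0], mu=0.01)
total_loss = fedprox_objective(0.5, [1.0, 2.0], [0.0, 0.0], mu=0.01)
```

A larger mu pulls each client harder toward the round's starting weights, which limits client drift on heterogeneous data at the cost of slower local progress.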
PyTorch FedProx
from nvflare.app_opt.pt.recipes import FedAvgRecipe
from nvflare.recipe import SimEnv
# FedProx uses FedAvgRecipe with FedProxLoss in the client training script
recipe = FedAvgRecipe(
name="fedprox-pt",
min_clients=2,
num_rounds=5,
model=MyModel(),
train_script="client.py",
train_args="--fedproxloss_mu 0.01", # Pass mu parameter to client
)
env = SimEnv(num_clients=2)
run = recipe.execute(env)
In your client training script, use the FedProxLoss helper:
from nvflare.app_opt.pt import PTFedProxLoss
# In training loop:
fedprox_loss = PTFedProxLoss(mu=fedproxloss_mu)
for data, target in train_loader:
    optimizer.zero_grad()
    output = model(data)
    ce_loss = criterion(output, target)
    # Add FedProx regularization term
    prox_loss = fedprox_loss(model)
    loss = ce_loss + prox_loss
    loss.backward()
    optimizer.step()
TensorFlow FedProx
from nvflare.app_opt.tf.recipes import FedAvgRecipe
from nvflare.recipe import SimEnv
recipe = FedAvgRecipe(
name="fedprox-tf",
min_clients=2,
num_rounds=5,
model=MyTFModel(),
train_script="client.py",
train_args="--fedproxloss_mu 0.01",
)
env = SimEnv(num_clients=2)
run = recipe.execute(env)
In your client training script, use the TensorFlow FedProxLoss:
from nvflare.app_opt.tf.fedprox_loss import TFFedProxLoss
fedprox_loss = TFFedProxLoss(mu=fedproxloss_mu)
# Use in training loop
FedOpt (Federated Optimization)
Federated optimization with server-side optimizer (e.g., SGD, Adam).
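One common way to read "server-side optimizer" is that the server treats the difference between the current global weights and the averaged client weights as a pseudo-gradient and feeds it to an ordinary optimizer. The sketch below shows that idea for SGD with momentum on plain lists; it is an assumption-labeled illustration with a hypothetical function name, not the recipe's implementation.

```python
def server_sgd_step(global_weights, avg_client_weights, velocity, lr=1.0, momentum=0.6):
    """Apply one SGD-with-momentum step using (global - averaged) as the gradient."""
    new_weights, new_velocity = [], []
    for w, a, v in zip(global_weights, avg_client_weights, velocity):
        pseudo_grad = w - a  # direction the clients moved away from, on average
        v_next = momentum * v + pseudo_grad
        new_velocity.append(v_next)
        new_weights.append(w - lr * v_next)
    return new_weights, new_velocity


# Round 1: velocity starts at zero, so the step lands on the client average.
w1, v1 = server_sgd_step([1.0], [0.5], [0.0])
# Round 2: clients report no change, but momentum keeps the server moving.
w2, v2 = server_sgd_step(w1, [0.5], v1)
```

With lr=1.0 and momentum=0.0 the update reduces to plain FedAvg, which is why a server learning rate of 1.0 is a natural starting point.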
PyTorch FedOpt
from nvflare.app_opt.pt.recipes import FedOptRecipe
from nvflare.recipe import SimEnv
recipe = FedOptRecipe(
name="fedopt-pt",
min_clients=2,
num_rounds=5,
model=MyModel(),
train_script="client.py",
optimizer_args={"path": "torch.optim.SGD", "args": {"lr": 1.0, "momentum": 0.6}},
)
env = SimEnv(num_clients=2)
run = recipe.execute(env)
TensorFlow FedOpt
from nvflare.app_opt.tf.recipes import FedOptRecipe
from nvflare.recipe import SimEnv
recipe = FedOptRecipe(
name="fedopt-tf",
min_clients=2,
num_rounds=5,
model=MyTFModel(),
train_script="client.py",
optimizer_args={"path": "tensorflow.keras.optimizers.SGD", "args": {"learning_rate": 1.0}},
)
env = SimEnv(num_clients=2)
run = recipe.execute(env)
SCAFFOLD
SCAFFOLD algorithm for handling data heterogeneity with control variates.
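The role of the control variates is easiest to see in the local update rule. The sketch below follows the formulation in the SCAFFOLD paper (gradient corrected by c_global minus c_local, and the "option II" control-variate refresh) on plain Python lists; the function names are hypothetical and the recipe's internals may differ.

```python
def scaffold_local_step(weights, grads, c_local, c_global, lr):
    """One corrected SGD step: w <- w - lr * (g - c_local + c_global)."""
    return [
        w - lr * (g - ci + c)
        for w, g, ci, c in zip(weights, grads, c_local, c_global)
    ]


def updated_control_variate(c_local, c_global, global_weights, local_weights, lr, num_steps):
    """Refresh the client control variate after num_steps local steps."""
    return [
        ci - c + (x - y) / (num_steps * lr)
        for ci, c, x, y in zip(c_local, c_global, global_weights, local_weights)
    ]


start = [1.0]
after = scaffold_local_step(start, grads=[0.5], c_local=[0.2], c_global=[0.1], lr=0.1)
c_new = updated_control_variate([0.2], [0.1], start, after, lr=0.1, num_steps=1)
```

The correction term estimates how a client's gradient direction differs from the global one, so local steps drift less when data is non-IID.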
PyTorch SCAFFOLD
from nvflare.app_opt.pt.recipes import ScaffoldRecipe
from nvflare.recipe import SimEnv
recipe = ScaffoldRecipe(
name="scaffold-pt",
min_clients=2,
num_rounds=5,
model=MyModel(),
train_script="client.py",
)
env = SimEnv(num_clients=2)
run = recipe.execute(env)
TensorFlow SCAFFOLD
from nvflare.app_opt.tf.recipes import ScaffoldRecipe
from nvflare.recipe import SimEnv
recipe = ScaffoldRecipe(
name="scaffold-tf",
min_clients=2,
num_rounds=5,
model=MyTFModel(),
train_script="client.py",
)
env = SimEnv(num_clients=2)
run = recipe.execute(env)
Cyclic Learning
Sequential training across clients in a cyclic order.
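Conceptually, cyclic learning is a relay: one model is handed from client to client in a fixed order, and each client continues training from where the previous one stopped. A minimal sketch follows, with a hypothetical make_client helper standing in for real local training.

```python
def cyclic_training(initial_weights, client_train_fns, num_rounds):
    """Pass the weights through every client, in order, once per round."""
    weights = list(initial_weights)
    for _ in range(num_rounds):
        for train in client_train_fns:
            weights = train(weights)  # each client starts from the previous client's result
    return weights


def make_client(target, step=0.5):
    """Hypothetical local training: move each weight part of the way toward a site target."""
    def train(weights):
        return [w + step * (target - w) for w in weights]
    return train


clients = [make_client(0.0), make_client(1.0)]
final_weights = cyclic_training([4.0], clients, num_rounds=1)
```

Unlike FedAvg there is no averaging step, so the client order affects the result: the last client in the cycle has the most recent influence on the model.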
PyTorch Cyclic
from nvflare.app_opt.pt.recipes import CyclicRecipe
from nvflare.recipe import SimEnv
recipe = CyclicRecipe(
name="cyclic-pt",
min_clients=2,
num_rounds=5,
model=MyModel(),
train_script="client.py",
)
env = SimEnv(num_clients=2)
run = recipe.execute(env)
TensorFlow Cyclic
from nvflare.app_opt.tf.recipes import CyclicRecipe
from nvflare.recipe import SimEnv
recipe = CyclicRecipe(
name="cyclic-tf",
min_clients=2,
num_rounds=5,
model=MyTFModel(),
train_script="client.py",
)
env = SimEnv(num_clients=2)
run = recipe.execute(env)
XGBoost Recipes
Federated XGBoost for tree-based models.
XGBoost Horizontal (Histogram-based)
Histogram-based federated XGBoost for horizontal data partitioning.
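In the histogram-based approach, the quantity shared across sites is a per-bin summary of gradients rather than raw rows. The sketch below shows the aggregation idea on plain lists, together with the standard second-order leaf weight formula; the function names and the two-bin layout are illustrative assumptions, not the recipe's wire format.

```python
def aggregate_histograms(client_histograms):
    """Sum per-bin (gradient, hessian) pairs across clients.

    client_histograms: one list per client, each a list of (grad_sum, hess_sum)
    tuples that share the same bin layout.
    """
    totals = []
    for bins in zip(*client_histograms):
        grad = sum(g for g, _ in bins)
        hess = sum(h for _, h in bins)
        totals.append((grad, hess))
    return totals


def leaf_weight(histogram, reg_lambda=1.0):
    """Second-order leaf weight over all bins: -G / (H + lambda)."""
    grad = sum(g for g, _ in histogram)
    hess = sum(h for _, h in histogram)
    return -grad / (hess + reg_lambda)


site_1 = [(1.0, 2.0), (-0.5, 1.0)]
site_2 = [(0.5, 1.0), (-1.5, 2.0)]
global_hist = aggregate_histograms([site_1, site_2])
weight = leaf_weight(global_hist)
```

Because the sums are additive, the tree built from the aggregated histogram matches what a single machine would build from the pooled data with the same binning.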
from nvflare.app_opt.xgboost.recipes import XGBHorizontalRecipe
from nvflare.recipe import SimEnv
recipe = XGBHorizontalRecipe(
name="xgb-horizontal",
min_clients=2,
num_rounds=10,
xgb_params={"max_depth": 6, "eta": 0.1, "objective": "binary:logistic"},
)
env = SimEnv(num_clients=2)
run = recipe.execute(env)
XGBoost Bagging (Tree-based)
Tree-based federated XGBoost using bagging.
from nvflare.app_opt.xgboost.recipes import XGBBaggingRecipe
from nvflare.recipe import SimEnv
recipe = XGBBaggingRecipe(
name="xgb-bagging",
min_clients=2,
training_mode="bagging",
num_rounds=10,
)
env = SimEnv(num_clients=2)
run = recipe.execute(env)
XGBoost Vertical
Federated XGBoost for vertical data partitioning.
from nvflare.app_opt.xgboost.recipes import XGBVerticalRecipe
from nvflare.recipe import SimEnv
recipe = XGBVerticalRecipe(
name="xgb-vertical",
min_clients=2,
num_rounds=10,
)
env = SimEnv(num_clients=2)
run = recipe.execute(env)
Sklearn Specialized Recipes
K-Means FedAvg
Federated K-Means clustering.
from nvflare.app_opt.sklearn.recipes import KMeansFedAvgRecipe
from nvflare.recipe import SimEnv
recipe = KMeansFedAvgRecipe(
name="kmeans",
min_clients=2,
num_rounds=5,
n_clusters=3,
train_script="client.py",
)
env = SimEnv(num_clients=2)
run = recipe.execute(env)
SVM FedAvg
Federated Support Vector Machine.
from nvflare.app_opt.sklearn.recipes import SVMFedAvgRecipe
from nvflare.recipe import SimEnv
recipe = SVMFedAvgRecipe(
name="svm",
min_clients=2,
num_rounds=5,
train_script="client.py",
)
env = SimEnv(num_clients=2)
run = recipe.execute(env)
Logistic Regression FedAvg
Federated Logistic Regression.
from nvflare.app_common.np.recipes.lr.fedavg import FedAvgLrRecipe
from nvflare.recipe import SimEnv
recipe = FedAvgLrRecipe(
name="lr",
min_clients=2,
num_rounds=5,
train_script="client.py",
)
env = SimEnv(num_clients=2)
run = recipe.execute(env)
Federated Statistics
Compute federated statistics across distributed data.
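Statistics such as count, mean, and stddev can be computed globally without moving raw data, because each site only needs to share a few additive summaries. The sketch below assumes each site reports count, sum, and sum of squares, and uses the sample (n - 1) standard deviation; both choices and the function names are assumptions for illustration.

```python
import math


def local_summary(values):
    """Additive summaries a site can share instead of its raw values."""
    return {
        "count": len(values),
        "sum": sum(values),
        "sum_sq": sum(v * v for v in values),
    }


def combine(summaries):
    """Merge per-site summaries into global count, mean, and sample stddev."""
    count = sum(s["count"] for s in summaries)
    if count < 2:
        raise ValueError("need at least two samples for a sample stddev")
    total = sum(s["sum"] for s in summaries)
    total_sq = sum(s["sum_sq"] for s in summaries)
    mean = total / count
    variance = (total_sq - count * mean * mean) / (count - 1)
    return {"count": count, "mean": mean, "stddev": math.sqrt(max(variance, 0.0))}


site_1 = local_summary([1.0, 2.0, 3.0])
site_2 = local_summary([4.0, 5.0])
global_stats = combine([site_1, site_2])
```

The max(variance, 0.0) guard only protects against tiny negative values from floating-point rounding.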
from nvflare.recipe import SimEnv
from nvflare.recipe.fedstats import FedStatsRecipe
recipe = FedStatsRecipe(
name="stats",
stats_output_path="./output",
sites=["site-1", "site-2"],
statistic_configs={"count": {}, "mean": {}, "stddev": {}},
stats_generator=my_stats_generator,
)
env = SimEnv(clients=["site-1", "site-2"])
run = recipe.execute(env)
Kaplan-Meier Survival Analysis
Federated Kaplan-Meier survival analysis with optional homomorphic encryption over binned event histograms.
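The binned-histogram formulation works because the Kaplan-Meier estimator only needs, for each time bin, the total number of events and censored subjects across all sites. The sketch below sums plain integer histograms (standing in for the encrypted aggregation, which is omitted) and applies the product-limit formula; the function names are hypothetical.

```python
def sum_histograms(histograms):
    """Element-wise sum of equally binned per-site histograms."""
    return [sum(column) for column in zip(*histograms)]


def kaplan_meier(event_hist, censor_hist):
    """Return the survival estimate S(t) after each time bin.

    Subjects censored in a bin are counted as at risk for events in that bin.
    """
    at_risk = sum(event_hist) + sum(censor_hist)
    survival = []
    s = 1.0
    for events, censored in zip(event_hist, censor_hist):
        if at_risk > 0:
            s *= 1.0 - events / at_risk
        survival.append(s)
        at_risk -= events + censored
    return survival


# Per-site histograms over three time bins.
events = sum_histograms([[1, 0, 1], [1, 1, 0]])
censored = sum_histograms([[0, 1, 0], [0, 0, 1]])
curve = kaplan_meier(events, censored)
```

Since only the summed histograms enter the formula, the sum is the single step that needs protection when encryption is enabled.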
The KMRecipe is defined in the Kaplan-Meier example’s job.py rather than exported as a package-level
recipe.
Run the snippet from the Kaplan-Meier example directory so from job import KMRecipe resolves correctly:
cd examples/advanced/kaplan-meier-he
from job import KMRecipe
from nvflare.recipe import SimEnv
# KMRecipe is defined in examples/advanced/kaplan-meier-he/job.py
recipe = KMRecipe(
num_clients=5,
encryption=True,
data_root="/tmp/nvflare/dataset/km_data",
he_context_path_client="/tmp/nvflare/he_context/he_context_client.txt",
he_context_path_server="/tmp/nvflare/he_context/he_context_server.txt",
)
env = SimEnv(num_clients=5)
run = recipe.execute(env)
Federated Evaluation
Evaluate a pre-trained model across multiple sites.
PyTorch FedEval
Evaluate a pre-trained PyTorch model by sending it to all clients for evaluation on their local data.
from nvflare.app_opt.pt.recipes.fedeval import FedEvalRecipe
from nvflare.recipe import SimEnv
recipe = FedEvalRecipe(
name="eval_job",
model=MyModel(),
eval_ckpt="/path/to/pretrained_model.pt",
min_clients=2,
eval_script="client.py",
eval_args="--batch_size 32",
)
env = SimEnv(num_clients=2)
run = recipe.execute(env)
Note
eval_ckpt is required. It can be either:
an absolute path on the server to the pre-trained checkpoint (.pt, .pth), or
a relative or absolute path to a local checkpoint file that will be bundled with the job (for example, via utilities such as
prepare_initial_ckpt).
When specifying an absolute server-side path, the checkpoint file may not exist locally when building the job.
Cross-Site Evaluation
Evaluate models across all client sites (compare each client’s model against all datasets).
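The outcome of cross-site evaluation can be pictured as a matrix with one row per model and one column per site's dataset. A minimal sketch follows, using threshold classifiers and tiny labeled lists as hypothetical stand-ins for real models and data.

```python
def cross_site_matrix(models, datasets, evaluate):
    """Evaluate every model on every site's data; returns {model: {site: score}}."""
    return {
        model_name: {
            site: evaluate(model, data) for site, data in datasets.items()
        }
        for model_name, model in models.items()
    }


def threshold_accuracy(threshold, data):
    """Accuracy of the rule 'predict 1 when x > threshold' on (x, label) pairs."""
    correct = sum(1 for x, label in data if int(x > threshold) == label)
    return correct / len(data)


models = {"site-1": 0.5, "site-2": 1.5}
datasets = {
    "site-1": [(0.0, 0), (1.0, 1)],
    "site-2": [(1.0, 0), (2.0, 1)],
}
matrix = cross_site_matrix(models, datasets, threshold_accuracy)
```

Diagonal entries show how each model does on its own site, while off-diagonal entries show how well it transfers to other sites.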
from nvflare.app_common.np.recipes import NumpyCrossSiteEvalRecipe
from nvflare.recipe import SimEnv
recipe = NumpyCrossSiteEvalRecipe(
name="cross-eval",
min_clients=2,
eval_script="evaluate.py",
eval_args="--data_root /path/to/data",
initial_ckpt="/path/to/pretrained_model.npy", # Optional: evaluate specific model
)
env = SimEnv(num_clients=2)
run = recipe.execute(env)
Note
Use eval_script to specify custom evaluation logic. If not provided, the recipe uses a built-in dummy validator (for testing only).
Use initial_ckpt to evaluate a specific pre-trained model. If not provided, the recipe evaluates models from the training run directory.
Private Set Intersection (PSI)
Compute intersection of private sets across clients.
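Assuming the Dh prefix in DhPSIRecipe refers to a Diffie-Hellman style protocol (the text above does not say), the core idea can be sketched with commutative blinding: each party raises hashed items to its own secret exponent, and items match when their doubly blinded values are equal. This is a toy, single-process illustration with a small modulus and hypothetical function names; it is not secure and not NVFlare's implementation.

```python
import hashlib
import math
import secrets

P = 2**127 - 1  # Mersenne prime used as a toy modulus; far too small for real use


def hash_to_group(item):
    """Map a string to an integer in [1, P - 1]."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return (int.from_bytes(digest, "big") % (P - 1)) + 1


def random_key():
    """Pick a secret exponent coprime to P - 1 so that blinding is a bijection."""
    while True:
        key = secrets.randbelow(P - 3) + 2
        if math.gcd(key, P - 1) == 1:
            return key


def blind(values, key):
    """Raise each value to the secret key modulo P."""
    return [pow(v, key, P) for v in values]


def intersect(items_a, items_b):
    """Return the items of A that also appear in B, comparing only blinded values."""
    key_a, key_b = random_key(), random_key()
    a_once = blind([hash_to_group(x) for x in items_a], key_a)  # sent A -> B
    b_once = blind([hash_to_group(x) for x in items_b], key_b)  # sent B -> A
    a_twice = blind(a_once, key_b)  # B re-blinds A's values, preserving order
    b_twice = set(blind(b_once, key_a))  # A re-blinds B's values
    return [x for x, v in zip(items_a, a_twice) if v in b_twice]


common = intersect(["alice", "bob", "carol"], ["bob", "dave", "carol"])
```

The comparison works because exponentiation commutes: blinding with key_a then key_b gives the same value as blinding with key_b then key_a.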
from nvflare.app_common.psi.recipes import DhPSIRecipe
from nvflare.recipe import SimEnv
recipe = DhPSIRecipe(
name="psi",
min_clients=2,
)
env = SimEnv(num_clients=2)
run = recipe.execute(env)
Flower Integration
Run Flower-based federated learning jobs.
from nvflare.app_opt.flower.recipe import FlowerRecipe
from nvflare.recipe import SimEnv
recipe = FlowerRecipe(
name="flower-job",
min_clients=2,
flower_content="path/to/flower/app",
run_config={"num-server-rounds": 5}, # Optional: used to override default values in pyproject.toml
)
env = SimEnv(num_clients=2)
run = recipe.execute(env)
Swarm Learning
Decentralized federated learning without a central server.
from nvflare.app_opt.pt.recipes.swarm import SwarmLearningRecipe
from nvflare.recipe import SimEnv
recipe = SwarmLearningRecipe(
name="swarm",
model=MyModel(),
min_clients=3,
num_rounds=5,
train_script="client.py",
initial_ckpt="/path/to/pretrained.pt", # Optional: pre-trained weights
round_timeout=3600, # P2P model-transfer ACK budget (seconds); increase for large models
)
env = SimEnv(num_clients=3)
run = recipe.execute(env)
Note
For large models (>2 GB), tune the following parameters:
round_timeout (default 3600 s): P2P model-transfer ACK budget between peers. Increase for 7B+ models where P2P tensor streaming can take several minutes.
pipe_type (default "cell_pipe"): set to "file_pipe" when cell networking is unavailable or for third-party subprocess integrations.
submit_result_timeout, download_complete_timeout, tensor_min_download_timeout, and PEER_READ_TIMEOUT: set via recipe.add_client_config({...}).
max_resends defaults to a finite value of 3 and can be overridden the same way. See Timeout Troubleshooting Guide.
Edge Recipes
Recipes for edge device federated learning.
EdgeFedBuffRecipe
from nvflare.edge.tools.edge_fed_buff_recipe import (
EdgeFedBuffRecipe,
ModelManagerConfig,
DeviceManagerConfig,
)
recipe = EdgeFedBuffRecipe(
job_name="edge-fedavg",
model=MyModel(),
model_manager_config=ModelManagerConfig(max_num_active_model_versions=3, max_model_version=20),
device_manager_config=DeviceManagerConfig(device_selection_size=100),
initial_ckpt="/path/to/pretrained.pt", # Optional: pre-trained weights
)
Utility Functions
Add Experiment Tracking
Add experiment tracking (MLflow, TensorBoard, W&B) to any recipe.
from nvflare.recipe.utils import add_experiment_tracking
add_experiment_tracking(recipe, tracking_type="tensorboard")
# or
add_experiment_tracking(recipe, tracking_type="mlflow")
# or
add_experiment_tracking(recipe, tracking_type="wandb")
Add Cross-Site Evaluation
Add cross-site evaluation to any training recipe.
from nvflare.recipe.utils import add_cross_site_evaluation
add_cross_site_evaluation(recipe)
# or limit evaluation to selected clients
add_cross_site_evaluation(recipe, participating_clients=["site-1", "site-3"])
Execution Environments
Recipes can be executed in different environments:
SimEnv (Simulation)
Run locally for development and testing.
from nvflare.recipe import SimEnv
env = SimEnv(num_clients=2)
run = recipe.execute(env)
PocEnv (Proof of Concept)
Run with multiple processes on a single machine.
from nvflare.recipe import PocEnv
env = PocEnv(num_clients=2)
run = recipe.execute(env)
ProdEnv (Production)
Deploy to production NVFlare infrastructure.
from nvflare.recipe import ProdEnv
env = ProdEnv(startup_kit_location="/path/to/startup_kit")
run = recipe.execute(env)