nvflare.app_opt.xgboost.recipes.bagging module

class XGBBaggingRecipe(name: str, min_clients: int, training_mode: str = 'bagging', num_rounds: int | None = None, num_client_bagging: int | None = None, num_local_parallel_tree: int = 1, local_subsample: float = 0.8, learning_rate: float = 0.1, objective: str = 'binary:logistic', max_depth: int = 8, eval_metric: str = 'auc', tree_method: str = 'hist', use_gpus: bool = False, nthread: int = 16, lr_mode: str = 'uniform', save_name: str = 'xgboost_model.json', data_loader_id: str = 'dataloader', per_site_config: dict[str, dict] | None = None)

Bases: Recipe

XGBoost Tree-Based Recipe for federated learning (supports Bagging and Cyclic modes).

Recipe parameters, including xgb_params and nested per_site_config values, must never contain actual secrets. Read secrets from site environment variables or mounted files; references are supported only where documented in nvflare.recipe.secrets.

This recipe implements tree-based federated XGBoost with two training modes:

- Bagging: Each client trains a local sub-forest, which is aggregated on the server (federated Random Forest).
- Cyclic: Clients train sequentially in rounds, each contributing to the global model.
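To make the difference between the two modes concrete, the following is a minimal conceptual sketch in plain Python, not NVFlare code. The helper `training_schedule` is hypothetical, and the round-robin client order used for cyclic mode is an assumption made for illustration; the recipe may order clients differently.

```python
def training_schedule(mode, clients, num_rounds):
    """Return, for each round, the list of clients that train in that round.

    Hypothetical helper for illustration only; it is not part of NVFlare.
    """
    if mode == "bagging":
        # Every client trains in every round; the server aggregates the results.
        return [list(clients) for _ in range(num_rounds)]
    if mode == "cyclic":
        # One client trains per round (assumed round-robin order).
        return [[clients[r % len(clients)]] for r in range(num_rounds)]
    raise ValueError(f"unknown training mode: {mode!r}")


sites = ["site-1", "site-2", "site-3"]
bagging_plan = training_schedule("bagging", sites, num_rounds=1)
cyclic_plan = training_schedule("cyclic", sites, num_rounds=4)
```

Under these assumptions, a single bagging round involves all three sites at once, while four cyclic rounds visit site-1, site-2, site-3, and then site-1 again.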

Parameters:

name (str) – Name of the federated job.
min_clients (int) – The minimum number of clients for the job.
training_mode (str, optional) – Training mode (“bagging” or “cyclic”). Default is “bagging”.
num_rounds (int, optional) – Number of training rounds. Default is 1 for bagging, 100 for cyclic.
num_client_bagging (int, optional) – Number of clients for bagging. Default is min_clients.
num_local_parallel_tree (int, optional) – Number of parallel trees per client. Default is 1.
local_subsample (float, optional) – Subsample ratio for local training. Default is 0.8.
learning_rate (float, optional) – Learning rate for XGBoost. Default is 0.1.
objective (str, optional) – Learning objective. Default is “binary:logistic”.
max_depth (int, optional) – Maximum tree depth. Default is 8.
eval_metric (str, optional) – Evaluation metric. Default is “auc”.
tree_method (str, optional) – Tree construction method. Default is “hist”.
use_gpus (bool, optional) – Whether to use GPUs. Default is False.
nthread (int, optional) – Number of threads. Default is 16.
lr_mode (str, optional) – Learning rate mode (“uniform” or “scaled”). Default is “uniform”.
save_name (str, optional) – Model save name. Default is “xgboost_model.json”.
data_loader_id (str, optional) – ID of the data loader component. Default is “dataloader”.
per_site_config (dict, optional) – Deprecated constructor form of per-site configuration. New code should call set_per_site_config(recipe, config) immediately after construction.
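The mode-dependent defaults for num_rounds and num_client_bagging described above can be sketched as follows. This is an illustration of the documented defaults only; `resolve_defaults` is a hypothetical helper, and the recipe's actual resolution logic may differ in detail.

```python
def resolve_defaults(min_clients, training_mode="bagging",
                     num_rounds=None, num_client_bagging=None):
    """Fill in the documented defaults for values left as None.

    Hypothetical helper for illustration only; it is not part of NVFlare.
    """
    if training_mode not in ("bagging", "cyclic"):
        raise ValueError(f"unknown training mode: {training_mode!r}")
    if num_rounds is None:
        # Documented default: 1 round for bagging, 100 rounds for cyclic.
        num_rounds = 1 if training_mode == "bagging" else 100
    if num_client_bagging is None:
        # Documented default: the same value as min_clients.
        num_client_bagging = min_clients
    return num_rounds, num_client_bagging


bagging_defaults = resolve_defaults(min_clients=3)
cyclic_defaults = resolve_defaults(min_clients=3, training_mode="cyclic")
```

Explicitly passed values are left untouched, so `resolve_defaults(3, "cyclic", num_rounds=10)` keeps 10 rounds.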

Example

from nvflare.app_opt.xgboost.recipes import XGBBaggingRecipe
from nvflare.app_opt.xgboost.histogram_based_v2.csv_data_loader import CSVDataLoader
from nvflare.recipe import SimEnv, set_per_site_config

# Bagging mode (federated Random Forest) with uniform learning rate
recipe = XGBBaggingRecipe(
    name="random_forest",
    min_clients=3,
    training_mode="bagging",
    num_rounds=1,
    num_local_parallel_tree=5,
    local_subsample=0.5,
)
set_per_site_config(
    recipe,
    {
        "site-1": {"data_loader": CSVDataLoader(folder="/tmp/data")},
        "site-2": {"data_loader": CSVDataLoader(folder="/tmp/data")},
        "site-3": {"data_loader": CSVDataLoader(folder="/tmp/data")},
    },
)

# Or with scaled learning rate (data-size dependent)
recipe = XGBBaggingRecipe(
    name="random_forest_scaled",
    min_clients=3,
    training_mode="bagging",
    lr_mode="scaled",
)
set_per_site_config(
    recipe,
    {
        "site-1": {"data_loader": CSVDataLoader(folder="/tmp/data"), "lr_scale": 0.5},
        "site-2": {"data_loader": CSVDataLoader(folder="/tmp/data"), "lr_scale": 0.3},
        "site-3": {"data_loader": CSVDataLoader(folder="/tmp/data"), "lr_scale": 0.2},
    },
)

env = SimEnv(num_clients=3)
run = recipe.execute(env)
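The lr_scale values in the scaled example sum to 1.0, which is consistent with weighting each site by its share of the training data. A minimal sketch of deriving such values is shown below. It assumes that lr_scale is meant to be each site's fraction of the total sample count, which the text above does not state explicitly; `lr_scales_from_sizes` and the sample counts are hypothetical.

```python
def lr_scales_from_sizes(site_sizes):
    """Map each site to its fraction of the total number of samples.

    Hypothetical helper for illustration only; it is not part of NVFlare.
    """
    total = sum(site_sizes.values())
    if total <= 0:
        raise ValueError("total sample count must be positive")
    return {site: n / total for site, n in site_sizes.items()}


# Made-up sample counts per site, chosen to mirror the example above.
scales = lr_scales_from_sizes({"site-1": 500, "site-2": 300, "site-3": 200})
```

Each resulting value could then be placed under the "lr_scale" key of the corresponding site's entry in the per-site configuration.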

This is the base class of a recipe. Recipes are implemented by jobs. A concrete recipe must provide the job that implements it.

Security contract – no secrets in recipe parameters:

Recipe parameters (train_args, task_args, eval_args, per_site_config, config overrides, dicts passed to add_client_config/add_server_config, exec params, etc.) can be written in clear text into generated job configuration. These parameters and their nested values must never contain actual passwords, API keys, tokens, private keys, or other credentials. Instead, read secrets from site environment variables or mounted secret files inside your code, or pass a placeholder created with nvflare.recipe.secrets.secret_ref() or nvflare.recipe.secrets.secret_file_ref() at a supported runtime boundary. See nvflare.recipe.secrets for the supported parameter locations.
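One way to follow this contract is to resolve credentials inside site-side code at runtime, so that only non-sensitive names appear in recipe parameters. The sketch below uses only the Python standard library; `load_secret`, the environment variable name, and the file path are hypothetical and are not part of NVFlare.

```python
import os
from pathlib import Path


def load_secret(env_var, file_path=None):
    """Return a secret from an environment variable or a mounted file.

    Hypothetical helper for illustration only; it is not part of NVFlare.
    """
    value = os.environ.get(env_var)
    if value:
        return value
    if file_path is not None:
        path = Path(file_path)
        if path.is_file():
            # Mounted secret files often end with a trailing newline.
            return path.read_text(encoding="utf-8").strip()
    raise RuntimeError(
        f"secret not found: set {env_var} or mount a secret file"
    )


# Example call from site-side code (both names are placeholders):
# token = load_secret("DATA_STORE_TOKEN", "/run/secrets/data_store_token")
```

With this approach the recipe carries at most the variable name or file path, never the credential itself.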

Before export or run, recipes scan their parameters with heuristics and emit nvflare.recipe.secrets.PotentialSecretWarning when a value looks like an actual secret. The scan is best-effort: absence of a warning does not prove a parameter is safe to share.

Parameters:

job – the job that implements the recipe.

configure(): Configure the federated job for XGBoost tree-based training.