nvflare.app_opt.xgboost.recipes.bagging module

class XGBBaggingRecipe(name: str, min_clients: int, training_mode: str = 'bagging', num_rounds: int | None = None, num_client_bagging: int | None = None, num_local_parallel_tree: int = 1, local_subsample: float = 0.8, learning_rate: float = 0.1, objective: str = 'binary:logistic', max_depth: int = 8, eval_metric: str = 'auc', tree_method: str = 'hist', use_gpus: bool = False, nthread: int = 16, lr_mode: str = 'uniform', save_name: str = 'xgboost_model.json', data_loader_id: str = 'dataloader', per_site_config: dict[str, dict] | None = None)

Bases: Recipe

XGBoost Tree-Based Recipe for federated learning (supports Bagging and Cyclic modes).

This recipe implements tree-based federated XGBoost with two training modes:

  • Bagging: Each client trains a local sub-forest, which is aggregated on the server (federated Random Forest).

  • Cyclic: Clients train sequentially in rounds, each contributing to the global model.

Parameters:
  • name (str) – Name of the federated job.

  • min_clients (int) – The minimum number of clients for the job.

  • training_mode (str, optional) – Training mode (“bagging” or “cyclic”). Default is “bagging”.

  • num_rounds (int, optional) – Number of training rounds. Default is 1 for bagging, 100 for cyclic.

  • num_client_bagging (int, optional) – Number of clients for bagging. Default is min_clients.

  • num_local_parallel_tree (int, optional) – Number of parallel trees per client. Default is 1.

  • local_subsample (float, optional) – Subsample ratio for local training. Default is 0.8.

  • learning_rate (float, optional) – Learning rate for XGBoost. Default is 0.1.

  • objective (str, optional) – Learning objective. Default is “binary:logistic”.

  • max_depth (int, optional) – Maximum tree depth. Default is 8.

  • eval_metric (str, optional) – Evaluation metric. Default is “auc”.

  • tree_method (str, optional) – Tree construction method. Default is “hist”.

  • use_gpus (bool, optional) – Whether to use GPUs. Default is False.

  • nthread (int, optional) – Number of threads. Default is 16.

  • lr_mode (str, optional) – Learning rate mode (“uniform” or “scaled”). Default is “uniform”.

  • save_name (str, optional) – Model save name. Default is “xgboost_model.json”.

  • data_loader_id (str, optional) – ID of the data loader component. Default is “dataloader”.

  • per_site_config (dict, optional) – Per-site configuration that maps site names to config dicts. Each config dict must contain a ‘data_loader’ key holding an XGBDataLoader instance, and can optionally include an ‘lr_scale’ value for the scaled learning rate mode. Example: {“site-1”: {“data_loader”: CSVDataLoader(…), “lr_scale”: 0.5}, “site-2”: {…}}
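The documented defaults and the per_site_config requirement can be sketched in plain Python. This is only an illustration of the rules stated in the parameter list, not the recipe's actual implementation; the helper names resolve_defaults and sites_missing_data_loader are hypothetical and do not exist in nvflare.

```python
from typing import Any, Dict, List, Optional


def resolve_defaults(
    training_mode: str,
    min_clients: int,
    num_rounds: Optional[int] = None,
    num_client_bagging: Optional[int] = None,
) -> Dict[str, int]:
    """Hypothetical helper: apply the defaults described in the parameter list."""
    if training_mode not in ("bagging", "cyclic"):
        raise ValueError(f"unknown training_mode: {training_mode!r}")
    if num_rounds is None:
        # Documented default: 1 round for bagging, 100 rounds for cyclic.
        num_rounds = 1 if training_mode == "bagging" else 100
    if num_client_bagging is None:
        # Documented default: same as min_clients.
        num_client_bagging = min_clients
    return {"num_rounds": num_rounds, "num_client_bagging": num_client_bagging}


def sites_missing_data_loader(per_site_config: Dict[str, Dict[str, Any]]) -> List[str]:
    """Hypothetical helper: list the sites whose config lacks the 'data_loader' key."""
    return [site for site, cfg in per_site_config.items() if "data_loader" not in cfg]
```

A check like sites_missing_data_loader can be run on a config dict before it is passed to the recipe, so that a missing loader is caught before the job is built.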

Example

from nvflare.app_opt.xgboost.recipes import XGBBaggingRecipe
from nvflare.app_opt.xgboost.histogram_based_v2.csv_data_loader import CSVDataLoader
from nvflare.recipe import SimEnv

# Bagging mode (federated Random Forest) with uniform learning rate
recipe = XGBBaggingRecipe(
    name="random_forest",
    min_clients=3,
    training_mode="bagging",
    num_rounds=1,
    num_local_parallel_tree=5,
    local_subsample=0.5,
    per_site_config={
        "site-1": {"data_loader": CSVDataLoader(folder="/tmp/data")},
        "site-2": {"data_loader": CSVDataLoader(folder="/tmp/data")},
        "site-3": {"data_loader": CSVDataLoader(folder="/tmp/data")},
    },
)

# Or with scaled learning rate (data-size dependent)
recipe = XGBBaggingRecipe(
    name="random_forest_scaled",
    min_clients=3,
    training_mode="bagging",
    lr_mode="scaled",
    per_site_config={
        "site-1": {"data_loader": CSVDataLoader(folder="/tmp/data"), "lr_scale": 0.5},
        "site-2": {"data_loader": CSVDataLoader(folder="/tmp/data"), "lr_scale": 0.3},
        "site-3": {"data_loader": CSVDataLoader(folder="/tmp/data"), "lr_scale": 0.2},
    },
)

env = SimEnv(num_clients=3)
run = recipe.execute(env)
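The lr_scale values in the scaled example (0.5, 0.3, 0.2) look like each site's share of the total training data. A minimal sketch of deriving such values, assuming the scale is simply the site's row count divided by the total; the helper lr_scales_from_sizes and the row counts are hypothetical, and whether the recipe expects the scales to sum to 1 is an assumption, not something the docstring states.

```python
from typing import Dict


def lr_scales_from_sizes(site_sizes: Dict[str, int]) -> Dict[str, float]:
    """Hypothetical helper: one lr_scale per site, proportional to its data size."""
    total = sum(site_sizes.values())
    if total <= 0:
        raise ValueError("total number of rows must be positive")
    return {site: rows / total for site, rows in site_sizes.items()}


# Hypothetical row counts per site.
scales = lr_scales_from_sizes({"site-1": 5000, "site-2": 3000, "site-3": 2000})
```

Each resulting value could then be placed under the "lr_scale" key of the matching site entry in per_site_config.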

This is the base class of a recipe. Recipes are implemented by jobs. A concrete recipe must provide the job that implements it.

Parameters:
  • job – The job that implements the recipe.

configure()

Configure the federated job for XGBoost tree-based training.