nvflare.app_opt.xgboost.recipes.bagging module
- class XGBBaggingRecipe(name: str, min_clients: int, training_mode: str = 'bagging', num_rounds: int | None = None, num_client_bagging: int | None = None, num_local_parallel_tree: int = 1, local_subsample: float = 0.8, learning_rate: float = 0.1, objective: str = 'binary:logistic', max_depth: int = 8, eval_metric: str = 'auc', tree_method: str = 'hist', use_gpus: bool = False, nthread: int = 16, lr_mode: str = 'uniform', save_name: str = 'xgboost_model.json', data_loader_id: str = 'dataloader', per_site_config: dict[str, dict] | None = None)
Bases: Recipe
XGBoost Tree-Based Recipe for federated learning (supports Bagging and Cyclic modes).
This recipe implements tree-based federated XGBoost with two training modes:
- Bagging: Each client trains a local sub-forest, and the server aggregates them (federated Random Forest).
- Cyclic: Clients train sequentially in rounds, each contributing to the global model.
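The difference between the two modes can be pictured as a schedule of which clients train in each round. The sketch below is illustrative only and does not use NVFlare: `training_schedule` is a hypothetical helper, and the fixed round-robin order in cyclic mode is an assumption, since the description above says only that clients train sequentially.

```python
from typing import List


def training_schedule(mode: str, clients: List[str], num_rounds: int) -> List[List[str]]:
    """Return, for each round, the list of clients that train in that round."""
    if mode == "bagging":
        # Every client trains a local sub-forest in each round;
        # the server then aggregates the sub-forests.
        return [list(clients) for _ in range(num_rounds)]
    if mode == "cyclic":
        # Assumption: clients take turns in a fixed round-robin order,
        # one client per round.
        return [[clients[r % len(clients)]] for r in range(num_rounds)]
    raise ValueError(f"unknown training mode: {mode!r}")


sites = ["site-1", "site-2", "site-3"]
bagging_plan = training_schedule("bagging", sites, num_rounds=1)
cyclic_plan = training_schedule("cyclic", sites, num_rounds=4)
print(bagging_plan)
print(cyclic_plan)
```

With one bagging round, all three sites appear together in the single round. With four cyclic rounds, the sites appear one at a time and the first site trains again in the fourth round.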
- Parameters:
name (str) – Name of the federated job.
min_clients (int) – The minimum number of clients for the job.
training_mode (str, optional) – Training mode (“bagging” or “cyclic”). Default is “bagging”.
num_rounds (int, optional) – Number of training rounds. Default is 1 for bagging, 100 for cyclic.
num_client_bagging (int, optional) – Number of clients for bagging. Default is min_clients.
num_local_parallel_tree (int, optional) – Number of parallel trees per client. Default is 1.
local_subsample (float, optional) – Subsample ratio for local training. Default is 0.8.
learning_rate (float, optional) – Learning rate for XGBoost. Default is 0.1.
objective (str, optional) – Learning objective. Default is “binary:logistic”.
max_depth (int, optional) – Maximum tree depth. Default is 8.
eval_metric (str, optional) – Evaluation metric. Default is “auc”.
tree_method (str, optional) – Tree construction method. Default is “hist”.
use_gpus (bool, optional) – Whether to use GPUs. Default is False.
nthread (int, optional) – Number of threads. Default is 16.
lr_mode (str, optional) – Learning rate mode (“uniform” or “scaled”). Default is “uniform”.
save_name (str, optional) – Model save name. Default is “xgboost_model.json”.
data_loader_id (str, optional) – ID of the data loader component. Default is “dataloader”.
per_site_config (dict, optional) – Per-site configuration mapping site names to config dicts. Each config dict must contain a 'data_loader' key whose value is an XGBDataLoader instance, and may optionally include 'lr_scale' for the scaled learning rate mode. Example: {"site-1": {"data_loader": CSVDataLoader(…), "lr_scale": 0.5}, "site-2": {…}}
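Two parameters in the signature default to None, and their effective values depend on other arguments. The following is a minimal sketch of the documented defaults only. `resolve_defaults` is a hypothetical helper written for illustration, not the recipe's own code, and the recipe may validate or resolve these values differently.

```python
from typing import Optional, Tuple


def resolve_defaults(
    training_mode: str,
    min_clients: int,
    num_rounds: Optional[int] = None,
    num_client_bagging: Optional[int] = None,
) -> Tuple[int, int]:
    """Return (num_rounds, num_client_bagging) after applying the documented defaults."""
    if training_mode not in ("bagging", "cyclic"):
        raise ValueError(f"training_mode must be 'bagging' or 'cyclic', got {training_mode!r}")
    if num_rounds is None:
        # Documented default: 1 round for bagging, 100 rounds for cyclic.
        num_rounds = 1 if training_mode == "bagging" else 100
    if num_client_bagging is None:
        # Documented default: the same value as min_clients.
        num_client_bagging = min_clients
    return num_rounds, num_client_bagging


print(resolve_defaults("bagging", min_clients=3))
print(resolve_defaults("cyclic", min_clients=3))
```

Explicitly passed values are returned unchanged, so `resolve_defaults("bagging", 3, num_rounds=5)` keeps the five rounds.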
Example
```python
from nvflare.app_opt.xgboost.recipes import XGBBaggingRecipe
from nvflare.app_opt.xgboost.histogram_based_v2.csv_data_loader import CSVDataLoader
from nvflare.recipe import SimEnv

# Bagging mode (federated Random Forest) with uniform learning rate
recipe = XGBBaggingRecipe(
    name="random_forest",
    min_clients=3,
    training_mode="bagging",
    num_rounds=1,
    num_local_parallel_tree=5,
    local_subsample=0.5,
    per_site_config={
        "site-1": {"data_loader": CSVDataLoader(folder="/tmp/data")},
        "site-2": {"data_loader": CSVDataLoader(folder="/tmp/data")},
        "site-3": {"data_loader": CSVDataLoader(folder="/tmp/data")},
    },
)

# Or with scaled learning rate (data-size dependent)
recipe = XGBBaggingRecipe(
    name="random_forest_scaled",
    min_clients=3,
    training_mode="bagging",
    lr_mode="scaled",
    per_site_config={
        "site-1": {"data_loader": CSVDataLoader(folder="/tmp/data"), "lr_scale": 0.5},
        "site-2": {"data_loader": CSVDataLoader(folder="/tmp/data"), "lr_scale": 0.3},
        "site-3": {"data_loader": CSVDataLoader(folder="/tmp/data"), "lr_scale": 0.2},
    },
)

env = SimEnv(num_clients=3)
run = recipe.execute(env)
```
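The example describes the scaled learning rate as data-size dependent, and its lr_scale values sum to 1. One plausible way to derive such values is to give each site its share of the total training rows. That reading is an assumption rather than documented behavior, and `build_per_site_config` below is a hypothetical helper. Plain strings stand in for data loader instances so the sketch runs without NVFlare.

```python
from typing import Any, Dict


def build_per_site_config(
    loaders: Dict[str, Any], sizes: Dict[str, int]
) -> Dict[str, Dict[str, Any]]:
    """Build a per_site_config dict with lr_scale set to each site's data share."""
    total = sum(sizes[site] for site in loaders)
    if total <= 0:
        raise ValueError("total data size must be positive")
    return {
        site: {
            "data_loader": loader,
            # Assumption: lr_scale is the site's fraction of all training rows.
            "lr_scale": sizes[site] / total,
        }
        for site, loader in loaders.items()
    }


# Placeholder strings; in a real job these would be XGBDataLoader instances.
loaders = {"site-1": "loader-1", "site-2": "loader-2", "site-3": "loader-3"}
sizes = {"site-1": 500, "site-2": 300, "site-3": 200}
config = build_per_site_config(loaders, sizes)
for site, entry in config.items():
    print(site, entry["lr_scale"])
```

The resulting dict has the same shape as the per_site_config argument in the example, so it could be passed to the recipe together with lr_mode="scaled".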
Recipe is the base class for recipes. Recipes are implemented by jobs: a concrete recipe must provide the job that implements it.
- Parameters:
job – the job that implements the recipe.