nvflare.app_opt.xgboost.recipes.histogram module

class XGBHorizontalRecipe(name: str, min_clients: int, num_rounds: int, early_stopping_rounds: int = 2, use_gpus: bool = False, secure: bool = False, client_ranks: dict | None = None, xgb_params: dict | None = None, data_loader_id: str = 'dataloader', metrics_writer_id: str = 'metrics_writer', per_site_config: dict[str, dict] | None = None)

Bases: Recipe

XGBoost Horizontal Federated Learning Recipe.

This recipe implements horizontal federated XGBoost using histogram-based algorithms. In horizontal federated learning, each client has different samples with the same features. The histogram-based approach enables efficient gradient boosting by computing histograms of gradients and hessians collaboratively across clients.
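As a rough illustration of the histogram idea described above, each client can sum its gradients and hessians per feature bin, and the per-client histograms can then be added elementwise. This is a conceptual sketch only, not NVFlare's or XGBoost's actual implementation; the function names local_histogram and aggregate_histograms are hypothetical.

```python
from typing import List, Sequence, Tuple

Histogram = Tuple[List[float], List[float]]


def local_histogram(
    bin_ids: Sequence[int],
    grads: Sequence[float],
    hessians: Sequence[float],
    num_bins: int,
) -> Histogram:
    """Sum gradients and hessians per feature bin for one client's samples."""
    g_hist = [0.0] * num_bins
    h_hist = [0.0] * num_bins
    for b, g, h in zip(bin_ids, grads, hessians):
        g_hist[b] += g
        h_hist[b] += h
    return g_hist, h_hist


def aggregate_histograms(histograms: Sequence[Histogram]) -> Histogram:
    """Add per-client histograms bin by bin (hypothetical server-side step)."""
    num_bins = len(histograms[0][0])
    g_total = [0.0] * num_bins
    h_total = [0.0] * num_bins
    for g_hist, h_hist in histograms:
        for i in range(num_bins):
            g_total[i] += g_hist[i]
            h_total[i] += h_hist[i]
    return g_total, h_total


# Two clients hold different samples of the same (already binned) feature.
site_1 = local_histogram([0, 1, 1], [0.5, -0.5, 0.25], [0.25, 0.25, 0.25], num_bins=3)
site_2 = local_histogram([1, 2], [0.5, -0.25], [0.25, 0.25], num_bins=3)
global_hist = aggregate_histograms([site_1, site_2])
```

Because the sums are per bin, the aggregated histogram matches the one that would be computed on the pooled samples, so split finding can proceed without clients exchanging raw rows.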

Parameters:
  • name (str) – Name of the federated job.

  • min_clients (int) – The minimum number of clients for the job.

  • num_rounds (int) – Number of boosting rounds.

  • early_stopping_rounds (int, optional) – Early stopping rounds. Default is 2.

  • use_gpus (bool, optional) – Whether to use GPUs for training. Default is False.

  • secure (bool, optional) – Enable secure training with Homomorphic Encryption (HE). Default is False. Requires encryption plugins to be installed and configured. When secure=True, client_ranks must be provided.

  • client_ranks (dict, optional) – Mapping of each client name to a unique, 0-indexed rank for secure training. Required when secure=True. Example: {"site-1": 0, "site-2": 1, "site-3": 2}.

  • xgb_params (dict, optional) – XGBoost parameters passed to xgboost.train(). If None, the following defaults are used: max_depth=8, eta=0.1, objective='binary:logistic', eval_metric='auc', tree_method='hist', nthread=16.

  • data_loader_id (str, optional) – ID of the data loader component. Default is 'dataloader'.

  • metrics_writer_id (str, optional) – ID of the metrics writer component. Default is 'metrics_writer'.

  • per_site_config (dict) – Per-site configuration mapping site names to config dicts. Each config dict must contain a 'data_loader' key whose value is an XGBDataLoader instance. Example: {"site-1": {"data_loader": CSVDataLoader(…)}, "site-2": {…}}
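The client_ranks mapping described above can be derived from an ordered list of site names. The helper below is a minimal sketch; build_client_ranks is a hypothetical name and is not part of NVFlare.

```python
from typing import Dict, Sequence


def build_client_ranks(site_names: Sequence[str]) -> Dict[str, int]:
    """Assign each site a unique 0-indexed rank in list order (hypothetical helper)."""
    if len(set(site_names)) != len(site_names):
        raise ValueError("site names must be unique so that ranks are unique")
    return {name: rank for rank, name in enumerate(site_names)}


client_ranks = build_client_ranks(["site-1", "site-2", "site-3"])
```

The resulting dict can be passed as client_ranks together with secure=True, provided the encryption plugins are installed and configured.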

Example

from nvflare.app_opt.xgboost.recipes import XGBHorizontalRecipe
from nvflare.app_opt.xgboost.histogram_based_v2.csv_data_loader import CSVDataLoader
from nvflare.recipe import SimEnv

# Build per-site configuration with data loaders
per_site_config = {
    "site-1": {"data_loader": CSVDataLoader(folder="/tmp/data/horizontal_xgb_data")},
    "site-2": {"data_loader": CSVDataLoader(folder="/tmp/data/horizontal_xgb_data")},
}

# Create recipe
recipe = XGBHorizontalRecipe(
    name="xgb_higgs_horizontal",
    min_clients=2,
    num_rounds=100,
    xgb_params={
        "max_depth": 8,
        "eta": 0.1,
        "objective": "binary:logistic",
        "eval_metric": "auc",
    },
    per_site_config=per_site_config,
)

# Run simulation with explicit client list
clients = list(per_site_config.keys())
env = SimEnv(clients=clients)
run = recipe.execute(env)
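The example above passes only some XGBoost parameters. This page does not say whether a partial xgb_params dict is merged with the documented defaults, so one cautious option is to build the full dict explicitly. The sketch below assumes the default values listed under Parameters; DEFAULT_XGB_PARAMS and with_defaults are hypothetical names, not NVFlare APIs.

```python
from typing import Any, Dict, Optional

# Defaults as listed in the parameter description (copied here as an assumption).
DEFAULT_XGB_PARAMS: Dict[str, Any] = {
    "max_depth": 8,
    "eta": 0.1,
    "objective": "binary:logistic",
    "eval_metric": "auc",
    "tree_method": "hist",
    "nthread": 16,
}


def with_defaults(overrides: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Return a new dict: documented defaults updated with the caller's overrides."""
    params = dict(DEFAULT_XGB_PARAMS)  # copy so the defaults are never mutated
    params.update(overrides or {})
    return params


xgb_params = with_defaults({"max_depth": 6, "nthread": 4})
```

The merged dict can then be passed as xgb_params, which makes settings such as tree_method explicit regardless of how the recipe treats partial dicts.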

Note

  • Data loaders must be configured via the per_site_config parameter.

  • TensorBoard tracking is automatically configured for both server and clients.

  • Executor and metrics components are automatically added to all clients.
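Since every site needs a data loader in per_site_config, a quick pre-flight check can catch missing entries before the recipe is created. This is a minimal sketch; check_per_site_config is a hypothetical helper, and it checks only for the presence of the 'data_loader' key, not the loader's type.

```python
from typing import Any, Dict, List, Sequence


def check_per_site_config(
    per_site_config: Dict[str, Dict[str, Any]],
    clients: Sequence[str],
) -> List[str]:
    """Return a list of human-readable problems; an empty list means the check passed."""
    problems = []
    for site in clients:
        config = per_site_config.get(site)
        if config is None:
            problems.append(f"{site}: no entry in per_site_config")
        elif "data_loader" not in config:
            problems.append(f"{site}: missing 'data_loader' key")
    return problems


# Placeholder objects stand in for real data loader instances.
problems = check_per_site_config(
    {"site-1": {"data_loader": object()}, "site-2": {}},
    clients=["site-1", "site-2", "site-3"],
)
```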

This is the base class of a recipe. Recipes are implemented by jobs. A concrete recipe must provide the job that implements it.

Parameters:
  • job – The job that implements the recipe.

configure()

Configure the federated job for XGBoost histogram-based training.