nvflare.app_opt.feature_election package

Submodules

Module contents

Feature Election for NVIDIA FLARE

A plug-and-play horizontal federated feature selection framework for tabular datasets.

This module provides:

- FeatureElection: High-level API for feature election
- FeatureElectionController: Server-side FLARE controller
- FeatureElectionExecutor: Client-side FLARE executor
- Helper functions for quick deployment

Example

Basic usage:

from nvflare.app_opt.feature_election import quick_election
import pandas as pd

df = pd.read_csv("data.csv")
selected_mask, stats = quick_election(
    df=df,
    target_col='target',
    num_clients=4,
    fs_method='lasso',
    freedom_degree=0.3
)

FLARE deployment:

from nvflare.app_opt.feature_election import FeatureElection

fe = FeatureElection(freedom_degree=0.5, fs_method='lasso')
config_paths = fe.create_flare_job(
    job_name="feature_selection",
    output_dir="./jobs"
)

class FeatureElection(freedom_degree: float = 0.5, fs_method: str = 'lasso', aggregation_mode: str = 'weighted', auto_tune: bool = False, tuning_rounds: int = 5, eval_metric: str = 'f1', wait_time_after_min_received: int = 10, fs_params: Dict | None = None)[source]

Bases: object

High-level interface for Feature Election in NVIDIA FLARE. Simplifies integration with tabular datasets for federated feature selection.

This class provides:

- Easy data preparation and splitting
- Local simulation for testing
- Result management and persistence

apply_mask(X: DataFrame | ndarray) → DataFrame | ndarray[source]: Apply global feature mask to new data.

create_flare_job(job_name: str = 'feature_election', output_dir: str = 'jobs/feature_election', min_clients: int = 2, num_rounds: int = 5, client_sites: List[str] | None = None) → Dict[str, str][source]: Generate FLARE job configuration.

load_results(filepath: str)[source]: Load results from JSON.

prepare_data_splits(df: DataFrame, target_col: str, num_clients: int = 3, split_strategy: str = 'stratified', split_ratios: List[float] | None = None, random_state: int = 42, dirichlet_alpha: float = 0.5) → List[Tuple[DataFrame, Series]][source]: Prepare data splits for federated clients.

save_results(filepath: str)[source]: Save results to JSON.

simulate_election(client_data: List[Tuple[DataFrame | ndarray, Series | ndarray]], feature_names: List[str] | None = None) → Dict[source]: Simulate election locally.
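The dirichlet_alpha parameter of prepare_data_splits() suggests that a Dirichlet-based non-IID split is available, although this page does not describe how it works. The following is a minimal, standard-library sketch of one common way to do such a split; the function name dirichlet_split and its behaviour are illustrative assumptions, not the library's implementation. Smaller alpha values give more skewed class distributions across clients.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence


def dirichlet_split(
    labels: Sequence[int],
    num_clients: int,
    alpha: float = 0.5,
    random_state: int = 42,
) -> List[List[int]]:
    """Assign row indices to clients with Dirichlet-distributed class shares.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(random_state)

    # Group row indices by class label.
    by_class: Dict[int, List[int]] = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    client_indices: List[List[int]] = [[] for _ in range(num_clients)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # A Dirichlet(alpha, ..., alpha) sample is a set of normalised Gamma draws.
        gammas = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(gammas)
        start = 0
        cumulative = 0.0
        for client in range(num_clients):
            cumulative += gammas[client] / total
            # The last client takes the remainder so every index is assigned once.
            if client == num_clients - 1:
                end = len(indices)
            else:
                end = round(cumulative * len(indices))
            client_indices[client].extend(indices[start:end])
            start = end
    return client_indices


labels = [0] * 50 + [1] * 50
parts = dirichlet_split(labels, num_clients=4, alpha=0.5, random_state=42)
print([len(p) for p in parts])
```

Each returned list holds the row indices for one client; with a pandas DataFrame these could be passed to df.iloc to build the per-client frames.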

class FeatureElectionController(freedom_degree: float = 0.5, aggregation_mode: str = 'weighted', min_clients: int = 2, num_rounds: int = 5, task_name: str = 'feature_election', train_timeout: int = 300, auto_tune: bool = False, tuning_rounds: int = 0, wait_time_after_min_received: int = 10)[source]

Bases: Controller

Three-phase FL controller for federated feature selection and FedAvg training.

Phase 1 — Local Feature Selection: each client runs its configured FS method and returns a feature mask and per-feature scores.

Phase 2 — Tuning & Global Mask Distribution: the server optionally runs hill-climbing to find the optimal freedom_degree, then aggregates client masks via weighted voting and distributes the global feature mask to all clients. If fewer than min_clients clients acknowledge the mask, the entire workflow is aborted.

Phase 3 — FedAvg Training: standard federated averaging on the reduced feature set for num_rounds rounds.

Parameters:

freedom_degree – Threshold in [0, 1] controlling which features survive the vote. 0 = intersection (all clients must select), 1 = union (any client suffices).
aggregation_mode – 'weighted' weights each client by sample count; 'uniform' treats all clients equally.
min_clients – Minimum number of clients that must respond in each phase.
num_rounds – Number of FedAvg training rounds in Phase 3.
task_name – Must match the task_name configured on FeatureElectionExecutor.
train_timeout – Per-phase timeout in seconds.
auto_tune – If True, Phase 2 runs hill-climbing to optimise freedom_degree. Has no effect when tuning_rounds is 0 or 1 (a warning is logged in that case).
tuning_rounds – Number of hill-climbing iterations. Must be >= 2 for meaningful tuning. Values of 0 and 1 both disable tuning, and a warning is logged if auto_tune=True.
wait_time_after_min_received – Seconds to wait for additional client responses after min_clients have already replied. Set to 0 only for local simulation; a non-zero value (default 10 s) prevents slower clients from being silently excluded in heterogeneous production networks.

Controller logic for tasks and their destinations.

set_communicator() must be called before the communication-related function implementations can be accessed.

Parameters:

task_check_period (float, optional) – Interval for checking the status of tasks. Applicable for WFCommServer. Defaults to 0.2.

advance_tuning(score: float, first_step: bool = False) → None[source]

Record a tuning-round score and update freedom_degree for the next round.

This is the public interface for the simulation path in FeatureElection.simulate_election() so that the simulation does not need to mutate private controller state directly. The real FL path in control_flow uses the same internal helpers.

Parameters:

score – Weighted evaluation score for the current freedom_degree.
first_step – True only on the very first tuning round; passed through to _calculate_next_fd to seed the initial direction.
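To make the record-then-update cycle concrete, here is a minimal sketch of a one-dimensional hill climber over freedom_degree. The class FreedomDegreeTuner, its step-halving rule, and the mock_score function are all hypothetical; the controller's actual update rule (in _calculate_next_fd) is not documented on this page and may differ.

```python
class FreedomDegreeTuner:
    """Hypothetical 1-D hill climber over freedom_degree in [0, 1]."""

    def __init__(self, freedom_degree: float = 0.5, step: float = 0.1):
        self.freedom_degree = freedom_degree
        self.step = step
        self.direction = 1  # +1 moves toward union, -1 toward intersection
        self.history = []  # (freedom_degree, score) pairs, one per round
        self.best_fd = freedom_degree
        self.best_score = float("-inf")

    def advance(self, score: float, first_step: bool = False) -> None:
        """Record the score for the current value, then pick the next value."""
        self.history.append((self.freedom_degree, score))
        if first_step or score > self.best_score:
            # The first round sets the baseline; later rounds must beat it.
            self.best_fd, self.best_score = self.freedom_degree, score
        else:
            # No improvement: go back to the best point, reverse, shrink the step.
            self.freedom_degree = self.best_fd
            self.direction *= -1
            self.step /= 2
        candidate = self.freedom_degree + self.direction * self.step
        self.freedom_degree = min(1.0, max(0.0, candidate))  # clamp to [0, 1]


def mock_score(fd: float) -> float:
    """Stand-in for the weighted client score; peaks at fd = 0.3."""
    return 1.0 - abs(fd - 0.3)


tuner = FreedomDegreeTuner(freedom_degree=0.5, step=0.1)
for round_idx in range(6):
    tuner.advance(mock_score(tuner.freedom_degree), first_step=(round_idx == 0))
print(round(tuner.best_fd, 3))
```

Keeping the update behind a single public method, as advance_tuning() does, lets a simulation drive the same logic as the real workflow without touching private state.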

aggregate_selections(client_selections: Dict[str, Dict]) → ndarray[source]

Aggregate feature selections from all clients.

Freedom degree controls the blend between intersection and union:

- FD=0: Intersection (only features selected by ALL clients)
- FD=1: Union (features selected by ANY client)
- 0<FD<1: Weighted voting based on scores
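The blend between intersection and union can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the controller's code: the function aggregate_masks is hypothetical, and the rule used for 0<FD<1 (keep the top FD fraction of contested features, ranked by weighted score) is one plausible reading of "weighted voting based on scores".

```python
from typing import Dict, List


def aggregate_masks(
    masks: Dict[str, List[bool]],
    scores: Dict[str, List[float]],
    num_samples: Dict[str, int],
    freedom_degree: float,
    aggregation_mode: str = "weighted",
) -> List[bool]:
    """Blend the intersection and union of client masks (illustrative only)."""
    clients = sorted(masks)
    n_features = len(masks[clients[0]])

    # 'weighted' uses sample-count shares; anything else treats clients equally.
    if aggregation_mode == "weighted":
        total = sum(num_samples[c] for c in clients)
        weights = {c: num_samples[c] / total for c in clients}
    else:
        weights = {c: 1.0 / len(clients) for c in clients}

    intersection = [all(masks[c][j] for c in clients) for j in range(n_features)]
    union = [any(masks[c][j] for c in clients) for j in range(n_features)]
    if freedom_degree <= 0.0:
        return intersection
    if freedom_degree >= 1.0:
        return union

    # Contested features were selected by some clients but not by all.
    contested = [j for j in range(n_features) if union[j] and not intersection[j]]
    weighted_score = {
        j: sum(weights[c] * scores[c][j] for c in clients if masks[c][j])
        for j in contested
    }
    # Assumption: admit the top freedom_degree fraction of contested features.
    n_keep = round(freedom_degree * len(contested))
    ranked = sorted(contested, key=lambda j: weighted_score[j], reverse=True)
    keep = set(ranked[:n_keep])
    return [intersection[j] or j in keep for j in range(n_features)]


masks = {
    "site-1": [True, True, False, False],
    "site-2": [True, False, True, False],
}
scores = {
    "site-1": [0.9, 0.8, 0.1, 0.0],
    "site-2": [0.7, 0.2, 0.6, 0.0],
}
num_samples = {"site-1": 300, "site-2": 100}
print(aggregate_masks(masks, scores, num_samples, freedom_degree=0.5))
# [True, True, False, False]
```

In this toy input, feature 1 wins the single contested slot because site-1 holds three quarters of the samples and scored it highly.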

control_flow(abort_signal: Signal, fl_ctx: FLContext) → None[source]: Main orchestration loop.

process_result_of_unknown_task(client: Client, task_name: str, client_task_id: str, result: Shareable, fl_ctx: FLContext)[source]: Called when a result is received for an unknown task. This is a fallback - normally results come through task_done_cb.

start_controller(fl_ctx: FLContext) → None[source]

Starts the controller.

This method is called at the beginning of the RUN.

Parameters:

fl_ctx – the FL context. You can use this context to access services provided by the framework. For example, you can get Command Register from it and register your admin command modules.

stop_controller(fl_ctx: FLContext)[source]

Stops the controller.

This method is called right before the RUN is ended.

Parameters:

fl_ctx – the FL context. You can use this context to access services provided by the framework. For example, you can get Command Register from it and unregister your admin command modules.

class FeatureElectionExecutor(fs_method: str = 'lasso', fs_params: Dict | None = None, eval_metric: str = 'f1', task_name: str = 'feature_election')[source]

Bases: Executor

Client-side executor for the Feature Election federated workflow.

Handles four request types dispatched by FeatureElectionController:

feature_selection — runs the configured FS method on local data and returns a boolean feature mask and per-feature scores.
tuning_eval — evaluates a candidate mask proposed by the controller during the hill-climbing phase and returns the local score.
apply_mask — permanently slices X_train / X_val to the selected features. Idempotent: if the same mask is received a second time (e.g. due to task retransmission) the call returns OK immediately without modifying data.
train — performs one FedAvg round on the masked feature set and returns the updated model weights.

Parameters:

fs_method – Feature selection algorithm. One of 'lasso', 'elastic_net', 'mutual_info', 'random_forest', 'pyimpetus'.
fs_params – Extra keyword arguments forwarded to the FS algorithm.
eval_metric – 'f1' (weighted) or 'accuracy', used for tuning eval and local scoring.
task_name – Must match the task_name on FeatureElectionController.

Note

Call set_data() before the executor is registered with the FL runtime. FeatureElectionExecutor has no client_id attribute; use fl_ctx.get_identity_name() inside _load_data_if_needed to retrieve the site name assigned by the FL platform.

Init FLComponent.

The FLComponent is the base class of all FL components: executors, controllers, responders, filters, aggregators, and widgets are all FLComponents.

FLComponents have the capability to handle and fire events and contain various methods for logging.

evaluate_model(X_train, y_train, X_val, y_val, scaler=None) → float[source]

Helper method to train and evaluate a model locally. Required by the simulate_election functionality and by the tests.

Parameters:

scaler – Optional pre-fitted StandardScaler. When provided, the data is transformed (not fit-transformed), so the normalisation parameters the caller established on the same feature set are reused. When None, a fresh scaler is fitted on X_train.

execute(task_name: str, shareable: Shareable, fl_ctx: FLContext, abort_signal: Signal) → Shareable[source]

Executes a task.

Parameters:

task_name (str) – task name.
shareable (Shareable) – input shareable.
fl_ctx (FLContext) – fl context.
abort_signal (Signal) – signal to check during execution to determine whether this task is aborted.

Returns:

An output shareable.

perform_feature_selection() → Tuple[ndarray, ndarray][source]

set_data(X_train, y_train, X_val=None, y_val=None, feature_names=None)[source]: Set data for the executor. X_val and y_val are optional; if not provided, training data is used for evaluation.

load_election_results(filepath: str) → Dict[source]: Load election results from a JSON file.

quick_election(df: DataFrame, target_col: str, num_clients: int = 3, freedom_degree: float = 0.5, fs_method: str = 'lasso', split_strategy: str = 'stratified', **kwargs) → Tuple[ndarray, Dict][source]

Quick Feature Election for tabular data (one-line solution).

**kwargs are routed to either FeatureElection or FeatureElection.prepare_data_splits() based on the parameter name. Recognised split parameters: split_ratios, random_state, dirichlet_alpha. All other kwargs are forwarded to FeatureElection (e.g. aggregation_mode, auto_tune, fs_params).
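The routing rule described above can be sketched in a few lines. The helper route_kwargs and the constant SPLIT_PARAMS are hypothetical names chosen for illustration; only the three recognised split parameter names come from this page.

```python
from typing import Any, Dict, Tuple

# Names this page lists as recognised split parameters.
SPLIT_PARAMS = {"split_ratios", "random_state", "dirichlet_alpha"}


def route_kwargs(kwargs: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (election_kwargs, split_kwargs), partitioned by parameter name."""
    split_kwargs = {k: v for k, v in kwargs.items() if k in SPLIT_PARAMS}
    election_kwargs = {k: v for k, v in kwargs.items() if k not in SPLIT_PARAMS}
    return election_kwargs, split_kwargs


election_kwargs, split_kwargs = route_kwargs(
    {"random_state": 7, "auto_tune": True, "aggregation_mode": "uniform"}
)
print(election_kwargs)  # {'auto_tune': True, 'aggregation_mode': 'uniform'}
print(split_kwargs)     # {'random_state': 7}
```

One consequence of name-based routing is that a misspelled split parameter would be forwarded to FeatureElection rather than to prepare_data_splits().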