nvflare.recipe package

Submodules

Module contents

class PocEnv(*, num_clients: int | None = 2, clients: list[str] | None = None, gpu_ids: list[int] | None = None, use_he: bool = False, docker_image: str | None = None, project_conf_path: str = '', username: str = 'admin@nvidia.com', extra: dict | None = None)

Bases: ExecEnv

Proof of Concept execution environment for local testing and development.

This environment sets up a POC deployment on a single machine with multiple processes representing the server, clients, and admin console.

Initialize POC execution environment.

Parameters:
  • num_clients (int, optional) – Number of clients to use in POC mode. Defaults to 2.

  • clients (list[str], optional) – List of client names. If None, names are generated as site-1, site-2, etc. Defaults to None. If specified, the num_clients argument is ignored.

  • gpu_ids (list[int], optional) – List of GPU IDs to assign to clients. If None, uses CPU only. Defaults to None.

  • use_he (bool, optional) – Whether to use homomorphic encryption (HE). Defaults to False.

  • docker_image (str, optional) – Docker image to use for POC. Defaults to None.

  • project_conf_path (str, optional) – Path to the project configuration file. Defaults to “”. If specified, the ‘num_clients’, ‘clients’, and Docker-specific options (‘docker_image’) are ignored.

  • username (str, optional) – Admin user. Defaults to “admin@nvidia.com”.

  • extra (dict, optional) – Extra environment information. Defaults to None.
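The precedence between num_clients and clients described above can be sketched in plain Python. This is an illustration of the documented rule only, not NVFlare's implementation; the helper name resolve_poc_clients is hypothetical, and treating a None num_clients as the documented default of 2 is an assumption.

```python
from __future__ import annotations

from typing import Optional


def resolve_poc_clients(num_clients: Optional[int] = 2,
                        clients: Optional[list[str]] = None) -> list[str]:
    """Hypothetical helper mirroring the documented precedence rule."""
    if clients is not None:
        # An explicit client list wins; num_clients is ignored.
        return list(clients)
    # Assumption: a None num_clients falls back to the documented default of 2.
    count = 2 if num_clients is None else num_clients
    if count < 1:
        raise ValueError("num_clients must be at least 1")
    # Generated names follow the documented pattern site-1, site-2, ...
    return [f"site-{i}" for i in range(1, count + 1)]


default_names = resolve_poc_clients()
explicit_names = resolve_poc_clients(5, ["alpha", "beta"])
print(default_names)
print(explicit_names)
```

Because every constructor argument is keyword-only, the corresponding call would be written as PocEnv(clients=["alpha", "beta"]) rather than positionally.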

abort_job(job_id: str) → None

Abort a running job.

Parameters:

job_id – The job ID to abort.

deploy(job: FedJob)

Deploy a FedJob to the POC environment.

Parameters:

job (FedJob) – The FedJob to deploy.

Returns:

Job ID or deployment result.

Return type:

str

get_job_result(job_id: str, timeout: float = 0.0) → str | None

Get the result workspace of a job.

Parameters:
  • job_id – The job ID to get results for.

  • timeout – The timeout for the job to complete. Defaults to 0.0 (no timeout).

Returns:

The result workspace path if job completed, None if still running or stopped early.

Return type:

Optional[str]

get_job_status(job_id: str) → str | None

Get the status of a job.

Parameters:

job_id – The job ID to check status for.

Returns:

The status of the job, or None if not supported.

Return type:

Optional[str]
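Since get_job_result returns None both while a job is still running and after it stopped early, a caller usually wants the status alongside it. The sketch below is one way to combine the two calls; it is duck-typed so it runs without NVFlare installed. JobEnv, result_or_status, and StubEnv are hypothetical names, and the status strings are placeholders, not values NVFlare is known to return.

```python
from __future__ import annotations

from typing import Optional, Protocol


class JobEnv(Protocol):
    """Structural type for the two query methods documented above."""

    def get_job_status(self, job_id: str) -> Optional[str]: ...

    def get_job_result(self, job_id: str, timeout: float = 0.0) -> Optional[str]: ...


def result_or_status(env: JobEnv, job_id: str,
                     timeout: float = 0.0) -> tuple[Optional[str], Optional[str]]:
    """Return (workspace, None) when a result exists, else (None, status)."""
    workspace = env.get_job_result(job_id, timeout=timeout)
    if workspace is not None:
        return workspace, None
    # No workspace: the job is still running or stopped early, so report status.
    return None, env.get_job_status(job_id)


class StubEnv:
    """Hypothetical stand-in used only to exercise the helper."""

    def __init__(self, workspace: Optional[str], status: Optional[str]) -> None:
        self._workspace = workspace
        self._status = status

    def get_job_status(self, job_id: str) -> Optional[str]:
        return self._status

    def get_job_result(self, job_id: str, timeout: float = 0.0) -> Optional[str]:
        return self._workspace


# Placeholder paths and status strings, chosen only for the demonstration.
done = result_or_status(StubEnv("/tmp/example-workspace", "placeholder-finished"), "job-1")
pending = result_or_status(StubEnv(None, "placeholder-running"), "job-1", timeout=1.0)
print(done)
print(pending)
```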

stop(clean_poc: bool = False)

Try to stop the existing POC and, optionally, clean its workspace.

Parameters:

clean_poc (bool, optional) – Whether to clean the POC workspace. Defaults to False.

class ProdEnv(startup_kit_location: str, login_timeout: float = 5.0, username: str = 'admin@nvidia.com', extra: dict | None = None)

Bases: ExecEnv

Production execution environment for submitting and monitoring NVFlare jobs.

This environment uses the startup kit of an NVFlare deployment to submit jobs via the Flare API.

Parameters:
  • startup_kit_location (str) – Path to the admin’s startup kit directory.

  • login_timeout (float) – Timeout (in seconds) for logging into the Flare API session. Must be > 0. Defaults to 5.0.

  • username (str) – Username to log in with. Defaults to “admin@nvidia.com”.

  • extra (dict, optional) – Extra environment information. Defaults to None.
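The constraints on these arguments can be checked before a ProdEnv is constructed. The following is a minimal sketch, not part of NVFlare: check_prod_env_args is a hypothetical helper, and the requirement that the startup kit path be an existing directory is an assumption drawn from the parameter description rather than a documented check.

```python
from __future__ import annotations

import os
import tempfile


def check_prod_env_args(startup_kit_location: str, login_timeout: float = 5.0) -> None:
    """Hypothetical pre-flight check for the ProdEnv arguments."""
    # Documented constraint: the login timeout must be strictly positive.
    if login_timeout <= 0:
        raise ValueError(f"login_timeout must be > 0, got {login_timeout}")
    # Assumption: the startup kit location should be an existing directory.
    if not os.path.isdir(startup_kit_location):
        raise FileNotFoundError(f"startup kit directory not found: {startup_kit_location}")


# Demonstration with a temporary directory standing in for a startup kit.
with tempfile.TemporaryDirectory() as kit_dir:
    check_prod_env_args(kit_dir, login_timeout=5.0)
    try:
        check_prod_env_args(kit_dir, login_timeout=0.0)
        rejected_zero_timeout = False
    except ValueError:
        rejected_zero_timeout = True

print(rejected_zero_timeout)
```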

abort_job(job_id: str) → None

Abort a running job.

Parameters:

job_id – The job ID to abort.

deploy(job: FedJob)

Deploy a job using SessionManager.

get_job_result(job_id: str, timeout: float = 0.0) → str | None

Get the result workspace of a job.

Parameters:
  • job_id – The job ID to get results for.

  • timeout – The timeout for the job to complete. Defaults to 0.0 (no timeout).

Returns:

The result workspace path if job completed, None if still running or stopped early.

Return type:

Optional[str]

get_job_status(job_id: str) → str | None

Get the status of a job.

Parameters:

job_id – The job ID to check status for.

Returns:

The status of the job, or None if not supported.

Return type:

Optional[str]

class Run(exec_env: ExecEnv, job_id: str)

Bases: object

abort()

Abort the running job.

get_job_id() → str

Get the job ID of the run.

get_result(timeout: float = 0.0) → str | None

Get the result workspace of the run.

Parameters:

timeout (float, optional) – The timeout for the job to complete. Defaults to 0.0, which means no timeout.

Returns:

The result workspace path if job completed, None if still running or stopped early.

Return type:

Optional[str]

get_status() → str | None

Get the status of the run.

Returns:

The status of the run, or None if called in a simulation environment.

Return type:

Optional[str]
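The three Run methods above can be combined into a bounded wait that aborts a job which has not produced a result in time. This is a sketch under stated assumptions, not an NVFlare utility: result_or_abort and StubRun are hypothetical, the status strings are placeholders, and skipping the abort when get_status() returns None reflects the note that status and abort are unavailable in a simulation environment. Whether aborting a job that already stopped early is harmless is not stated above.

```python
from __future__ import annotations

from typing import Optional


def result_or_abort(run, timeout: float) -> Optional[str]:
    """Wait up to `timeout` for a result; abort the job if none arrives."""
    if timeout <= 0:
        # 0.0 is documented as "no timeout", so a bounded wait needs a positive value.
        raise ValueError("timeout must be positive for a bounded wait")
    workspace = run.get_result(timeout=timeout)
    if workspace is None and run.get_status() is not None:
        # Assumption: a non-None status means the environment supports abort.
        run.abort()
    return workspace


class StubRun:
    """Hypothetical stand-in for Run, used only to exercise the helper."""

    def __init__(self, workspace: Optional[str], status: Optional[str]) -> None:
        self._workspace = workspace
        self._status = status
        self.aborted = False

    def get_result(self, timeout: float = 0.0) -> Optional[str]:
        return self._workspace

    def get_status(self) -> Optional[str]:
        return self._status

    def abort(self) -> None:
        self.aborted = True


slow_run = StubRun(workspace=None, status="placeholder-running")
finished_run = StubRun(workspace="/tmp/example-workspace", status="placeholder-finished")
print(result_or_abort(slow_run, timeout=2.0), slow_run.aborted)
print(result_or_abort(finished_run, timeout=2.0), finished_run.aborted)
```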

class SimEnv(*, num_clients: int = 0, clients: list[str] | None = None, num_threads: int | None = None, gpu_config: str | None = None, log_config: str | None = None, workspace_root: str = '/tmp/nvflare/simulation', extra: dict | None = None)

Bases: ExecEnv

Initialize simulation execution environment.

Parameters:
  • num_clients (int, optional) – Number of simulated clients. Defaults to 0.

  • clients (list[str], optional) – List of client names. Defaults to None.

  • num_threads (int, optional) – Number of threads to run simulator. Defaults to None. If not provided, the number of threads will be set to the number of clients.

  • gpu_config (str, optional) – GPU configuration string. Defaults to None.

  • log_config (str, optional) – Log configuration string. Defaults to None.

  • workspace_root (str, optional) – Root directory for simulation workspace. Defaults to WORKSPACE_ROOT (‘/tmp/nvflare/simulation’).

  • extra (dict, optional) – Extra environment configuration information. Defaults to None.
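The num_threads fallback described above can be illustrated with a few lines of plain Python. This is a sketch of the documented rule, not the simulator's code; effective_num_threads is a hypothetical helper, and counting the entries of clients when a list is supplied is an assumption.

```python
from __future__ import annotations

from typing import Optional


def effective_num_threads(num_clients: int = 0,
                          clients: Optional[list[str]] = None,
                          num_threads: Optional[int] = None) -> int:
    """Hypothetical helper: thread count the documented rule would yield."""
    if num_threads is not None:
        # An explicit thread count is used as given.
        return num_threads
    # Assumption: when a client list is supplied, its length is the client count.
    if clients:
        return len(clients)
    # Documented fallback: one thread per client.
    return num_clients


print(effective_num_threads(num_clients=4))
print(effective_num_threads(clients=["site-1", "site-2"]))
print(effective_num_threads(num_clients=8, num_threads=2))
```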

abort_job(job_id: str) → None

Abort job - not supported in simulation environment.

deploy(job: FedJob)

Deploy a FedJob and return an execution response.

Parameters:

job – The FedJob to deploy.

Returns:

The job ID.

Return type:

str

get_job_result(job_id: str, timeout: float = 0.0) → str | None

Get job result workspace path.

get_job_status(job_id: str) → str | None

Get job status - not supported in simulation environment.

add_experiment_tracking(recipe: Recipe, tracking_type: str, tracking_config: dict | None = None)

Enable experiment tracking.

Parameters:
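  • recipe (Recipe) – The recipe to enable experiment tracking for.

  • tracking_type (str) – The type of tracking to enable.

  • tracking_config (dict, optional) – The configuration for the tracking. Defaults to None.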