nvflare.app_opt.job_launcher.docker_launcher module

class ClientDockerJobLauncher(workspace: str | None = None, network: str = 'nvflare-network', python_path: str | None = None, timeout: int = 30, default_job_container_kwargs: dict | None = None, default_job_env: dict | None = None, default_python_path: str | None = None)

Bases: DockerJobLauncher

Parameters:
  • workspace – host path to the NVFlare workspace directory. Job containers receive an isolated workspace view: startup/local are mounted read-only, and the current job workspace is mounted read-write at /var/tmp/nvflare/workspace/<job_id>. If not provided, the value is read from the NVFL_DOCKER_WORKSPACE environment variable. This must be the HOST path because it is passed directly to the Docker daemon as a volume bind source.

  • network – Docker network name. The network must already exist.

  • python_path – Deprecated alias for default_python_path.

  • timeout – maximum number of seconds to wait for the container to reach the RUNNING state (default 30).

  • default_job_container_kwargs – site-level default docker run kwargs applied to every job container launched by this site. Job-level resource_spec[site][docker] takes precedence on conflict. Keys use Docker SDK naming (underscores, not hyphens). Example: {"shm_size": "8g", "ipc_mode": "host"}. Note: "volumes", "mounts", "network", "environment", "command", "name", "detach", "user", and "working_dir" are controlled by the launcher and cannot be overridden here.

  • default_job_env – site-level default environment variables injected into every job container launched by this site. Useful for site/runtime-specific settings such as NCCL workarounds. Launcher-controlled variables like USER, HOME, and PYTHONPATH still take precedence.

  • default_python_path – Default Python executable path inside job containers. Jobs can override it with launcher_spec[site]["docker"]["python_path"].
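
As a rough illustration, the constructor arguments above might be supplied through a component entry in a site's resource configuration. The sketch below builds such an entry as a Python dict. The "id" value, the host workspace path, and the NCCL_P2P_DISABLE setting are placeholders chosen for illustration, and the id/path/args layout is an assumption about how the component would be registered, not something stated in this reference.

```python
import json

# Hypothetical component entry: "id", the host path, and the env value are placeholders.
client_launcher_component = {
    "id": "docker_launcher",
    "path": "nvflare.app_opt.job_launcher.docker_launcher.ClientDockerJobLauncher",
    "args": {
        # Must be the path on the Docker HOST, not a path inside the parent container.
        "workspace": "/opt/nvflare/site-1",
        "network": "nvflare-network",
        "timeout": 30,
        # Docker SDK naming: underscores, not hyphens.
        "default_job_container_kwargs": {"shm_size": "8g", "ipc_mode": "host"},
        "default_job_env": {"NCCL_P2P_DISABLE": "1"},
        # Preferred over the deprecated python_path argument.
        "default_python_path": "/usr/local/bin/python",
    },
}

component_json = json.dumps(client_launcher_component, indent=2)
print(component_json)
```

The deprecated python_path argument is left out on purpose, since default_python_path replaces it.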

get_module_args(job_args: dict) → dict

Return a {flag: value} dict of args to pass to the job module.

Parameters:

job_args – JOB_PROCESS_ARGS dict from FLContext (with PARENT_URL already overridden).

Returns:

dict of {flag: value} pairs to append after '-u -m <module>' in the container command.

Return type:

dict
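
To make the role of the returned dict concrete, here is a minimal sketch of how {flag: value} pairs could be appended after '-u -m <module>'. The helper build_container_command, the module name, and the flag names are hypothetical and exist only for this example; the launcher's actual command assembly may differ.

```python
from typing import Dict, List


def build_container_command(python_path: str, module: str, module_args: Dict[str, str]) -> List[str]:
    """Hypothetical helper: assemble '<python> -u -m <module> <flag> <value> ...'."""
    command = [python_path, "-u", "-m", module]
    for flag, value in module_args.items():
        # Each flag is followed directly by its value, in dict insertion order.
        command.extend([flag, str(value)])
    return command


# The flags and module name below are made up for illustration.
example_args = {"-w": "/var/tmp/nvflare/workspace", "-n": "job-1234"}
command = build_container_command("/usr/local/bin/python", "example.worker_module", example_args)
print(" ".join(command))
```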

class DockerJobHandle(container_id: str, container_name: str, docker_client, timeout: int = 30)

Bases: JobHandleSpec

Handle for a running Docker container job.

Modeled on K8sJobHandle: once the container reaches a terminal state, terminal_state is set and all subsequent poll()/wait() calls return immediately without querying Docker.

enter_states(states_to_enter: list) → bool

Poll until the container enters one of the target states.

Returns True if the target state was reached, False otherwise (timeout, stuck, or terminal state reached before target).
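
The polling behavior described above can be sketched as a small loop. This is an illustration under assumptions, not the class's implementation: wait_for_states and its get_status callback are hypothetical, 'exited' and 'dead' are assumed to be the terminal states, and the "stuck" case mentioned above is not modeled.

```python
import time
from typing import Callable, Iterable

TERMINAL_STATES = {"exited", "dead"}  # assumption: treated as terminal in this sketch


def wait_for_states(
    get_status: Callable[[], str],
    targets: Iterable[str],
    timeout: float = 30.0,
    interval: float = 1.0,
) -> bool:
    """Poll get_status() until a target state is seen, a terminal state is hit, or time runs out."""
    wanted = set(targets)
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in wanted:
            return True  # target state reached
        if status in TERMINAL_STATES:
            return False  # container finished before reaching the target
        if time.monotonic() >= deadline:
            return False  # timed out
        time.sleep(interval)


# Simulated status sequence standing in for real Docker queries.
statuses = iter(["created", "created", "running"])
reached = wait_for_states(lambda: next(statuses), ["running"], timeout=5, interval=0.01)
print(reached)
```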

poll() → JobReturnCode

Non-blocking status check. Returns UNKNOWN while still running.

terminate()

Stop and remove the container. Always sets terminal_state.

wait()

Block until the container reaches a terminal state.
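
The caching of terminal_state described for this class can be illustrated with a small stand-in. Everything here is hypothetical: ReturnCode is a simplified substitute for JobReturnCode with illustrative values, the inspect callback replaces real Docker queries, and the mapping from exit code to return code is an assumption.

```python
from enum import Enum
from typing import Callable, Optional, Tuple


class ReturnCode(Enum):
    """Simplified stand-in for JobReturnCode; the values are illustrative only."""
    SUCCESS = 0
    EXECUTION_ERROR = 1
    UNKNOWN = 2


class CachingHandle:
    """Sketch of a handle that stops querying once a terminal state is recorded."""

    def __init__(self, inspect: Callable[[], Tuple[str, Optional[int]]]):
        self._inspect = inspect  # returns (status, exit_code)
        self.terminal_state: Optional[ReturnCode] = None
        self.queries = 0

    def poll(self) -> ReturnCode:
        if self.terminal_state is not None:
            return self.terminal_state  # cached: no further status query
        self.queries += 1
        status, exit_code = self._inspect()
        if status in ("exited", "dead"):  # assumption: these are the terminal states
            self.terminal_state = (
                ReturnCode.SUCCESS if exit_code == 0 else ReturnCode.EXECUTION_ERROR
            )
            return self.terminal_state
        return ReturnCode.UNKNOWN  # still running


# Only two status reports are available; later polls must come from the cache.
states = iter([("running", None), ("exited", 0)])
handle = CachingHandle(lambda: next(states))
results = [handle.poll() for _ in range(4)]
print(results, handle.queries)
```

In this sketch the third and fourth poll() calls never touch the status source, which mirrors the "return immediately without querying Docker" behavior described above.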

class DockerJobLauncher(workspace: str | None = None, network: str = 'nvflare-network', python_path: str | None = None, timeout: int = 30, default_job_container_kwargs: dict | None = None, default_job_env: dict | None = None, default_python_path: str | None = None)

Bases: JobLauncherSpec

Launches NVFlare job processes as Docker containers.

SP/CP runs as a container started by start_docker.sh (site admin). SJ/CJ containers are started dynamically per job by this launcher.

Assumptions:
  • Docker network already exists (created by start_docker.sh or site admin).

  • Job containers get an isolated workspace view at /var/tmp/nvflare/workspace: the root is writable ephemeral tmpfs, startup/local are read-only, and only the current job workspace is read-write and persistent on the host.

  • SP/CP container name is known and reachable via Docker DNS on the network.

  • parent_url is derived at runtime from the site name and the port in JOB_PROCESS_ARGS.
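
The isolated workspace view described in the assumptions can be sketched as the kind of mount specification the Docker SDK accepts. This is only an illustration: build_workspace_mounts is a hypothetical helper, the host-side layout (startup, local, and <job_id> directly under the workspace) and the in-container paths for startup and local are assumptions, and the launcher may build its mounts differently.

```python
from pathlib import PurePosixPath
from typing import Dict, Tuple

WORKSPACE_MOUNT = "/var/tmp/nvflare/workspace"


def build_workspace_mounts(host_workspace: str, job_id: str) -> Tuple[Dict[str, dict], Dict[str, str]]:
    """Hypothetical helper returning (volumes, tmpfs) dicts in Docker SDK style."""
    host = PurePosixPath(host_workspace)
    root = PurePosixPath(WORKSPACE_MOUNT)
    volumes = {
        # startup/local: visible to the job, but read-only.
        str(host / "startup"): {"bind": str(root / "startup"), "mode": "ro"},
        str(host / "local"): {"bind": str(root / "local"), "mode": "ro"},
        # Only the current job workspace is writable and persists on the host.
        str(host / job_id): {"bind": str(root / job_id), "mode": "rw"},
    }
    # The workspace root itself is an ephemeral, writable tmpfs (no extra options).
    tmpfs = {WORKSPACE_MOUNT: ""}
    return volumes, tmpfs


volumes, tmpfs = build_workspace_mounts("/opt/nvflare/site-1", "job-1234")
for source, spec in volumes.items():
    print(f"{source} -> {spec['bind']} ({spec['mode']})")
```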

Parameters:
  • workspace – host path to the NVFlare workspace directory. Job containers receive an isolated workspace view: startup/local are mounted read-only, and the current job workspace is mounted read-write at /var/tmp/nvflare/workspace/<job_id>. If not provided, the value is read from the NVFL_DOCKER_WORKSPACE environment variable. This must be the HOST path because it is passed directly to the Docker daemon as a volume bind source.

  • network – Docker network name. The network must already exist.

  • python_path – Deprecated alias for default_python_path.

  • timeout – maximum number of seconds to wait for the container to reach the RUNNING state (default 30).

  • default_job_container_kwargs – site-level default docker run kwargs applied to every job container launched by this site. Job-level resource_spec[site][docker] takes precedence on conflict. Keys use Docker SDK naming (underscores, not hyphens). Example: {"shm_size": "8g", "ipc_mode": "host"}. Note: "volumes", "mounts", "network", "environment", "command", "name", "detach", "user", and "working_dir" are controlled by the launcher and cannot be overridden here.

  • default_job_env – site-level default environment variables injected into every job container launched by this site. Useful for site/runtime-specific settings such as NCCL workarounds. Launcher-controlled variables like USER, HOME, and PYTHONPATH still take precedence.

  • default_python_path – Default Python executable path inside job containers. Jobs can override it with launcher_spec[site]["docker"]["python_path"].
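
The precedence rules in the parameter descriptions can be summarized with a small sketch. Both helpers are hypothetical. In particular, this sketch silently drops launcher-controlled keys, which is an assumption: the reference only says they cannot be overridden, not how the launcher reacts to them.

```python
from typing import Dict, Optional

# Keys the reference lists as launcher-controlled.
RESERVED_KEYS = {
    "volumes", "mounts", "network", "environment", "command",
    "name", "detach", "user", "working_dir",
}


def merge_container_kwargs(site_defaults: Optional[dict], job_overrides: Optional[dict]) -> dict:
    """Site defaults first, job-level values win on conflict, reserved keys dropped (assumption)."""
    merged = dict(site_defaults or {})
    merged.update(job_overrides or {})
    return {key: value for key, value in merged.items() if key not in RESERVED_KEYS}


def merge_env(site_env: Optional[Dict[str, str]], launcher_env: Dict[str, str]) -> Dict[str, str]:
    """Site-level env first; launcher-controlled variables take precedence."""
    merged = dict(site_env or {})
    merged.update(launcher_env)
    return merged


kwargs = merge_container_kwargs(
    {"shm_size": "8g", "ipc_mode": "host", "network": "other-network"},
    {"shm_size": "16g"},
)
env = merge_env({"HOME": "/tmp", "NCCL_P2P_DISABLE": "1"}, {"HOME": "/home/job"})
print(kwargs)
print(env)
```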

DEFAULT_PYTHON_PATH = '/usr/local/bin/python'
STUDY_DATA_PATH_FILE = 'local/study_data.yaml'
WORKSPACE_MOUNT = '/var/tmp/nvflare/workspace'

abstract get_module_args(job_args: dict) → dict

Return a {flag: value} dict of args to pass to the job module.

Parameters:

job_args – JOB_PROCESS_ARGS dict from FLContext (with PARENT_URL already overridden).

Returns:

dict of {flag: value} pairs to append after '-u -m <module>' in the container command.

Return type:

dict
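
Because get_module_args is abstract on the base launcher, each concrete launcher supplies its own selection of arguments. The sketch below shows only the override pattern. The class names, the argument keys, the flags, and the assumption that job_args maps names to (flag, value) pairs are all hypothetical and not taken from this reference.

```python
from abc import ABC, abstractmethod
from typing import Dict, Tuple


class LauncherSketch(ABC):
    """Stand-in base class; it only models the abstract method."""

    @abstractmethod
    def get_module_args(self, job_args: Dict[str, Tuple[str, str]]) -> Dict[str, str]:
        """Return {flag: value} pairs for the job module."""


class ClientLauncherSketch(LauncherSketch):
    # Hypothetical subset of argument names this launcher forwards.
    WANTED = ("workspace", "job_id")

    def get_module_args(self, job_args: Dict[str, Tuple[str, str]]) -> Dict[str, str]:
        # Assumes each entry is a (flag, value) pair; forwards only the wanted ones.
        return {
            job_args[name][0]: job_args[name][1]
            for name in self.WANTED
            if name in job_args
        }


sample_job_args = {
    "workspace": ("-w", "/var/tmp/nvflare/workspace"),
    "job_id": ("-n", "job-1234"),
    "unused": ("-x", "ignored"),
}
module_args = ClientLauncherSketch().get_module_args(sample_job_args)
print(module_args)
```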

handle_event(event_type: str, fl_ctx: FLContext)

Handles events.

Parameters:
  • event_type (str) – event type fired by workflow.

  • fl_ctx (FLContext) – FLContext information.

launch_job(job_meta: dict, fl_ctx: FLContext) → JobHandleSpec

Launches a job run.

Parameters:
  • job_meta – job metadata

  • fl_ctx – FLContext

Returns: a JobHandle instance.

class DockerStatus

Bases: object

CREATED = 'created'
DEAD = 'dead'
EXITED = 'exited'
PAUSED = 'paused'
RESTARTING = 'restarting'
RUNNING = 'running'

class ServerDockerJobLauncher(workspace: str | None = None, network: str = 'nvflare-network', python_path: str | None = None, timeout: int = 30, default_job_container_kwargs: dict | None = None, default_job_env: dict | None = None, default_python_path: str | None = None)

Bases: DockerJobLauncher

Parameters:
  • workspace – host path to the NVFlare workspace directory. Job containers receive an isolated workspace view: startup/local are mounted read-only, and the current job workspace is mounted read-write at /var/tmp/nvflare/workspace/<job_id>. If not provided, the value is read from the NVFL_DOCKER_WORKSPACE environment variable. This must be the HOST path because it is passed directly to the Docker daemon as a volume bind source.

  • network – Docker network name. The network must already exist.

  • python_path – Deprecated alias for default_python_path.

  • timeout – maximum number of seconds to wait for the container to reach the RUNNING state (default 30).

  • default_job_container_kwargs – site-level default docker run kwargs applied to every job container launched by this site. Job-level resource_spec[site][docker] takes precedence on conflict. Keys use Docker SDK naming (underscores, not hyphens). Example: {"shm_size": "8g", "ipc_mode": "host"}. Note: "volumes", "mounts", "network", "environment", "command", "name", "detach", "user", and "working_dir" are controlled by the launcher and cannot be overridden here.

  • default_job_env – site-level default environment variables injected into every job container launched by this site. Useful for site/runtime-specific settings such as NCCL workarounds. Launcher-controlled variables like USER, HOME, and PYTHONPATH still take precedence.

  • default_python_path – Default Python executable path inside job containers. Jobs can override it with launcher_spec[site]["docker"]["python_path"].

get_module_args(job_args: dict) → dict

Return a {flag: value} dict of args to pass to the job module.

Parameters:

job_args – JOB_PROCESS_ARGS dict from FLContext (with PARENT_URL already overridden).

Returns:

dict of {flag: value} pairs to append after '-u -m <module>' in the container command.

Return type:

dict