nvflare.app_opt.job_launcher.docker_launcher module
- class ClientDockerJobLauncher(workspace: str | None = None, network: str = 'nvflare-network', python_path: str | None = None, timeout: int = 30, default_job_container_kwargs: dict | None = None, default_job_env: dict | None = None, default_python_path: str | None = None)[source]
Bases:
DockerJobLauncher
- Parameters:
workspace – host path to the NVFlare workspace directory. Job containers receive an isolated workspace view: startup/local are mounted read-only, and the current job workspace is mounted read-write at /var/tmp/nvflare/workspace/<job_id>. If not provided, reads from NVFL_DOCKER_WORKSPACE environment variable. Must be the HOST path because it is passed directly to the Docker daemon as a volume bind source.
network – Docker network name. Must already exist.
python_path – Deprecated alias for default_python_path.
timeout – max seconds to wait for container to reach RUNNING state (default 30).
default_job_container_kwargs – site-level default docker run kwargs applied to every job container launched by this site. Job-level resource_spec[site][docker] takes precedence on conflict. Keys use Docker SDK naming (underscores, not hyphens). Example: {"shm_size": "8g", "ipc_mode": "host"}. Note: "volumes", "mounts", "network", "environment", "command", "name", "detach", "user", and "working_dir" are controlled by the launcher and cannot be overridden here.
default_job_env – site-level default environment variables injected into every job container launched by this site. Useful for site/runtime-specific settings such as NCCL workarounds. Launcher-controlled variables like USER, HOME, and PYTHONPATH still take precedence.
default_python_path – default Python executable path inside job containers. Jobs can override it with launcher_spec[site]["docker"]["python_path"].
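As a rough illustration of how these constructor arguments fit together, the sketch below builds a component entry and prints it as JSON. It assumes the launcher is registered through an id/path/args component entry in the site's resource configuration; the component id, the host workspace path, and the NCCL setting are hypothetical values chosen for the example, not taken from this module.

```python
import json

# Hypothetical component entry; the id and the host path are placeholders.
launcher_component = {
    "id": "docker_launcher",
    "path": "nvflare.app_opt.job_launcher.docker_launcher.ClientDockerJobLauncher",
    "args": {
        # Must be the path as seen by the Docker daemon on the host.
        "workspace": "/opt/nvflare/site-1",
        # The network must already exist before any job is launched.
        "network": "nvflare-network",
        "timeout": 30,
        "default_python_path": "/usr/local/bin/python",
        # Docker SDK naming: underscores, not hyphens.
        "default_job_container_kwargs": {"shm_size": "8g", "ipc_mode": "host"},
        # Example of a site-specific runtime workaround.
        "default_job_env": {"NCCL_P2P_DISABLE": "1"},
    },
}

config_text = json.dumps({"components": [launcher_component]}, indent=2)
print(config_text)
```

The deprecated python_path argument is left out on purpose; default_python_path is the documented replacement.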
- get_module_args(job_args: dict) dict[source]
Return a {flag: value} dict of args to pass to the job module.
- Parameters:
job_args – JOB_PROCESS_ARGS dict from FLContext (with PARENT_URL already overridden).
- Returns:
dict of {flag: value} pairs to append after '-u -m <module>' in the container command.
- Return type:
dict
- class DockerJobHandle(container_id: str, container_name: str, docker_client, timeout: int = 30)[source]
Bases:
JobHandleSpec
Handle for a running Docker container job.
Modeled on K8sJobHandle: once the container reaches a terminal state, terminal_state is set and all subsequent poll()/wait() calls return immediately without querying Docker.
- enter_states(states_to_enter: list) bool[source]
Poll until the container enters one of the target states.
Returns True if the target state was reached, False otherwise (timeout, stuck, or terminal state reached before target).
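The polling behaviour described above can be sketched without Docker at all. The function below is a hypothetical stand-in, not the library's implementation: get_status is an assumed callable that returns the current container status string, and the sketch covers only the timeout and early-terminal-state cases.

```python
import time


def wait_for_states(get_status, targets, terminal, timeout=30.0, interval=0.01):
    """Poll get_status() until it returns a target state.

    Returns True when a target state is seen, and False when a terminal
    state is seen first or the timeout expires.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status()
        if status in targets:
            return True
        if status in terminal:
            # The container finished before reaching the target state.
            return False
        time.sleep(interval)
    return False


# Simulated status sequence: the container starts up and then runs.
statuses = iter(["created", "created", "running"])
reached = wait_for_states(lambda: next(statuses), {"running"}, {"exited", "dead"}, timeout=1.0)
print(reached)
```

Targets are checked before terminal states so that a caller waiting for a terminal state such as 'exited' still gets True.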
- poll() JobReturnCode[source]
Non-blocking status check. Returns UNKNOWN while still running.
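A minimal sketch of the terminal-state caching described for this handle follows. ReturnCode is a hypothetical stand-in for JobReturnCode, CachedHandle is not the real DockerJobHandle, and the mapping from exit code to return code is an assumption made for the example.

```python
from enum import Enum, auto


class ReturnCode(Enum):
    """Hypothetical stand-in for JobReturnCode."""
    SUCCESS = auto()
    EXECUTION_ERROR = auto()
    UNKNOWN = auto()


class CachedHandle:
    """Caches the result once the container reaches a terminal state."""

    TERMINAL = {"exited", "dead"}

    def __init__(self, inspect):
        # inspect is an assumed callable returning (status, exit_code).
        self._inspect = inspect
        self.terminal_state = None
        self.queries = 0  # counts how often the backend was queried

    def poll(self):
        # After a terminal state is recorded, never query the backend again.
        if self.terminal_state is not None:
            return self.terminal_state
        self.queries += 1
        status, exit_code = self._inspect()
        if status not in self.TERMINAL:
            return ReturnCode.UNKNOWN
        if exit_code == 0:
            self.terminal_state = ReturnCode.SUCCESS
        else:
            self.terminal_state = ReturnCode.EXECUTION_ERROR
        return self.terminal_state


# Simulated backend: running on the first query, exited cleanly on the second.
states = iter([("running", None), ("exited", 0)])
handle = CachedHandle(lambda: next(states))
results = [handle.poll(), handle.poll(), handle.poll()]
print(results, handle.queries)
```

The third poll() is answered from the cached value, so the simulated backend is queried only twice.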
- class DockerJobLauncher(workspace: str | None = None, network: str = 'nvflare-network', python_path: str | None = None, timeout: int = 30, default_job_container_kwargs: dict | None = None, default_job_env: dict | None = None, default_python_path: str | None = None)[source]
Bases:
JobLauncherSpec
Launches NVFlare job processes as Docker containers.
The parent process (SP/CP) runs as a container started by start_docker.sh (run by the site admin). Job containers (SJ/CJ) are started dynamically per job by this launcher.
Assumptions:
- Docker network already exists (created by start_docker.sh or site admin).
- Job containers get an isolated workspace view at /var/tmp/nvflare/workspace: the root is writable ephemeral tmpfs, startup/local are read-only, and only the current job workspace is read-write and persistent on the host.
- SP/CP container name is known and reachable via Docker DNS on the network.
- parent_url is derived at runtime from the site name and the port in JOB_PROCESS_ARGS.
- Parameters:
workspace – host path to the NVFlare workspace directory. Job containers receive an isolated workspace view: startup/local are mounted read-only, and the current job workspace is mounted read-write at /var/tmp/nvflare/workspace/<job_id>. If not provided, reads from NVFL_DOCKER_WORKSPACE environment variable. Must be the HOST path because it is passed directly to the Docker daemon as a volume bind source.
network – Docker network name. Must already exist.
python_path – Deprecated alias for default_python_path.
timeout – max seconds to wait for container to reach RUNNING state (default 30).
default_job_container_kwargs – site-level default docker run kwargs applied to every job container launched by this site. Job-level resource_spec[site][docker] takes precedence on conflict. Keys use Docker SDK naming (underscores, not hyphens). Example: {"shm_size": "8g", "ipc_mode": "host"}. Note: "volumes", "mounts", "network", "environment", "command", "name", "detach", "user", and "working_dir" are controlled by the launcher and cannot be overridden here.
default_job_env – site-level default environment variables injected into every job container launched by this site. Useful for site/runtime-specific settings such as NCCL workarounds. Launcher-controlled variables like USER, HOME, and PYTHONPATH still take precedence.
default_python_path – default Python executable path inside job containers. Jobs can override it with launcher_spec[site]["docker"]["python_path"].
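The precedence rules for default_job_container_kwargs and default_job_env can be sketched as two small merge helpers. Both function names are hypothetical, and silently dropping launcher-controlled keys is an assumption of this sketch; the real launcher may instead reject or warn about them.

```python
# Keys the parameter description lists as controlled by the launcher.
RESERVED_KEYS = {
    "volumes", "mounts", "network", "environment", "command",
    "name", "detach", "user", "working_dir",
}


def merge_container_kwargs(site_defaults, job_level):
    """Job-level values win on conflict; reserved keys are dropped (assumed)."""
    merged = dict(site_defaults or {})
    merged.update(job_level or {})
    return {k: v for k, v in merged.items() if k not in RESERVED_KEYS}


def merge_env(default_job_env, launcher_env):
    """Launcher-controlled variables override site-level defaults."""
    env = dict(default_job_env or {})
    env.update(launcher_env)
    return env


kwargs = merge_container_kwargs(
    {"shm_size": "8g", "ipc_mode": "host", "network": "other-net"},
    {"shm_size": "16g"},
)
env = merge_env(
    {"NCCL_P2P_DISABLE": "1", "HOME": "/root"},
    {"HOME": "/var/tmp/nvflare/workspace"},  # hypothetical launcher-set value
)
print(kwargs)
print(env)
```

Here the job-level shm_size replaces the site default, the reserved network key is removed, and the launcher's HOME replaces the site-level one while the NCCL setting passes through.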
- DEFAULT_PYTHON_PATH = '/usr/local/bin/python'
- STUDY_DATA_PATH_FILE = 'local/study_data.yaml'
- WORKSPACE_MOUNT = '/var/tmp/nvflare/workspace'
- abstract get_module_args(job_args: dict) dict[source]
Return a {flag: value} dict of args to pass to the job module.
- Parameters:
job_args – JOB_PROCESS_ARGS dict from FLContext (with PARENT_URL already overridden).
- Returns:
dict of {flag: value} pairs to append after '-u -m <module>' in the container command.
- Return type:
dict
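To make the {flag: value} contract concrete, the sketch below shows one way such a dict could be appended after '-u -m <module>'. The helper name, the module name, and the flags are hypothetical, and representing the command as a list of arguments is an assumption of the example.

```python
def build_command(python_path, module, module_args):
    """Assemble a container command from a {flag: value} dict."""
    command = [python_path, "-u", "-m", module]
    for flag, value in module_args.items():
        # Each flag is followed by its value, in dict insertion order.
        command.extend([flag, str(value)])
    return command


cmd = build_command(
    "/usr/local/bin/python",
    "my_pkg.job_worker",  # hypothetical module name
    {"-w": "/var/tmp/nvflare/workspace", "-n": "job-123"},  # hypothetical flags
)
print(cmd)
```

A subclass's get_module_args would supply only the dict; the launcher decides the interpreter path and the module.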
- handle_event(event_type: str, fl_ctx: FLContext)[source]
Handles events.
- Parameters:
event_type (str) – event type fired by workflow.
fl_ctx (FLContext) – FLContext information.
- launch_job(job_meta: dict, fl_ctx: FLContext) JobHandleSpec[source]
Launches a job run.
- Parameters:
job_meta – job metadata
fl_ctx – FLContext
- Returns:
a JobHandle instance.
- class DockerStatus[source]
Bases:
object
- CREATED = 'created'
- DEAD = 'dead'
- EXITED = 'exited'
- PAUSED = 'paused'
- RESTARTING = 'restarting'
- RUNNING = 'running'
- class ServerDockerJobLauncher(workspace: str | None = None, network: str = 'nvflare-network', python_path: str | None = None, timeout: int = 30, default_job_container_kwargs: dict | None = None, default_job_env: dict | None = None, default_python_path: str | None = None)[source]
Bases:
DockerJobLauncher
- Parameters:
workspace – host path to the NVFlare workspace directory. Job containers receive an isolated workspace view: startup/local are mounted read-only, and the current job workspace is mounted read-write at /var/tmp/nvflare/workspace/<job_id>. If not provided, reads from NVFL_DOCKER_WORKSPACE environment variable. Must be the HOST path because it is passed directly to the Docker daemon as a volume bind source.
network – Docker network name. Must already exist.
python_path – Deprecated alias for default_python_path.
timeout – max seconds to wait for container to reach RUNNING state (default 30).
default_job_container_kwargs – site-level default docker run kwargs applied to every job container launched by this site. Job-level resource_spec[site][docker] takes precedence on conflict. Keys use Docker SDK naming (underscores, not hyphens). Example: {"shm_size": "8g", "ipc_mode": "host"}. Note: "volumes", "mounts", "network", "environment", "command", "name", "detach", "user", and "working_dir" are controlled by the launcher and cannot be overridden here.
default_job_env – site-level default environment variables injected into every job container launched by this site. Useful for site/runtime-specific settings such as NCCL workarounds. Launcher-controlled variables like USER, HOME, and PYTHONPATH still take precedence.
default_python_path – default Python executable path inside job containers. Jobs can override it with launcher_spec[site]["docker"]["python_path"].
- get_module_args(job_args: dict) dict[source]
Return a {flag: value} dict of args to pass to the job module.
- Parameters:
job_args – JOB_PROCESS_ARGS dict from FLContext (with PARENT_URL already overridden).
- Returns:
dict of {flag: value} pairs to append after '-u -m <module>' in the container command.
- Return type:
dict