.. _helm_chart:

###########################
Running FLARE in Kubernetes
###########################

.. contents::
   :local:
   :depth: 2

NVIDIA FLARE can be deployed to Kubernetes by first provisioning standard startup
kits and then preparing each server or client kit for the Kubernetes runtime.
The prepared kit contains a participant-specific Helm chart plus the
``startup/`` and ``local/`` folders that must be staged into Kubernetes storage.

For example scripts that automate temporary Kubernetes and managed cloud cluster
testing flows, see
:github_nvflare_link:`examples/devops <examples/devops>`. These scripts are
for development, smoke testing, demos, and learning only; they are not
production deployment guidance.

Prerequisites
=============

Before you start, make sure you have:

* ``nvflare`` installed on the workstation where you provision and run
  ``nvflare deploy prepare``.
* ``kubectl`` configured for the target cluster. Use a ``kubectl`` version that
  is compatible with the Kubernetes API server.
* Helm 3.
* A Kubernetes cluster with standard ``apps/v1`` Deployment,
  ``rbac.authorization.k8s.io/v1`` Role/RoleBinding, Service, Secret, and PVC
  support.
* An actively supported Kubernetes release. The generated chart uses stable
  Kubernetes APIs and does not depend on provider-specific extensions.
* A default ``StorageClass`` or an explicit ``storageClassName`` for every PVC.
  Check with ``kubectl get storageclass``.
* A container registry that every server and client cluster can pull from.
* NVIDIA GPU Operator or NVIDIA device plugin installed on clusters that will
  run jobs with ``resource_spec[site].num_of_gpus``. See
  `Cloud GPU Setup References`_.
* For Kubernetes job launching, a Kubernetes API-server CA chain that passes
  Python 3.13+ strict X.509 validation. CA certificates must include required
  RFC 5280 extensions such as ``keyUsage`` with certificate signing allowed.

The generated charts do not install a Kubernetes cluster, storage class, GPU
device plugin, ingress controller, or registry credentials.
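
The CA requirement in the prerequisites can be spot-checked before deployment.
The sketch below is illustrative only: ``ca_allows_cert_signing`` is a
hypothetical helper name, it assumes OpenSSL is installed, and it inspects only
the first certificate in the file. It is not a substitute for full strict
X.509 validation.

.. code-block:: bash

   # Hypothetical helper: succeeds when the first certificate in a PEM file is
   # marked as a CA and its keyUsage extension allows certificate signing.
   ca_allows_cert_signing() {
       local text
       text="$(openssl x509 -in "$1" -noout -text)" || return 1
       printf '%s\n' "$text" | grep "CA:TRUE" >/dev/null || return 1
       printf '%s\n' "$text" | grep "Certificate Sign" >/dev/null
   }

   # Example use, assuming the current kubeconfig context embeds its CA as
   # certificate-authority-data:
   #   kubectl config view --raw --minify \
   #       -o jsonpath='{.clusters[0].cluster.certificate-authority-data}' \
   #       | base64 -d > /tmp/k8s-ca.crt
   #   ca_allows_cert_signing /tmp/k8s-ca.crt && echo "CA allows cert signing"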

Cloud GPU Setup References
--------------------------

Managed Kubernetes services differ in how they handle GPU drivers, the NVIDIA
Container Toolkit, the NVIDIA GPU Operator, and the NVIDIA Kubernetes device
plugin. Before running GPU jobs, verify that GPU nodes advertise allocatable
``nvidia.com/gpu`` resources.
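
A minimal way to run that check is sketched below. ``list_allocatable_gpus`` is
a hypothetical helper name; the command only reads node status and assumes
``kubectl`` points at the target cluster.

.. code-block:: bash

   # Hypothetical helper: list each node with its allocatable nvidia.com/gpu
   # count. Nodes that do not advertise the resource show <none>.
   list_allocatable_gpus() {
       kubectl get nodes \
           -o 'custom-columns=NODE:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
   }

   # Usage: list_allocatable_gpus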

Use the current provider documentation for your cluster:

* Amazon Elastic Kubernetes Service (EKS): `Manage NVIDIA GPU devices on Amazon
  EKS <https://docs.aws.amazon.com/eks/latest/userguide/device-management-nvidia.html>`__
  and `GPU support in eksctl
  <https://docs.aws.amazon.com/eks/latest/eksctl/gpu-support.html>`__.
* Google Kubernetes Engine (GKE): `Manage the GPU Stack with the NVIDIA GPU
  Operator on GKE
  <https://cloud.google.com/kubernetes-engine/docs/how-to/gpu-operator>`__ and
  `About GPUs in GKE
  <https://cloud.google.com/kubernetes-engine/docs/concepts/gpus>`__.
* Azure Kubernetes Service (AKS): `Use GPUs on AKS
  <https://learn.microsoft.com/en-us/azure/aks/use-nvidia-gpu>`__ and `NVIDIA
  GPU Operator with AKS
  <https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/microsoft-aks.html>`__.
* NVIDIA: `NVIDIA GPU Operator
  <https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/>`__.

Kubernetes Runtime Model
========================

Kubernetes deployment has two runtime layers:

* A **parent pod** runs the long-lived FLARE server or client process. Helm
  installs this pod from the per-participant ``helm_chart/`` generated by
  ``nvflare deploy prepare``. The parent pod mounts the configured workspace PVC
  at ``parent.workspace_mount_path`` and reads ``startup/`` and ``local/`` from
  that PVC. Its Python executable is set by ``parent.python_path`` or, when
  omitted, defaults to ``/usr/local/bin/python3``.
* A **job pod** is created dynamically by ``ServerK8sJobLauncher`` or
  ``ClientK8sJobLauncher`` for each submitted job. Job pod image, Python path,
  CPU, memory, GPU, and ephemeral storage settings come from the submitted
  job's ``launcher_spec`` and from the ``job_launcher`` defaults in
  ``k8s.yaml``.

The generated Helm chart does not run submitted jobs directly. It installs the
parent participant process, its Kubernetes Service, its ServiceAccount, and the
Role/RoleBinding that allow the launcher to create job pods.

The parent Service is the stable in-cluster address for dynamically launched job
pods. ``nvflare deploy prepare`` patches the prepared kit's internal
communication settings to use the generated Service name and ``parent_port``.
``parent_port`` is the parent-process port used by job pods for internal
parent/job communication; it is not the federated learning port that remote
clients use to reach the server. If you rename or replace the Service, keep the
Service name, Service port, and prepared kit communication settings consistent.

The runtime shape is:

.. code-block:: text

   admin console
        |
        | FL/admin traffic to server fed_learn_port/admin_port
        v
   server cluster or namespace
     server parent pod
       | mounts workspace PVC: startup/, local/, transfer/
       | launches server job pods through Kubernetes API
       v
     server job pod emptyDir workspace
       | optional mounts: /data/<study>/<dataset> from study-data PVCs
       | workspace transfer over parent Service on parent_port

   client cluster or namespace
     client parent pod
       | outbound FL connection to server fed_learn_port
       | mounts workspace PVC: startup/, local/
       | launches client job pods through Kubernetes API
       v
     client job pod emptyDir workspace
       | optional mounts: /data/<study>/<dataset> from study-data PVCs
       | workspace transfer over client parent Service on parent_port

Server and client participants may run in the same Kubernetes cluster or in
separate clusters. Separate clusters are common because each site controls its
own compute and data. If participants run in separate clusters, using the same
namespace and PVC names in each cluster is safe. If multiple participants run in
one cluster, give each participant its own namespace or its own workspace PVC;
do not point a server and a client at the same workspace PVC because their
``startup/`` and ``local/`` contents are different.

Client sites need outbound network access to the server endpoint configured
during provisioning, usually ``<server-host>:<fed_learn_port>``. A client site
does not need an inbound FL port or an externally exposed Service. The client
chart creates an in-cluster Service only so that dynamically launched client job
pods can reach their client parent pod.

Each prepared participant folder contains its own chart:

.. code-block:: text

   server-k8s/
     helm_chart/
     local/
     startup/
     transfer/

   site-1-k8s/
     helm_chart/
     local/
     startup/
     transfer/

The ``transfer/`` directory is the normal FLARE admin file-transfer directory.
For the server, it is used under the mounted workspace when admin storage is
configured as ``transfer``. It is not the Kubernetes job workspace-transfer
mechanism and job pods do not mount it. Stage or create it on the server
workspace PVC when you stage ``startup/`` and ``local/``.

Build and Push the FLARE Image
==============================

The Helm charts need a FLARE runtime image that every participating cluster can
pull. For the image build and registry-push workflow, see
:ref:`brev_build_push_flare_image`.

NVIDIA publishes an official NVFlare Docker image in the NGC container registry
at ``nvcr.io``. Use a tag that matches the NVFlare version used to provision and
prepare the startup kits, and set that image in ``parent.docker_image`` in
``k8s.yaml``.

You can also build your own parent runtime image from this repository by
modifying ``docker/Dockerfile.parent`` and pushing the result to a registry that
all participating clusters can pull from. Keep the NVFlare ``K8S`` extra, or
install the Kubernetes Python client explicitly, so the parent server or client
can create job pods.

The parent image comes from ``parent.docker_image`` in ``k8s.yaml`` and is
rendered into ``helm_chart/values.yaml``. Submitted jobs must also specify a job
image in ``meta.json`` under ``launcher_spec[site][k8s].image`` or
``launcher_spec.default.k8s.image``. The parent image and job image can be the
same image, but they do not have to be.

Prepare Startup Kits
====================

The provisioning step remains responsible for identity material, certificates,
server host names, FL ports, and FLARE configuration:

.. code-block:: bash

   nvflare provision -p project.yml -w workspace

The server ``default_host`` and ``host_names`` in ``project.yml`` must match the
external endpoint that clients and admin consoles will use to reach the server.
If those values change, reprovision and rerun ``nvflare deploy prepare``.

After provisioning, prepare each server or client startup kit with
``nvflare deploy prepare``:

.. code-block:: bash

   nvflare deploy prepare workspace/<project>/prod_00/server \
       --output server-k8s \
       --config k8s.yaml

   nvflare deploy prepare workspace/<project>/prod_00/site-1 \
       --output site-1-k8s \
       --config k8s.yaml

Example ``k8s.yaml``:

.. code-block:: yaml

   runtime: k8s
   namespace: nvflare
   parent:
     docker_image: registry.example.com/nvflare:dev
     image_pull_secrets:
       - registry-credentials
     parent_port: 8102
     workspace_pvc: nvflws
     workspace_mount_path: /var/tmp/nvflare/workspace
     python_path: /usr/local/bin/python3
     resources:
       requests:
         cpu: "2"
         memory: 8Gi
   job_launcher:
     config_file_path:
     default_python_path: /usr/local/bin/python3
     image_pull_secrets:
       - job-registry-credentials
     pending_timeout: 300

The runtime config controls site-level Kubernetes settings:

* ``namespace`` is where the parent pod and dynamically launched job pods run.
* ``server_service_name`` sets the FL server Kubernetes Service name. It
  defaults to ``nvflare-server``.
* ``parent`` values are rendered into the Helm chart. They set the parent image,
  Python executable, workspace PVC, parent service port, parent pod resources,
  optional parent pod security context, and optional image pull Secret
  references. ``parent.image_pull_secrets`` must name Kubernetes Secrets that
  already exist in the target namespace; NVFlare does not create registry
  credentials. This setting applies to the generated parent pod chart; use
  ``job_launcher.image_pull_secrets`` for dynamically launched job pods.
  ``parent.python_path`` sets the command of the long-lived server parent (SP)
  or client parent (CP) pod.
  ``parent.workspace_mount_path`` is also written into the K8s launcher config
  so spawned server job (SJ) and client job (CJ) pods mount their job workspace
  and startup kit at the same in-container path.
* ``job_launcher`` values are written into the participant's
  ``local/resources.json.default`` so the parent process can create job pods.
  ``config_file_path`` may be empty for in-cluster configuration.
  ``default_python_path`` sets the Python executable for SJ/CJ job pods when a
  job does not override ``launcher_spec[site][k8s].python_path``. It does not
  control the SP/CP parent pod Python path; use ``parent.python_path`` for that
  command.
  ``image_pull_secrets`` names existing Kubernetes image pull Secrets attached
  to every dynamically launched job pod for this prepared site. Configure this
  during deployment preparation when job images live in a private registry; job
  authors still only specify the job image in ``meta.json``.
  ``pending_timeout`` is in seconds. It controls how long a dynamically launched
  job pod can stay in ``Pending`` or ``Unknown`` before the launcher deletes it
  and reports the run as an execution exception. The admin ``list_jobs`` command
  then shows ``FINISHED:EXECUTION_EXCEPTION`` instead of treating the timeout as
  a user abort.

The parent pod and job pods use different Python settings:

.. list-table::
   :header-rows: 1

   * - Setting
     - Applies to
     - Notes
   * - ``parent.python_path``
     - Parent server or client pod
     - Rendered as the Helm container command for ``server_train`` or
       ``client_train``.
   * - ``job_launcher.default_python_path``
     - Dynamically launched job pods
     - Used when a job does not set
       ``launcher_spec[site][k8s].python_path``.
   * - ``launcher_spec[site][k8s].python_path``
     - Dynamically launched job pods
     - Per-job override in ``meta.json``.

Prepare Cluster Storage
=======================

Create and bind any workspace or study-data PVCs required by your cluster before
starting the participant.

Create the namespace before applying namespaced PVC manifests or installing
the Helm chart:

.. code-block:: bash

   export NAMESPACE=nvflare
   kubectl create namespace "$NAMESPACE" --dry-run=client -o yaml | kubectl apply -f -

Workspace PVC
-------------

The workspace PVC is for the parent server or client pod. The generated chart
mounts ``parent.workspace_pvc`` at ``parent.workspace_mount_path``, but it does
not upload files to the PVC. Copy the prepared kit's ``startup/`` and
``local/`` directories into the root of that workspace PVC before installing the
chart. For server kits, also create or copy ``transfer/`` at the workspace root
for admin file-transfer storage.

Example ``workspace-pvc.yaml``:

.. code-block:: yaml

   apiVersion: v1
   kind: PersistentVolumeClaim
   metadata:
     name: nvflws
   spec:
     accessModes:
       - ReadWriteOnce
     resources:
       requests:
         storage: 10Gi
     # If your cluster has no default StorageClass, uncomment and set this.
     # storageClassName: <storage-class-name>

Use a larger size if the server's job history, snapshots, or logs need more
space. Use a distinct workspace claim per participant when multiple
participants run in the same namespace.

For example, with a prepared folder named ``server-k8s`` and a workspace PVC
named ``nvflws``:

.. code-block:: bash

   export NAMESPACE=nvflare
   export PREPARED_KIT=server-k8s
   export WORKSPACE_PVC=nvflws

   kubectl create namespace "$NAMESPACE" --dry-run=client -o yaml | kubectl apply -f -
   kubectl -n "$NAMESPACE" apply -f workspace-pvc.yaml
   kubectl -n "$NAMESPACE" get pvc "$WORKSPACE_PVC"

   kubectl -n "$NAMESPACE" delete pod nvflare-pvc-copy --ignore-not-found=true
   cat >/tmp/nvflare-pvc-copy.json <<EOF
   {
     "spec": {
       "restartPolicy": "Never",
       "volumes": [
         {"name": "ws", "persistentVolumeClaim": {"claimName": "${WORKSPACE_PVC}"}}
       ],
       "containers": [
         {
           "name": "nvflare-pvc-copy",
           "image": "busybox:1.36",
           "command": ["sleep", "600"],
           "volumeMounts": [{"name": "ws", "mountPath": "/mnt/nvflws"}]
         }
       ]
     }
   }
   EOF
   kubectl -n "$NAMESPACE" run nvflare-pvc-copy \
       --image=busybox:1.36 \
       --restart=Never \
       --overrides="$(cat /tmp/nvflare-pvc-copy.json)"
   kubectl -n "$NAMESPACE" wait --for=condition=Ready pod/nvflare-pvc-copy --timeout=120s
   kubectl -n "$NAMESPACE" exec nvflare-pvc-copy -- rm -rf /mnt/nvflws/startup /mnt/nvflws/local
   kubectl -n "$NAMESPACE" cp "$PREPARED_KIT/startup" nvflare-pvc-copy:/mnt/nvflws/startup
   kubectl -n "$NAMESPACE" cp "$PREPARED_KIT/local" nvflare-pvc-copy:/mnt/nvflws/local
   kubectl -n "$NAMESPACE" exec nvflare-pvc-copy -- mkdir -p /mnt/nvflws/transfer
   kubectl -n "$NAMESPACE" exec nvflare-pvc-copy -- ls -la /mnt/nvflws
   kubectl -n "$NAMESPACE" delete pod nvflare-pvc-copy

The PVC root must contain ``startup/`` and ``local/`` directly. At runtime,
those folders appear under the configured workspace mount path
(``parent.workspace_mount_path``, rendered as
``persistence.workspace.mountPath``). With the example default, the parent
expects ``/var/tmp/nvflare/workspace/startup`` and
``/var/tmp/nvflare/workspace/local``. If the PVC root contains a nested
``server-k8s/`` or ``site-1-k8s/`` folder instead, the parent pod will not find
those folders under the configured mount path.
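
To catch a nested or incomplete kit before copying it to the PVC, a local check
along these lines may help. ``check_kit_layout`` is a hypothetical helper name,
and the check covers only the two folders that this section says must sit at the
PVC root.

.. code-block:: bash

   # Hypothetical helper: verify that a prepared kit directory directly contains
   # the startup/ and local/ folders that must land at the workspace PVC root.
   check_kit_layout() {
       local kit="$1" missing=0 dir
       for dir in startup local; do
           if [ ! -d "$kit/$dir" ]; then
               echo "missing: $kit/$dir" >&2
               missing=1
           fi
       done
       return "$missing"
   }

   # Usage: check_kit_layout server-k8s && echo "layout ok"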

The dynamically launched job pod does **not** mount this workspace PVC. Each job
pod receives its own writable ``emptyDir`` mounted at the configured workspace
mount path. The launcher transfers the needed ``local/`` and job workspace
content into that ``emptyDir`` when the pod starts and uploads the job results
back to the parent process when the job exits. The job pod workspace size is
controlled by
``launcher_spec[site][k8s].ephemeral_storage`` when set, or by the launcher
default otherwise. The same value is also used for the container
``ephemeral-storage`` request and limit.

Study Data PVC
--------------

Study data PVCs are separate from the parent workspace PVC. Configure optional
study data mappings in ``local/study_data.yaml`` inside the prepared kit before
copying ``local/`` into the workspace PVC. If the kit is already staged, edit
the file on the PVC or restage ``local/``.

Example ``study_data.yaml``:

.. code-block:: yaml

   default:
     data:
       source: nvfldata
       mode: ro

For Kubernetes, each ``source`` value is a PVC claim name. The job pod mounts
the dataset at ``/data/<study>/<dataset>``, for example
``/data/default/data``. ``mode`` must be ``ro`` or ``rw``. Missing
``study_data.yaml`` files or missing entries for a job's study mean no
study-data PVCs are mounted for that job.

Example ``nvfldata-pvc.yaml``:

.. code-block:: yaml

   apiVersion: v1
   kind: PersistentVolumeClaim
   metadata:
     name: nvfldata
   spec:
     accessModes:
       - ReadWriteOnce
     resources:
       requests:
         storage: 50Gi
     # If your cluster has no default StorageClass, uncomment and set this.
     # storageClassName: <storage-class-name>

Use an access mode supported by your storage backend. ``ReadWriteOnce`` is
enough for many single-node or single-job cases. Use ``ReadOnlyMany`` or
``ReadWriteMany`` storage, or separate per-site claims, when multiple job pods
on different nodes need concurrent access to the same dataset.

Apply the study-data PVC in the same namespace where the participant's job pods
will run:

.. code-block:: bash

   kubectl -n "$NAMESPACE" apply -f nvfldata-pvc.yaml
   kubectl -n "$NAMESPACE" get pvc nvfldata
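
When scripting this step, the claim phase can be read directly, as in the sketch
below. ``pvc_phase`` is a hypothetical helper name. Note that a claim whose
StorageClass uses ``WaitForFirstConsumer`` volume binding stays ``Pending``
until a pod that mounts it is scheduled, so ``Pending`` is not always an error.

.. code-block:: bash

   # Hypothetical helper: print the phase of a PVC (for example Pending or Bound).
   pvc_phase() {
       local namespace="$1" claim="$2"
       kubectl -n "$namespace" get pvc "$claim" -o jsonpath='{.status.phase}'
   }

   # Usage: pvc_phase "$NAMESPACE" nvfldata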

Install the Charts
==================

Prepare, stage, and install each server or client kit in the Kubernetes cluster
or namespace where that participant runs.

Install the server chart:

.. code-block:: bash

   export NAMESPACE=nvflare

   helm upgrade --install server server-k8s/helm_chart \
       --namespace "$NAMESPACE"

Install a client chart with the same pattern:

.. code-block:: bash

   helm upgrade --install site-1 site-1-k8s/helm_chart \
       --namespace "$NAMESPACE"

``nvflare deploy prepare`` writes ``image.repository`` and ``image.tag`` into
``helm_chart/values.yaml`` from ``parent.docker_image`` in ``k8s.yaml``. For a
different parent image, rerun ``nvflare deploy prepare`` with the updated
``k8s.yaml``. If you must override the image at Helm install or upgrade time,
prefer a values file and pass it to every related ``helm upgrade`` command:

.. code-block:: bash

   cat > server-values.yaml <<'EOF'
   image:
     repository: registry.example.com/nvflare
     tag: dev
   EOF

   helm upgrade --install server server-k8s/helm_chart \
       --namespace "$NAMESPACE" \
       -f server-values.yaml

Avoid using one-off ``--set image.repository=...`` and ``--set image.tag=...``
flags as the source of truth for image changes. Later upgrade commands that do
not include the same overrides can render the release with the generated chart
defaults instead.
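
To see which values a release is actually using, ``helm get values`` can be
compared with and without ``--all``. The wrapper below is a sketch;
``show_release_values`` is a hypothetical helper name.

.. code-block:: bash

   # Hypothetical helper: print the user-supplied overrides of a release, then
   # the computed values (chart defaults merged with those overrides).
   show_release_values() {
       local release="$1" namespace="$2"
       echo "# user-supplied overrides"
       helm get values "$release" --namespace "$namespace"
       echo "# computed values"
       helm get values "$release" --namespace "$namespace" --all
   }

   # Usage: show_release_values server "$NAMESPACE"

If the first listing no longer shows an ``image`` override that you expected,
the most recent upgrade was run without it.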

If the server and client run in the same namespace, use different workspace PVCs
or override ``persistence.workspace.claimName`` for one of the releases:

.. code-block:: bash

   helm upgrade --install site-1 site-1-k8s/helm_chart \
       --namespace "$NAMESPACE" \
       --set persistence.workspace.claimName=nvflws-site-1

The namespace must already exist before you run namespaced ``kubectl`` commands
or install the charts. The storage step above creates it explicitly. If you skip
that flow, create the namespace first:

.. code-block:: bash

   kubectl create namespace "$NAMESPACE" --dry-run=client -o yaml | kubectl apply -f -

Expose FL Traffic
=================

The generated server chart creates a Kubernetes Service for the FL server. The
Service defaults to ``ClusterIP``, which is reachable only inside the cluster.
Clients or admin consoles that connect from outside the cluster need the FL
server ports exposed.

If you use an override values file for the server release, include the same
``-f`` file in any ``helm upgrade`` command below. Expose the FL server ports
with the mechanism that matches your Kubernetes environment:

* Use a cloud load balancer when available:

  .. code-block:: bash

     helm upgrade --install server server-k8s/helm_chart \
         --namespace "$NAMESPACE" \
         --set service.type=LoadBalancer
     kubectl -n "$NAMESPACE" get svc nvflare-server

* For local testing from the same machine, use port forwarding:

  .. code-block:: bash

     kubectl -n "$NAMESPACE" port-forward svc/nvflare-server 8002:8002 8003:8003

* For single-node or ingress-based clusters, configure your cluster's TCP
  routing, firewall rules, or host ports so the FL and admin ports from
  ``project.yml`` reach the ``nvflare-server`` Service. Some single-node
  deployments use ``--set hostPortEnabled=true`` for the server chart.

Make sure the server host name used during provisioning resolves to the exposed
address. For example, update DNS or ``/etc/hosts`` for the admin console and for
any remote client sites.
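
For a ``LoadBalancer`` Service, the exposed address can be read from the Service
status, as sketched below. ``service_external_address`` is a hypothetical helper
name. Some providers report an IP and others a hostname, so the sketch prints
whichever is set and prints nothing while the load balancer is still pending.

.. code-block:: bash

   # Hypothetical helper: print the external IP or hostname assigned to a
   # LoadBalancer Service.
   service_external_address() {
       local namespace="$1" service="$2"
       kubectl -n "$namespace" get svc "$service" \
           -o jsonpath='{.status.loadBalancer.ingress[0].ip}{.status.loadBalancer.ingress[0].hostname}'
   }

   # Usage: service_external_address "$NAMESPACE" nvflare-server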

Verify the Deployment
=====================

After installing a chart, verify that the deployment, pods, services, and PVCs
are healthy:

.. code-block:: bash

   kubectl -n "$NAMESPACE" rollout status deployment/server --timeout=300s
   kubectl -n "$NAMESPACE" rollout status deployment/site-1 --timeout=300s
   kubectl -n "$NAMESPACE" get pods,svc,pvc
   kubectl -n "$NAMESPACE" logs deploy/server --tail=200
   kubectl -n "$NAMESPACE" logs deploy/site-1 --tail=200

If a pod is not ready, inspect the pod and recent events:

.. code-block:: bash

   kubectl -n "$NAMESPACE" describe pod -l app.kubernetes.io/instance=server
   kubectl -n "$NAMESPACE" get events --sort-by=.lastTimestamp

Pod logs persist only while the pod exists. When a parent pod is replaced, for
example by a Helm upgrade or a reschedule, its earlier logs are no longer
available through ``kubectl logs``. Use cluster log aggregation or capture logs
externally if you need to retain them.
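
As a stopgap where no log aggregation exists, logs can be copied to a local file
before an upgrade. This is a sketch only; ``save_parent_logs`` is a hypothetical
helper name.

.. code-block:: bash

   # Hypothetical helper: save the current logs of a parent Deployment to a
   # local file, with timestamps.
   save_parent_logs() {
       local namespace="$1" deployment="$2" out_dir="$3"
       mkdir -p "$out_dir" || return 1
       kubectl -n "$namespace" logs "deploy/$deployment" --timestamps \
           > "$out_dir/$deployment.log"
   }

   # Usage: save_parent_logs "$NAMESPACE" server ./logs
   # If a container restarted inside the same pod, adding --previous to
   # kubectl logs shows the prior container instance.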

Log In with the Admin Console
=============================

Use the admin startup kit produced by ``nvflare provision``. The admin console
connects to the server host and ports written into the provisioned project, so
confirm that those names resolve to the exposed Kubernetes endpoint before
logging in.

.. code-block:: bash

   cd workspace/<project>/prod_00/admin@nvidia.com/startup
   bash fl_admin.sh

When prompted for ``User Name``, enter the admin identity from ``project.yml``,
such as ``admin@nvidia.com``.

Private Registry and Image Pull Secrets
=======================================

The generated chart supports parent-pod image pull Secrets through
``imagePullSecrets`` in ``helm_chart/values.yaml``. ``nvflare deploy prepare``
fills this value from ``parent.image_pull_secrets`` in ``k8s.yaml``. The
Kubernetes Secrets must already exist in the participant namespace; NVFlare does
not create registry credentials.

For example:

.. code-block:: bash

   kubectl -n "$NAMESPACE" create secret docker-registry registry-credentials \
       --docker-server=registry.example.com \
       --docker-username="$REGISTRY_USERNAME" \
       --docker-password="$REGISTRY_PASSWORD"

.. code-block:: yaml

   parent:
     docker_image: registry.example.com/nvflare:dev
     image_pull_secrets:
       - registry-credentials

This renders the parent chart value as:

.. code-block:: yaml

   imagePullSecrets:
     - name: registry-credentials

Dynamically launched job pods are not controlled by the Helm chart after
installation. For private job images, set ``job_launcher.image_pull_secrets`` in
``k8s.yaml`` before running ``nvflare deploy prepare``. The K8s launcher writes
those Secret references into each created job pod's ``spec.imagePullSecrets``.

If your cluster supports node-level registry credentials or the namespace
default ServiceAccount already has suitable image pull Secrets, you can use that
instead of explicit ``image_pull_secrets`` settings.

If a parent pod or job pod enters ``ImagePullBackOff``, inspect the pod events
with ``kubectl describe pod`` and confirm that the image name, tag, registry
credentials, and image pull policy are correct.

Helm Values Reference
=====================

``nvflare deploy prepare`` writes each participant's generated defaults to
``helm_chart/values.yaml``. The most commonly overridden values are image,
service exposure, resources, and persistence.

.. list-table::
   :header-rows: 1

   * - Value
     - Scope
     - Default source
     - Purpose
   * - ``name``
     - Server and client
     - Participant name
     - Deployment name and chart labels unless chart helpers derive another
       name.
   * - ``siteName``
     - Client
     - Participant name
     - Client UID passed to ``client_train``.
   * - ``serviceName``
     - Server and client
     - ``server_service_name`` for server; stable site name for client
     - Kubernetes Service name used by job pods to reach the parent pod.
   * - ``image.repository``
     - Server and client
     - Repository part of ``parent.docker_image``
     - Parent pod image repository.
   * - ``image.tag``
     - Server and client
     - Tag part of ``parent.docker_image``
     - Parent pod image tag. If empty, the repository value is used as-is.
   * - ``image.pullPolicy``
     - Server and client
     - ``IfNotPresent`` for server, ``Always`` for client
     - Parent pod image pull policy.
   * - ``imagePullSecrets``
     - Server and client
     - ``parent.image_pull_secrets`` rendered as ``[{name: ...}]``
     - Parent pod image pull Secret references. The Secrets must already exist
       in the release namespace.
   * - ``serviceAccount.create``
     - Server and client
     - ``true``
     - Creates a ServiceAccount for the parent pod.
   * - ``serviceAccount.annotations``
     - Server and client
     - ``{}``
     - Adds annotations to the generated ServiceAccount.
   * - ``serviceAccount.automountServiceAccountToken``
     - Server and client
     - ``true``
     - Must remain enabled when the parent launcher uses in-cluster
       Kubernetes API access.
   * - ``rbac.create``
     - Server and client
     - ``true``
     - Creates the Role and RoleBinding needed to create job pods and startup
       Secrets.
   * - ``podAnnotations``
     - Server and client
     - ``{}``
     - Adds annotations to the parent pod template.
   * - ``securityContext``
     - Server and client
     - ``parent.pod_security_context`` or ``{}``
     - Parent pod security context.
   * - ``resources``
     - Server and client
     - ``parent.resources`` or CPU ``2`` and memory ``8Gi`` requests
     - Parent pod resource requests and limits.
   * - ``persistence.workspace.claimName``
     - Server and client
     - ``parent.workspace_pvc`` or ``nvflws``
     - Workspace PVC mounted by the parent pod.
   * - ``persistence.workspace.volumeName``
     - Server and client
     - ``workspace``
     - Internal volume name in the parent pod manifest.
   * - ``persistence.workspace.mountPath``
     - Server and client
     - ``parent.workspace_mount_path``
     - In-container workspace mount path.
   * - ``fedLearnPort``
     - Server
     - Server ``fed_learn_port`` from provisioning, or ``8002``
     - FL server port exposed by the server Service and parent container.
   * - ``adminPort``
     - Server
     - Server ``admin_port`` when distinct from ``fedLearnPort``; otherwise
       ``null``
     - Admin port exposed by the server Service and parent container.
   * - ``parentPort``
     - Server
     - ``parent.parent_port`` or ``8102``
     - Internal parent Service port for server job pods.
   * - ``port``
     - Client
     - ``parent.parent_port`` or ``8102``
     - Internal parent Service port for client job pods.
   * - ``hostPortEnabled``
     - Server
     - ``false``
     - Adds ``hostPort`` for ``fedLearnPort`` and ``adminPort`` on the server
       parent pod. Useful for some single-node clusters.
   * - ``tcpConfigMapEnabled``
     - Server
     - ``false``
     - Emits a MicroK8s nginx ingress TCP-services ConfigMap mapping the FL
       ports to the server Service. Useful only on MicroK8s clusters that use
       the nginx ingress addon.
   * - ``service.type``
     - Server
     - ``ClusterIP``
     - Server Service type, for example ``LoadBalancer``.
   * - ``service.loadBalancerIP``
     - Server
     - ``null``
     - Optional static load-balancer IP when supported by the cluster.
   * - ``service.annotations``
     - Server and client
     - ``{}``
     - Adds annotations to the generated Service.
   * - ``command``
     - Server and client
     - ``parent.python_path``
     - Parent container command.
   * - ``args``
     - Server and client
     - Generated by ``nvflare deploy prepare``
     - Parent process module and runtime arguments. Override only when you know
       how the FLARE parent process is launched.

Launcher RBAC
=============

The generated chart creates a ServiceAccount and namespace-scoped
Role/RoleBinding by default. The launcher needs permission to:

* create, delete, get, list, and watch pods;
* create, get, and update Secrets.

The Secret permission is required because the launcher creates or updates a
per-site startup-kit Secret for dynamically launched job pods. Job pods mount
that Secret read-only at ``<workspace_mount_path>/startup``. The Secret name
uses this pattern:

.. code-block:: text

   nvflare-startup-<rfc1123-site-name>-<8-char-sha256-prefix>

``<rfc1123-site-name>`` is the site name with non-RFC1123 characters replaced.
The suffix, which is the first 8 characters of a SHA-256 digest, is always
appended, even for site names that are already RFC1123-compliant. Look up the
Secret name with the ``grep`` example below rather than constructing it. Service
and Deployment names, in contrast, track the site name directly when it is
already DNS-label compliant (lowercase alphanumerics and hyphens, starting and
ending with an alphanumeric character, up to 63 characters).

For example, inspect startup-kit Secrets with:

.. code-block:: bash

   kubectl -n "$NAMESPACE" get secret | grep nvflare-startup

If your cluster operator disables ``serviceAccount.create`` or ``rbac.create``
in chart values, provide equivalent API access in the same namespace before job
submission. The parent pod must run with a ServiceAccount that can create job
pods and create or update startup-kit Secrets.
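
Whether a ServiceAccount has the permissions listed above can be checked with
``kubectl auth can-i``. The sketch below is illustrative:
``check_launcher_rbac`` is a hypothetical helper name, the ServiceAccount name
is something you look up yourself (for example with ``kubectl get sa``), and the
caller needs permission to impersonate that ServiceAccount.

.. code-block:: bash

   # Hypothetical helper: check each launcher permission for a ServiceAccount.
   # Returns non-zero and names the denied verbs if any check fails.
   check_launcher_rbac() {
       local namespace="$1" service_account="$2" check verb resource failed=0
       local subject="system:serviceaccount:${namespace}:${service_account}"
       for check in create:pods delete:pods get:pods list:pods watch:pods \
                    create:secrets get:secrets update:secrets; do
           verb="${check%%:*}"
           resource="${check##*:}"
           if ! kubectl -n "$namespace" auth can-i "$verb" "$resource" \
                   --as="$subject" >/dev/null; then
               echo "denied: $verb $resource" >&2
               failed=1
           fi
       done
       return "$failed"
   }

   # Usage: check_launcher_rbac "$NAMESPACE" <service-account-name>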

Configure Kubernetes Job Pods
=============================

Job pod settings live in the submitted job's ``meta.json`` under
``launcher_spec``. The ``default`` block applies to all sites and a site-specific
block overrides it:

.. code-block:: json

   {
     "launcher_spec": {
       "default": {
         "k8s": {
           "image": "registry.example.com/nvflare-job:latest",
           "python_path": "/usr/local/bin/python3",
           "cpu": "2",
           "memory": "8Gi",
           "ephemeral_storage": "8Gi"
         }
       },
       "site-1": {
         "k8s": {
           "image": "registry.example.com/site-1-job:latest",
           "cpu_request": "1",
           "memory_request": "4Gi"
         }
       }
     },
     "resource_spec": {
       "site-1": {
         "num_of_gpus": 1
       }
     }
   }

Supported ``launcher_spec[site][k8s]`` keys include:

* ``image``: container image for the job pod. This is required, either in
  ``launcher_spec.default.k8s`` or in the site-specific ``k8s`` block.
* ``python_path``: Python executable inside the job image. If omitted, the
  launcher uses ``job_launcher.default_python_path`` from the prepared site
  runtime config.
* ``cpu`` and ``memory``: container limits. When ``cpu_request`` or
  ``memory_request`` is omitted, the request matches the corresponding limit.
* ``cpu_request`` and ``memory_request``: optional requests when the request
  should be lower than the limit.
* ``ephemeral_storage``: Kubernetes quantity string for the job workspace
  ``emptyDir.sizeLimit`` and the container ``ephemeral-storage`` request and
  limit. Set this in ``launcher_spec.default.k8s`` or in a site-specific
  ``launcher_spec[site].k8s`` block. If omitted, the built-in launcher default
  is used. The current ``deploy prepare`` runtime config does not expose
  ``job_launcher.ephemeral_storage`` as a ``k8s.yaml`` setting.
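
One way to read the override rule is as a key-by-key overlay of the site block
on top of the ``default`` block. The sketch below is illustrative Python, not
the launcher's implementation: ``effective_k8s_settings`` is a hypothetical
helper, and the shallow per-key merge is an assumption inferred from the example
``meta.json`` (the text above only states that the site block overrides the
default).

.. code-block:: python

   def effective_k8s_settings(meta: dict, site: str) -> dict:
       """Resolve the k8s launcher settings that would apply to ``site``."""
       launcher_spec = meta.get("launcher_spec", {})
       settings = dict(launcher_spec.get("default", {}).get("k8s", {}))
       # Assumption: site-specific keys replace default keys one by one.
       settings.update(launcher_spec.get(site, {}).get("k8s", {}))

       # ``image`` must come from either the default or the site block.
       if "image" not in settings:
           raise ValueError(f"no k8s image configured for site {site!r}")
       return settings


   meta = {
       "launcher_spec": {
           "default": {
               "k8s": {
                   "image": "registry.example.com/nvflare-job:latest",
                   "cpu": "2",
                   "memory": "8Gi",
               }
           },
           "site-1": {
               "k8s": {
                   "image": "registry.example.com/site-1-job:latest",
                   "cpu_request": "1",
               }
           },
       }
   }

   # site-1 gets its own image and request plus the inherited limits;
   # site-2 has no block of its own and falls back to the default.
   print(effective_k8s_settings(meta, "site-1"))
   print(effective_k8s_settings(meta, "site-2"))

Under this reading, a site block only needs to list the keys that differ from
the default.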

Job pods are created with ``imagePullPolicy: Always``, so the kubelet checks
the registry every time a job pod starts. Retagged images take effect on the
next job, but every submitted job triggers at least one registry request per
site. For private registries, factor this into rate limits and
registry-credential plumbing. Use ``job_launcher.image_pull_secrets`` in
``k8s.yaml`` when dynamically launched job pods need explicit image pull
Secrets.

``resource_spec`` remains scheduler-facing. New jobs should place K8s launcher
settings in ``launcher_spec`` and resource requests such as ``num_of_gpus`` in
``resource_spec``. The launcher writes ``resource_spec[site].num_of_gpus`` as
both the ``nvidia.com/gpu`` request and limit.
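
To make the mapping concrete, the following Python sketch builds a container
``resources`` section from the settings described in this section. It is an
illustration under stated assumptions, not the launcher's code:
``build_resources`` is a hypothetical helper, and it leaves out
``ephemeral-storage`` when the key is absent because the built-in default value
is not stated here.

.. code-block:: python

   def build_resources(k8s: dict, num_of_gpus: int = 0) -> dict:
       """Build a Kubernetes container ``resources`` mapping (illustrative)."""
       limits, requests = {}, {}

       for key in ("cpu", "memory"):
           if key in k8s:
               limits[key] = k8s[key]
               # An omitted request matches the corresponding limit.
               requests[key] = k8s.get(f"{key}_request", k8s[key])

       if "ephemeral_storage" in k8s:
           # The same quantity is used for the request and the limit.
           limits["ephemeral-storage"] = k8s["ephemeral_storage"]
           requests["ephemeral-storage"] = k8s["ephemeral_storage"]

       if num_of_gpus > 0:
           # num_of_gpus becomes both the nvidia.com/gpu request and limit.
           limits["nvidia.com/gpu"] = str(num_of_gpus)
           requests["nvidia.com/gpu"] = str(num_of_gpus)

       return {"limits": limits, "requests": requests}


   print(build_resources({"cpu": "2", "memory": "8Gi", "cpu_request": "1"}, 1))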

GPU requests require the NVIDIA GPU Operator or NVIDIA device plugin on the
target cluster. For MIG, make sure the device plugin exposes a resource that the
launcher requests. The built-in launcher writes ``nvidia.com/gpu`` for
``num_of_gpus``; clusters that expose only profile-specific resources such as
``nvidia.com/mig-1g.5gb`` require cluster configuration or launcher
customization to request those resource names.

Reprovisioning and Upgrades
===========================

Provisioned certificates, local config, server communication settings, and
prepared Kubernetes parent-Service settings are tied to the provisioned project
state. If you change ``project.yml``, server host names, ports, participants,
or ``k8s.yaml`` settings:

#. Run ``nvflare provision`` again.
#. Run ``nvflare deploy prepare`` again for every affected participant.
#. Back up any PVC content you need to keep before restaging. On the server
   workspace PVC, that usually includes ``transfer/`` (admin uploads), the
   site directory holding job history and snapshots, and any log files at the
   workspace root. Client workspace PVCs typically have little to preserve
   beyond optional logs.
#. Replace the staged ``startup/`` and ``local/`` folders on the participant's
   workspace PVC. Remove stale copies first so old certificates or config files
   do not remain.
#. Run ``helm upgrade`` for the affected release.

Do not reuse an old staged ``startup/`` or ``local/`` folder after
reprovisioning.

Troubleshooting
===============

PVC stays ``Pending``
---------------------

Check that the cluster has a default storage class, or add an explicit
``storageClassName`` under each PVC ``spec``:

.. code-block:: bash

   kubectl get storageclass
   kubectl -n "$NAMESPACE" describe pvc nvflws
   kubectl -n "$NAMESPACE" describe pvc nvfldata

Use ``storageClassName: ""`` only when binding to a pre-created PersistentVolume
without a dynamic storage class.

Parent pod has ``ImagePullBackOff``
-----------------------------------

Confirm that the parent image exists and that the cluster can pull it:

.. code-block:: bash

   kubectl -n "$NAMESPACE" describe pod -l app.kubernetes.io/instance=server
   kubectl -n "$NAMESPACE" describe pod -l app.kubernetes.io/instance=site-1

Check the rendered image:

.. code-block:: bash

   helm -n "$NAMESPACE" get values server --all
   helm -n "$NAMESPACE" get values site-1 --all

For private registries, configure node credentials or add image pull secrets as
described in `Private Registry and Image Pull Secrets`_.

Parent pod cannot find ``startup`` or ``local``
-----------------------------------------------

This usually means that the prepared kit was copied to the wrong level in the
PVC, or that the wrong PVC is mounted. The configured workspace mount path must
contain:

.. code-block:: text

   <workspace_mount_path>/startup
   <workspace_mount_path>/local

With the example default ``workspace_mount_path``, those paths are:

.. code-block:: text

   /var/tmp/nvflare/workspace/startup
   /var/tmp/nvflare/workspace/local

Use the helper pod from `Workspace PVC`_ to inspect ``/mnt/nvflws`` and restage
``startup/`` and ``local/`` from the prepared folder.
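
The layout check can also be scripted. The sketch below is a minimal Python
example, assuming Python is available wherever the PVC content is visible (the
helper pod image may not provide it) and that the directory you pass is the
root that maps to the workspace mount path. ``missing_kit_dirs`` is a
hypothetical helper, not an NVFlare command.

.. code-block:: python

   from pathlib import Path

   # Folders that must sit directly under the workspace mount path.
   REQUIRED_DIRS = ("startup", "local")


   def missing_kit_dirs(workspace_root: str) -> list[str]:
       """Return the required kit folders that are absent under the root."""
       root = Path(workspace_root)
       return [name for name in REQUIRED_DIRS if not (root / name).is_dir()]


   # An empty list means both folders are staged at the expected level.
   print(missing_kit_dirs("/mnt/nvflws"))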

Parent starts but cannot launch job pods
----------------------------------------

Check the parent logs for Kubernetes import or authorization failures:

.. code-block:: bash

   kubectl -n "$NAMESPACE" logs deploy/server --tail=200
   kubectl -n "$NAMESPACE" auth can-i create pods \
       --as=system:serviceaccount:"$NAMESPACE":server
   kubectl -n "$NAMESPACE" auth can-i create secrets \
       --as=system:serviceaccount:"$NAMESPACE":server

If the logs show that the ``kubernetes`` Python package is missing, rebuild the
parent image with the NVFlare ``K8S`` extra or ``pip install kubernetes``.

If the logs show ``SSLCertVerificationError`` with
``CA cert does not include key usage extension``, the parent Kubernetes client
is rejecting the cluster API-server CA. This is known to affect some MicroK8s
CA certificates that omit the X.509 ``keyUsage`` extension; see
`canonical/microk8s#4864 <https://github.com/canonical/microk8s/issues/4864>`__.
Regenerate or replace the cluster CA with an RFC 5280-compliant CA. As a
temporary compatibility workaround for development clusters, use a custom
parent image based on Python 3.12 or earlier. Do not disable Kubernetes API TLS
verification in production.
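
The Python version matters because Python 3.13 turns on strict X.509 checks in
its default TLS context. As a quick diagnostic, the sketch below reports
whether the interpreter in the parent image enables
``ssl.VERIFY_X509_STRICT`` by default. It assumes the Kubernetes client follows
Python's default verification flags, which may not hold for every client
configuration; ``default_context_is_strict`` is a hypothetical helper.

.. code-block:: python

   import ssl
   import sys


   def default_context_is_strict() -> bool:
       """Report whether the default client context uses strict X.509 checks."""
       context = ssl.create_default_context()
       return bool(context.verify_flags & ssl.VERIFY_X509_STRICT)


   print(
       f"Python {sys.version_info.major}.{sys.version_info.minor}: "
       f"strict X.509 verification = {default_context_is_strict()}"
   )

This only inspects interpreter defaults. It does not change verification
behavior and does not validate the cluster CA itself.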

Job pod stays ``Pending`` or ``Unknown``
----------------------------------------

When a submitted job cannot start because a server job (SJ) or client job (CJ)
pod remains ``Pending`` or ``Unknown`` longer than
``job_launcher.pending_timeout`` seconds, NVFlare deletes the stuck pod and
marks the job as ``FINISHED:EXECUTION_EXCEPTION``. Check cluster scheduling
events:

.. code-block:: bash

   kubectl -n "$NAMESPACE" get pods
   kubectl -n "$NAMESPACE" describe pod <job-pod-name>
   kubectl -n "$NAMESPACE" get events --sort-by=.lastTimestamp

Common causes include insufficient CPU, memory, GPU, or ephemeral storage;
missing study-data PVCs; image pull failures; and missing GPU device-plugin
resources.

Job pod cannot pull its image
-----------------------------

Job pods use the image from the submitted job's ``launcher_spec`` and set
``imagePullPolicy: Always``. Confirm the job image name and configure registry
credentials for dynamically launched pods. Use
``job_launcher.image_pull_secrets`` in ``k8s.yaml`` for explicit Secret
references, or rely on node-level credentials or the namespace default
ServiceAccount if your cluster is configured that way.

Client cannot connect to the server
-----------------------------------

Verify these items:

* ``default_host`` in ``project.yml`` matches the DNS name used by the client.
* The DNS name resolves from the client cluster.
* The server cluster exposes ``fed_learn_port``.
* The server certificate includes the DNS name in ``host_names``.
* Network policy and firewalls allow outbound client traffic to the server.

Run a DNS check from the client cluster:

.. code-block:: bash

   kubectl -n "$NAMESPACE" run dns-test --rm -it --restart=Never \
       --image=busybox:1.36 -- \
       nslookup server1.example.com

If you change ``default_host`` or ``host_names``, reprovision, restage the
updated folders, and redeploy the charts.
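
DNS resolution alone does not show that the port is reachable. As a
complementary check, the Python sketch below attempts a plain TCP connection
from wherever it runs. ``can_reach`` is a hypothetical helper; it tests
name resolution and TCP reachability only, and does not validate TLS or the
server certificate.

.. code-block:: python

   import socket


   def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
       """Return True if ``host`` resolves and accepts a TCP connection."""
       try:
           # create_connection resolves the name and tries each address.
           with socket.create_connection((host, port), timeout=timeout):
               return True
       except OSError as exc:  # DNS failures, refusals, and timeouts
           print(f"cannot reach {host}:{port}: {exc}")
           return False

For example, call ``can_reach("server1.example.com", 8002)`` from a pod in the
client cluster, where ``8002`` is a placeholder for the ``fed_learn_port``
value in your ``project.yml``.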

Uninstall
=========

To stop a participant installed by Helm:

.. code-block:: bash

   helm uninstall server -n "$NAMESPACE"
   helm uninstall site-1 -n "$NAMESPACE"

Delete the namespace only if it is dedicated to this deployment:

.. code-block:: bash

   kubectl delete namespace "$NAMESPACE"

Depending on the storage class reclaim policy, PVC-backed volumes may remain
after deleting Helm releases or namespaces. Remove retained volumes only after
confirming that the startup kits, logs, snapshots, job history, and study data
no longer need to be preserved.