Hello TensorFlow ================ This example demonstrates how to use `NVIDIA FLARE `_ with TensorFlow to train an image classifier using federated averaging (`FedAvg `_). TensorFlow serves as the deep learning training framework in this example. For detailed documentation, see the `Hello TensorFlow `_ example page. We recommend using the `NVIDIA TensorFlow docker `_ for GPU support. If GPU is not required, a Python virtual environment is sufficient. To run this example with the FLARE API, refer to the `hello_world notebook <../hello_world.ipynb>`_. Run NVIDIA TensorFlow Container ------------------------------- Ensure the `NVIDIA container toolkit `_ is installed. Then execute the following command: .. code-block:: bash docker run --gpus=all -it --rm -v [path_to_NVFlare]:/NVFlare nvcr.io/nvidia/tensorflow:xx.xx-tf2-py3 NVIDIA FLARE Installation ------------------------- For complete installation instructions, visit `Installation `_. .. code-block:: bash pip install nvflare clone the example code from GitHub: .. code-block:: bash git clone https://github.com/NVIDIA/NVFlare.git Navigate to the hello-tf directory: .. code-block:: bash git switch cd examples/hello-world/hello-tf Install the dependencies: .. code-block:: bash pip install -r requirements.txt Code Structure -------------- .. code-block:: text hello-pt | |-- client.py # client local training script |-- model.py # model definition |-- job.py # job recipe that defines client and server configurations |-- requirements.txt # dependencies Data ---- This example uses the `MNIST `_ handwritten digits dataset, which is loaded within the trainer code. Model ----- The `model.py` file defines a simple neural network using TensorFlow's Keras API. The `Net` model is a sequential architecture designed for image classification, featuring: - **Flatten Layer**: Prepares input data for dense layers. - **Dense Layer**: 128 units with ReLU activation for non-linearity. - **Dropout Layer**: 20% dropout rate to mitigate overfitting. - **Output Layer**: 10 units for classifying MNIST digits. This model is used in federated learning with NVIDIA FLARE, trained across clients using the FedAvg algorithm. .. literalinclude:: ../../../examples/hello-world/hello-tf/model.py :language: python :linenos: :caption: model code (model.py) :lines: 14- Client Code ----------- The client code ``client.py`` is responsible for training. The training code closely resembles standard PyTorch training code, with additional lines to handle data exchange with the server. .. literalinclude:: ../../../examples/hello-world/hello-tf/client.py :language: python :linenos: :caption: client code (client.py) :lines: 14- Server Code ----------- In federated averaging, the server code aggregates model updates from clients, following a scatter-gather workflow pattern. This example uses the default federated averaging algorithm provided by NVFlare, eliminating the need for custom server code. Job Recipe Code --------------- The job recipe includes `client.py` and the built-in FedAvg algorithm. .. literalinclude:: ../../../examples/hello-world/hello-tf/job.py :language: python :linenos: :caption: job recipe (job.py) :lines: 14- Run the Experiment ------------------ Execute the script using the job API to create the job and run it with the simulator: .. code-block:: bash TF_FORCE_GPU_ALLOW_GROWTH=true TF_GPU_ALLOCATOR=cuda_malloc_async python3 job.py Access the Logs and Results --------------------------- Find the running logs and results inside the simulator's workspace: .. code-block:: bash $ ls /tmp/nvflare/jobs/workdir Notes on Running with GPUs -------------------------- When using GPUs, TensorFlow attempts to allocate all available GPU memory at startup. To prevent this in multi-client scenarios, set the following flags: .. code-block:: bash TF_FORCE_GPU_ALLOW_GROWTH=true TF_GPU_ALLOCATOR=cuda_malloc_async If you have more GPUs than clients, consider running one client per GPU using the `--gpu` argument during simulation, e.g., `nvflare simulator -n 2 --gpu 0,1 [job]`.