.. _differential_privacy:

##############################
Differential Privacy in FLARE
##############################

Overview
========

Differential Privacy (DP) provides mathematically rigorous privacy guarantees for federated learning.
FLARE supports DP at two levels:

- **Local DP (client-side)** -- Privacy filters applied to model updates before sending to the server
- **Sample-level DP (training-time)** -- DP-SGD integration via `Opacus `_ for per-sample gradient
  clipping and noise injection during training

Both approaches can be combined with other FLARE privacy mechanisms (homomorphic encryption,
secure aggregation) for defense-in-depth.

DP-SGD with Opacus (Sample-Level DP)
=====================================

For the strongest per-sample privacy guarantees, use DP-SGD during local training.
The :doc:`Hello Differential Privacy ` example demonstrates a complete federated DP workflow:

- **Gradient Clipping** -- Per-sample gradients are clipped to bound sensitivity
- **Noise Addition** -- Calibrated Gaussian noise is added to clipped gradients
- **Privacy Accounting** -- Privacy budget (epsilon, delta) is tracked across rounds

The privacy-utility trade-off is controlled by epsilon:

- **Lower epsilon** = stronger privacy, more noise, lower accuracy
- **Higher epsilon** = weaker privacy, less noise, higher accuracy

See the full walkthrough: :doc:`Hello Differential Privacy `

Privacy-Preserving Filters (Model Update DP)
=============================================

FLARE's :ref:`filter mechanism ` lets you apply privacy transformations to model updates
before they leave the client. These filters are configured in the job definition and run automatically.

**Built-in privacy filters** (in ``nvflare.app_common.filters``):

``PercentilePrivacy``
    Implements the "largest percentile to share" policy from `Shokri & Shmatikov (CCS '15) `_.
    Only weight differences above a configurable percentile are shared; smaller values are zeroed out.

    Parameters: ``percentile`` (default 10), ``gamma`` (clipping threshold, default 0.01)

``SVTPrivacy``
    Implements the Sparse Vector Technique (SVT) for differential privacy. Uses Laplace noise and a
    threshold mechanism to selectively share weight updates, with filter-level privacy accounting for
    the SVT selection and release steps.

    Parameters: ``fraction`` (default 0.1), ``epsilon`` (default 0.1), ``noise_var`` (default 0.1)

    .. note::

        ``SVTPrivacy`` uses a practical **filter-level** privacy accountant, which is different from
        the sample-level accountant used by DP-SGD libraries such as Opacus. For stronger sample-level
        privacy accounting, use DP-SGD during local training. See :doc:`Hello Differential Privacy `
        for the Opacus-based example.

    The current implementation models one filter invocation as the composition of three pure-DP phases:

    - threshold noise with budget ``epsilon_threshold``
    - query/acceptance noise with budget ``epsilon_query``
    - release noise with budget ``epsilon_release``

    For one call, the accountant reports:

    ``epsilon_call = epsilon_threshold + epsilon_query + epsilon_release``

    Across repeated calls on the same filter instance, the accountant uses straight sequential composition:

    ``epsilon_total = sum(epsilon_call over calls)``

    This is a conservative pure-DP odometer with ``delta = 0``. It is useful for tracking the cumulative
    privacy budget spent by the filter across rounds, but it is **not** equivalent to the end-to-end
    ``(epsilon, delta)`` privacy accounting used for DP-SGD training with Opacus. In particular:

    - no subsampling amplification is assumed
    - no RDP/PRV/GDP accountant is used
    - the guarantee is scoped to this filter mechanism, not the whole training procedure
    - scalar passthrough values are not noise-protected by the SVT accountant and are flagged in the
      filter metadata

    For backward compatibility, if ``epsilon_release`` is not specified, the filter derives it from the
    legacy ``noise_var`` parameter so the release-noise scale remains unchanged.

``StatisticsPrivacyFilter``
    Applies privacy cleansing to federated statistics computations, ensuring that summary statistics
    shared across sites do not leak individual data points.

Usage Example
-------------

To add a privacy filter to a job, configure it as a ``task_result_filter`` on the client:

.. code-block:: python

    from nvflare.app_common.filters.percentile_privacy import PercentilePrivacy

    # In job configuration, add as a result filter:
    privacy_filter = PercentilePrivacy(percentile=10, gamma=0.01)

For filter configuration in job configs, see :ref:`Data Privacy & Filters `.

Combining DP with Other Privacy Mechanisms
==========================================

FLARE supports layered privacy:

- **DP + Homomorphic Encryption**: Apply DP filters before HE-encrypted aggregation for both input
  and output privacy
- **DP + Confidential Computing**: Run DP-protected training inside hardware TEEs for additional
  protection against infrastructure attacks
- **DP + Secure Aggregation**: Combine DP noise with secure aggregation protocols

See :doc:`/system_architecture/security_overview` for an overview of all security mechanisms.

Resources
=========

- :doc:`Hello Differential Privacy ` -- Complete DP-SGD example with Opacus
- :ref:`Data Privacy & Filters ` -- Filter mechanism and configuration
- :ref:`Filters Programming Guide ` -- How filters work in FLARE
- `Opacus Documentation `_ -- DP-SGD library for PyTorch
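
As a rough illustration of the filter-level accounting described for ``SVTPrivacy``, the sketch
below adds up the three per-phase budgets for one call and then sums the per-call values across
rounds. The class ``SVTOdometer`` and its methods are hypothetical names introduced only for this
example; they are not part of the FLARE API, and the budget values are arbitrary.

.. code-block:: python

    from dataclasses import dataclass, field
    from typing import List


    @dataclass
    class SVTOdometer:
        """Hypothetical pure-DP odometer (delta = 0); not a FLARE class."""

        epsilon_threshold: float  # budget for the threshold noise
        epsilon_query: float      # budget for the query/acceptance noise
        epsilon_release: float    # budget for the release noise
        calls: List[float] = field(default_factory=list)

        def epsilon_call(self) -> float:
            # One filter invocation composes the three pure-DP phases.
            return self.epsilon_threshold + self.epsilon_query + self.epsilon_release

        def record_call(self) -> float:
            # Sequential composition: every call adds its full budget.
            self.calls.append(self.epsilon_call())
            return self.epsilon_total()

        def epsilon_total(self) -> float:
            # No subsampling amplification and no RDP/PRV/GDP accounting.
            return sum(self.calls)


    # Arbitrary example budgets, three federated rounds.
    odometer = SVTOdometer(epsilon_threshold=0.05, epsilon_query=0.05, epsilon_release=0.1)
    for _ in range(3):
        odometer.record_call()

    print(f"epsilon_call  = {odometer.epsilon_call():.2f}")
    print(f"epsilon_total = {odometer.epsilon_total():.2f} (delta = 0)")

Because the total grows linearly with the number of calls, this kind of odometer is conservative:
it tracks what the filter alone has spent and says nothing about the rest of the training procedure.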