Migration Guide
This guide covers API and configuration changes when upgrading between FLARE releases.
Upgrading from 2.7.2 to 2.8.0
Python and Removed Legacy Surfaces
FLARE 2.8.0 targets Python 3.10 through 3.14. Python 3.9 is no longer listed as a supported development target.
The deprecated FLAdminAPI surface has been removed. Use the FLARE API, Recipe
API, Client API, and nvflare CLI workflows for new automation.
HA/Overseer code has also been removed from the 2.8 branch.
Client API Subprocess Timeout Validation
Subprocess-mode Client API jobs now validate two large-model safety settings at job initialization:
download_complete_timeout must not be None. The subprocess must stay alive after send_to_peer() ACKs so the server can finish downloading tensors from the subprocess DownloadService.
max_resends must not be None when using ClientAPILauncherExecutor. Unlimited resends can turn one delayed large-model transfer into an unbounded series of replacement download transactions.
If your 2.7.x job explicitly set either value to None, update it before
running on 2.8.0. Recipe-based external-process jobs already serialize the
default max_resends=3 in executor args, so the following setting is only
needed when overriding a previous explicit None or choosing a different
retry budget:
recipe.add_client_config({
    "download_complete_timeout": 1800,
    "max_resends": 3,  # finite non-negative integer; 0 disables retries
})
For large tensor or NumPy payloads, also keep the related streaming timeouts
consistent. If you explicitly raise tensor_streaming_per_request_timeout or
np_streaming_per_request_timeout, set PEER_READ_TIMEOUT and
download_complete_timeout to values at least as large as the configured
streaming per-request timeout, and keep tensor_min_download_timeout or
np_min_download_timeout at least as large as the same value.
recipe.add_client_config({
    "tensor_streaming_per_request_timeout": 600,
    "tensor_min_download_timeout": 600,
    "PEER_READ_TIMEOUT": 600,
    "download_complete_timeout": 1800,
    "max_resends": 3,
})
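The two rules above (no explicit None, and related timeouts at least as large as the streaming per-request timeout) can be checked before a job is submitted. The following is a minimal sketch, not FLARE's own validation: the function name check_subprocess_timeouts is hypothetical, and it assumes the settings are available as a plain dict using the key names shown above. Keys that are absent are skipped, because their defaults are not known to this helper.

```python
from typing import Any, Dict, List


def check_subprocess_timeouts(cfg: Dict[str, Any]) -> List[str]:
    """Return human-readable problems found in a client config dict."""
    problems: List[str] = []

    # An explicit None is rejected for both settings; a missing key is left
    # alone here because the default would apply.
    for key in ("download_complete_timeout", "max_resends"):
        if key in cfg and cfg[key] is None:
            problems.append(f"{key} must not be None")

    # max_resends, when given, should be a non-negative integer (0 = no retries).
    resends = cfg.get("max_resends")
    if resends is not None:
        is_int = isinstance(resends, int) and not isinstance(resends, bool)
        if not is_int or resends < 0:
            problems.append("max_resends must be a non-negative integer")

    # Related timeouts should not be smaller than the per-request timeout.
    for prefix in ("tensor", "np"):
        per_request = cfg.get(f"{prefix}_streaming_per_request_timeout")
        if per_request is None:
            continue  # not explicitly raised, nothing to compare against
        related = (
            "PEER_READ_TIMEOUT",
            "download_complete_timeout",
            f"{prefix}_min_download_timeout",
        )
        for key in related:
            value = cfg.get(key)
            if value is not None and value < per_request:
                problems.append(
                    f"{key}={value} is below "
                    f"{prefix}_streaming_per_request_timeout={per_request}"
                )
    return problems
```

Calling it on the dict you intend to pass to recipe.add_client_config and failing fast on a non-empty result keeps the mistake out of the job run.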
Late Retry Handling for Finished Download Refs
FLARE 2.8.0 makes finished DownloadService refs retry-safe for the same
requester. If a client completed a large download but retries because the final
EOF response was delayed, the server returns the same terminal status instead of
INVALID_REQUEST / no ref found. This is an internal reliability fix and
does not require job-code changes, but it is most effective when the subprocess
timeouts above are configured consistently for very large models.
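To make the retry-safe behavior concrete, here is a toy model of the idea. It is not FLARE's DownloadService: the class ToyDownloadRefs, its method names, and the "IN_PROGRESS" and "EOF" status strings are all hypothetical; only INVALID_REQUEST comes from the description above. The point it illustrates is that a finished ref keeps its terminal status for the requester that completed it.

```python
from typing import Dict, Tuple


class ToyDownloadRefs:
    """Toy model of retry-safe finished refs (illustration only)."""

    def __init__(self) -> None:
        self._active: Dict[str, str] = {}  # ref_id -> requester
        self._finished: Dict[Tuple[str, str], str] = {}  # (ref_id, requester) -> status

    def open(self, ref_id: str, requester: str) -> None:
        self._active[ref_id] = requester

    def finish(self, ref_id: str, status: str) -> None:
        # Remember the terminal status instead of forgetting the ref entirely.
        requester = self._active.pop(ref_id)
        self._finished[(ref_id, requester)] = status

    def request(self, ref_id: str, requester: str) -> str:
        if self._active.get(ref_id) == requester:
            return "IN_PROGRESS"
        # A late retry from the same requester sees the same terminal status.
        terminal = self._finished.get((ref_id, requester))
        if terminal is not None:
            return terminal
        return "INVALID_REQUEST"
```

In this model a different requester, or an unknown ref, still receives INVALID_REQUEST, which matches the "same requester" scoping described above.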
Upcoming Main-Branch Changes
FLARE API Compatibility Note
On the current main branch, NoConnection now subclasses Python’s built-in
ConnectionError instead of directly subclassing Exception.
Impact:
Existing code that catches ConnectionError will now also catch NoConnection.
Existing code that catches NoConnection continues to work unchanged.
If your application distinguishes FLARE connection failures from broader OS or
network exceptions, review any broad except ConnectionError: handlers before
upgrading to the next release built from main.
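The effect on handler ordering can be shown with a local stand-in. In this sketch NoConnection is defined in place so the example runs without FLARE installed; it is not the real import, and the classify helper is hypothetical. The narrower handler has to come first, otherwise the broad except ConnectionError: clause absorbs FLARE connection failures.

```python
class NoConnection(ConnectionError):
    """Local stand-in mirroring the new base class (not the real FLARE class)."""


def classify(exc: Exception) -> str:
    """Report which handler would catch the given exception."""
    try:
        raise exc
    except NoConnection:  # narrower handler first
        return "flare"
    except ConnectionError:  # OS and network failures, e.g. ConnectionResetError
        return "network"
    except Exception:
        return "other"
```

If the two connection handlers were swapped, classify(NoConnection()) would return "network", which is the behavior change to look for in existing code.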
FLARE API Lifecycle Restriction
On the current main branch, Session.shutdown and Session.restart are now
restricted to TargetType.SERVER only.
Impact:
Existing callers that pass TargetType.ALL or TargetType.CLIENT will now fail.
Server-scoped lifecycle control continues to work unchanged.
Session.shutdown_system is unchanged and still supports whole-system shutdown.
For whole local PoC lifecycle control, use the PoC start/stop flow instead of the general system admin API.
CLI Startup Kit Resolution Change
On the current main branch, server-connected CLI commands use a shared
active startup kit registry in ~/.nvflare/config.conf.
Impact:
Use nvflare config add <id> <startup-kit-dir> and nvflare config use <id> to register and activate a startup kit.
nvflare config -d/--startup_kit_dir remains accepted for compatibility with 2.7.x scripts, but is deprecated.
NVFLARE_STARTUP_KIT_DIR remains an automation override and takes precedence over the active registry entry when set.
nvflare config -jt/--job_templates_dir remains accepted for compatibility with 2.7.x scripts, but job template config is deprecated.
Root nvflare config continues to manage local settings such as the POC workspace. Startup kit paths are managed by the nvflare config subcommands.
If you use shell profiles or CI settings that export NVFLARE_STARTUP_KIT_DIR,
review them before upgrading because they override the active registry entry.
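The precedence rule can be summarized in a few lines. This is a sketch of the documented ordering only, not the CLI's actual resolution code: the function resolve_startup_kit is hypothetical, and treating an empty NVFLARE_STARTUP_KIT_DIR as unset is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_startup_kit(
    active_registry_path: Optional[str],
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Pick the startup kit directory: environment override first, then registry."""
    override = env.get("NVFLARE_STARTUP_KIT_DIR")
    if override:  # assumption: an empty value counts as "not set"
        return override
    return active_registry_path
```

A CI job can run the same comparison to warn when an exported NVFLARE_STARTUP_KIT_DIR differs from the kit activated with nvflare config use.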
CLI Config Flag Compatibility
On the current main branch, nvflare config keeps the 2.7.x POC
workspace flag names.
Impact:
-pw and --poc_workspace_dir remain the supported flags for setting the POC workspace.
The interim development-only --poc.workspace spelling is not part of the public compatibility contract.
If you have older scripts that use -pw or --poc_workspace_dir, they
continue to work.
Client Disable Semantics
nvflare system remove-client is not exposed as a supported public CLI
command. The legacy interactive-console remove_client command is hidden
from normal help and remains a registry cleanup operation only: it releases
the active token so the client can register again. It does not stop the client
process, revoke credentials, or prevent reconnect.
Use the new durable access-control commands when the intent is to keep a client out of the federation:
nvflare system disable-client <client> --force persists a disabled flag in the server workspace, removes any active registry entry, and rejects later registration or heartbeat from that client.
nvflare system enable-client <client> --force clears the disabled flag so the client can rejoin on the next registration or heartbeat.
This is operational disablement, not certificate revocation.
Study Name Validation Relaxation
On the current main branch, study names now allow underscores in internal
positions, so names such as my_study are valid.
Impact:
project.yml validation now accepts study names with internal underscores.
Login and study-scoped authorization paths will accept the same names.
If you maintain external validation or naming policy around study identifiers, update those checks to match the new rule before upgrading.
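For an external naming check, the update might look like the sketch below. Only the internal-underscore rule comes from this guide; the rest of the pattern (lowercase letters, digits, and internal hyphens) is an assumption made for illustration, and the name is_valid_study_name is hypothetical. Compare it against the validation rule in your FLARE version before relying on it.

```python
import re

# Assumed character set: lowercase letters and digits, with hyphens and
# underscores allowed only between the first and last character.
_STUDY_NAME = re.compile(r"[a-z0-9](?:[a-z0-9_-]*[a-z0-9])?")


def is_valid_study_name(name: str) -> bool:
    """Accept names such as my_study; reject leading or trailing underscores."""
    return _STUDY_NAME.fullmatch(name) is not None
```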
Site Log Configuration Restriction
On the current main branch, Session.configure_site_log and the corresponding
nvflare system log-config path now accept only simple log levels and built-in
log modes.
Impact:
JSON dictConfig payloads are no longer accepted for site-wide log changes.
File-path based logging configs are no longer accepted for site-wide log changes.
Supported values remain the standard log levels plus built-in modes such as concise, msg_only, full, verbose, and reload.
If you previously used advanced JSON/file-based configs with
configure_site_log, switch to the supported level/mode values before
upgrading to the next release built from main.
For dict-based or file-path logging, use configure_job_log on a running job instead.
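Scripts that build the log argument dynamically can screen values before calling configure_site_log. The helper below is a hypothetical pre-check, not part of FLARE: the mode set is copied from the examples in this section and may not be exhaustive, and case-insensitive matching is an assumption.

```python
# Python's standard logging level names.
STANDARD_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}
# Built-in modes named in this guide; the real list may be longer.
BUILT_IN_MODES = {"concise", "msg_only", "full", "verbose", "reload"}


def is_simple_log_config(value: str) -> bool:
    """Return True for a plain level or built-in mode, False for JSON or paths."""
    text = value.strip()
    return text.upper() in STANDARD_LEVELS or text.lower() in BUILT_IN_MODES
```

Values that fail this check, such as a JSON dictConfig string or a file path, are the ones to route to configure_job_log instead.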
POC Start Default Service Clarification
On the current main branch, the documented default behavior of
nvflare poc start is clarified to reflect the actual runtime behavior:
the default start set is the server plus client services, not every
participant directory under the workspace.
Impact:
Running nvflare poc start with no explicit -p/--service starts the server and clients.
Admin consoles are not started unless explicitly selected.
This is a documentation/help clarification, not a runtime behavior change.
Upgrading from 2.7.0/2.7.1 to 2.7.2
Recipe API Changes
initial_model renamed to model
The initial_model parameter in all recipes has been renamed to model for clarity:
# Before (2.7.0/2.7.1)
recipe = FedAvgRecipe(
    ...
    initial_model=SimpleNetwork(),
)

# After (2.7.2)
recipe = FedAvgRecipe(
    ...
    model=SimpleNetwork(),
)
The model parameter now also accepts dict-based configuration with optional pretrained checkpoint:
recipe = FedAvgRecipe(
    ...
    model={"path": "my_module.MyModel", "args": {"hidden_size": 256}},
    initial_ckpt="pretrained.pt",
)
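A dict of this shape names a class by its dotted path and supplies constructor keyword arguments. The sketch below shows one common way such a spec can be turned into an object; it is an illustration of the general pattern, not FLARE's loading code, and build_from_spec is a hypothetical helper.

```python
import importlib
from typing import Any, Dict


def build_from_spec(spec: Dict[str, Any]) -> Any:
    """Import the class named by spec["path"] and call it with spec["args"]."""
    module_name, _, class_name = spec["path"].rpartition(".")
    module = importlib.import_module(module_name)
    cls = getattr(module, class_name)
    return cls(**spec.get("args", {}))


# Standard-library example so the sketch runs anywhere.
half = build_from_spec(
    {"path": "fractions.Fraction", "args": {"numerator": 1, "denominator": 2}}
)
```

One practical consequence of this pattern is that the module in "path" must be importable wherever the object is constructed.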
PTFedAvgEarlyStopping merged into PTFedAvg
PTFedAvgEarlyStopping has been merged into PTFedAvg with InTime aggregation support.
A backward-compatible alias is provided, but new code should use PTFedAvg:
# Before
from nvflare.app_opt.pt.fedavg_early_stopping import PTFedAvgEarlyStopping
controller = PTFedAvgEarlyStopping(...)
# After
from nvflare.app_opt.pt.fedavg import PTFedAvg
controller = PTFedAvg(...)
MONAI Integration
The separate nvflare-monai wheel package is deprecated. Use the Client API directly
for MONAI integration. See the updated examples in examples/advanced/monai/ and the
MONAI Migration Guide.
New Features (No Migration Required)
The following 2.7.2 features work automatically with no code changes:
TensorDownloader: Transparent memory optimization for PyTorch model weight transfer. See FLARE Tensor Downloader.
Server-side memory cleanup: Automatic garbage collection and heap trimming. See Memory Management.
Backward Compatibility
Job Config API: Existing FedJob-based configurations continue to work alongside the new Recipe API.
Config-based Jobs: JSON/YAML configuration-based jobs continue to work as before.
Executor/ModelLearner APIs: Still functional but no longer the recommended pattern. Use Recipe API + Client API for new projects.
For the full list of changes, see the What’s New in 2.7.2 release notes.
Upgrading from 2.5/2.6 to 2.7
FLARE 2.7.0 introduced several major changes:
Job Recipe API (technical preview): A higher-level API for creating FL jobs. See NVFlare Job Recipe.
Client API is now the recommended pattern for all new FL jobs.
Hierarchical FL: New relay-based communication hierarchy for large-scale deployments. See Hierarchical FLARE.
Edge & Mobile: Federated training on mobile devices (iOS/Android) with ExecuTorch. See Mobile Federated Training (iOS / Android).
File Streaming: Pull-based file download for large model transfers. See FLARE File Streaming.
For migrating from the older FLAdminAPI to the FLARE API, see Migrating to FLARE API.
For the full list of 2.7.0 changes, see What’s New in FLARE v2.7.0.