nvflare.fuel.hci.client.fl_admin_api module

class FLAdminAPI(overseer_agent: OverseerAgent, ca_cert: str = '', client_cert: str = '', client_key: str = '', upload_dir: str = '', download_dir: str = '', cmd_modules: List | None = None, user_name: str | None = None, insecure=False, debug=False, session_timeout_interval=None, session_status_check_interval=None, auto_login_max_tries: int = 5)[source]

Bases: AdminAPI, FLAdminAPISpec

FLAdminAPI serves as the foundation for communication with the FL server through the AdminAPI.

Upon initialization, FLAdminAPI starts the overseer agent to get the active server and then tries to log in. This happens in a separate thread, so code that runs afterward should check that the FLAdminAPI has logged in successfully.

Parameters:
  • ca_cert – path to CA Cert file, by default provisioned as rootCA.pem

  • client_cert – path to admin client Cert file, by default provisioned as client.crt

  • client_key – path to admin client Key file, by default provisioned as client.key

  • upload_dir – File transfer upload directory. Folders uploaded to the server to be deployed must be here. Folder must already exist and be accessible.

  • download_dir – File transfer download directory. Can be same as upload_dir. Folder must already exist and be accessible.

  • cmd_modules – command modules to load and register. Note that FileTransferModule is initialized here with upload_dir and download_dir if cmd_modules is None.

  • overseer_agent – initialized OverseerAgent to obtain the primary service provider to set the host and port of the active server

  • user_name – Username to authenticate with FL server

  • insecure – Whether to run in insecure mode, without secure communication. False by default. Before version 2.4 this argument was named poc.

  • debug – Whether to print debug messages. False by default.

  • session_timeout_interval – if specified, automatically close the session after it has been inactive for this long

  • session_status_check_interval – how often to check session status with server

  • auto_login_max_tries – maximum number of tries to auto-login.

abort(job_id: str, target_type: TargetType, targets: List[str] | None = None) FLAdminAPIResponse[source]

Issue a command to abort training.

Parameters:
  • job_id (str) – job id

  • target_type – server | client

  • targets – if target_type is client, targets can optionally be a list of client names

Returns: FLAdminAPIResponse

abort_job(job_id: str) FLAdminAPIResponse[source]

Abort a job that is running.

Parameters:

job_id (str) – the job id to abort

Returns: FLAdminAPIResponse

cat_target(target: str, options: str = None, file: str = None) FLAdminAPIResponse[source]

Issue cat command.

Sends the shell command to get the contents of the target’s specified file allowing for options that the cat command of admin client allows. The target can be “server” or a specific client name for example “site2”. The file is required and should contain the relative path to the file from the working directory of the target. The allowed options are “-n” to number all output lines, “-b” to number nonempty output lines, “-s” to suppress repeated empty output lines, and “-T” to display TAB characters as ^I.

Parameters:
  • target (str) – either server or single client’s client name.

  • options (str) – the options string as provided to the cat command for admin client.

  • file (str) – the path to the file to return the contents of

Returns: FLAdminAPIResponse
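Since only a fixed set of options is accepted, it can help to check an options string locally before sending it. The sketch below is one way to do that, assuming options are passed as separate whitespace-delimited flags; `check_options` and `CAT_OPTIONS` are hypothetical names and are not part of the API documented here.

```python
import shlex
from typing import Iterable, List, Optional

# Flags listed above as allowed for the cat command.
CAT_OPTIONS = {"-n", "-b", "-s", "-T"}


def check_options(options: Optional[str], allowed: Iterable[str]) -> List[str]:
    """Split an options string and raise ValueError on any flag not in allowed."""
    allowed_set = set(allowed)
    tokens = shlex.split(options or "")
    rejected = [t for t in tokens if t.startswith("-") and t not in allowed_set]
    if rejected:
        raise ValueError(f"unsupported option(s): {', '.join(rejected)}")
    return tokens


tokens = check_options("-n -s", CAT_OPTIONS)
```

This sketch treats a combined flag such as "-ns" as unsupported. Whether the admin client accepts combined flags is not stated here.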

check_status(target_type: TargetType, targets: List[str] | None = None) FLAdminAPIResponse[source]

Checks and returns the FL status.

If target_type is server, the call does not wait for the server to retrieve information on the clients but returns the last information the server had at the time this call is made.

If target_type is client, specific clients can be specified in targets, and this call generally takes longer than the function to just check the FL server status because this one waits for communication from the server to client then back.

Note that this is still the earlier, training-oriented check_status. A new call to get status through InfoCollector, which will be able to get information from components, is planned.

Returns: FLAdminAPIResponse

clone_job(job_id: str) FLAdminAPIResponse[source]

Clone a job that exists by copying the job contents and providing a new job_id.

Parameters:

job_id (str) – job id of the job to clone

Returns: FLAdminAPIResponse

delete_job(job_id: str) FLAdminAPIResponse[source]

Delete the specified job and workspace from the permanent store.

Parameters:

job_id (str) – the job id to delete

Returns: FLAdminAPIResponse

download_job(job_id: str) FLAdminAPIResponse[source]

Download the specified job in the system.

Parameters:

job_id (str) – Job id for the job to download

Returns: FLAdminAPIResponse

get_active_sp() FLAdminAPIResponse[source]

Gets the active server (service provider).

Returns: FLAdminAPIResponse

get_available_apps_to_upload()[source]

get_connected_client_list() FLAdminAPIResponse[source]

A convenience function to get a list of the clients currently connected to the FL server.

Operates through the check status server call. Note that this returns the client list based on the last known statuses on the server, so it can be possible for a client to be disconnected and not yet removed from the list of connected clients.

Returns: FLAdminAPIResponse

get_working_directory(target: str) FLAdminAPIResponse[source]

Gets the workspace root directory of the specified target.

Parameters:

target (str) – either server or single client’s client name.

Returns: FLAdminAPIResponse

grep_target(target: str, options: str = None, pattern: str = None, file: str = None) FLAdminAPIResponse[source]

Issue grep command.

Sends the shell command to grep the contents of the target’s specified file allowing for options that the grep command of admin client allows. The target can be “server” or a specific client name for example “site2”. The file is required and should contain the relative path to the file from the working directory of the target. The pattern is also required. The allowed options are “-n” to print line number with output lines, “-i” to ignore case distinctions, and “-b” to print the byte offset with output lines.

Parameters:
  • target (str) – either server or single client’s client name.

  • options (str) – the options string as provided to the grep command for admin client.

  • pattern (str) – the pattern to search for

  • file (str) – the path to the file to grep

Returns: FLAdminAPIResponse

list_jobs(options: str = None) FLAdminAPIResponse[source]

List the jobs in the system.

Parameters:

options (str) – the options string as provided to the list_jobs command for admin client.

Returns: FLAdminAPIResponse

list_sp() FLAdminAPIResponse[source]

Gets the information on the available servers (service providers).

Returns: FLAdminAPIResponse

ls_target(target: str, options: str = None, path: str = None) FLAdminAPIResponse[source]

Issue ls command to retrieve the contents of the path.

Sends the shell command to get the directory listing of the target allowing for options that the ls command of admin client allows. If no path is specified, the contents of the working directory are returned. The target can be “server” or a specific client name for example “site2”. The allowed options are: “-a” for all, “-l” to use a long listing format, “-t” to sort by modification time newest first, “-S” to sort by file size largest first, “-R” to list subdirectories recursively, “-u” with -l to show access time otherwise sort by access time.

Parameters:
  • target (str) – either server or single client’s client name.

  • options (str) – the options string as provided to the ls command for admin client.

  • path (str) – optionally, the path to specify (relative to the working directory of the specified target)

Returns: FLAdminAPIResponse

promote_sp(sp_end_point: str) FLAdminAPIResponse[source]

Sends command through overseer_agent to promote the specified sp_end_point to become the active server.

Parameters:

sp_end_point – service provider end point to promote to active in the form of server:fl_port:admin_port like example.com:8002:8003

Returns: FLAdminAPIResponse
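The end point string can be assembled from its three parts. This is a small sketch of the server:fl_port:admin_port form described above; `format_sp_end_point` is a hypothetical helper, not part of the API.

```python
def format_sp_end_point(host: str, fl_port: int, admin_port: int) -> str:
    """Build a service provider end point in the form server:fl_port:admin_port."""
    for name, port in (("fl_port", fl_port), ("admin_port", admin_port)):
        if not 0 < port < 65536:
            raise ValueError(f"{name} must be between 1 and 65535, got {port}")
    return f"{host}:{fl_port}:{admin_port}"


end_point = format_sp_end_point("example.com", 8002, 8003)
```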

remove_client(targets: List[str]) FLAdminAPIResponse[source]

Issue a command to remove a specific FL client or FL clients.

Note that the targets cannot be started again with an API command after they shut down. Also, you will not be able to issue admin commands through the server to a removed client until that client is restarted (this includes issuing the restart command through the API).

Parameters:

targets – a list of client names

Returns: FLAdminAPIResponse

reset_errors(job_id: str) FLAdminAPIResponse[source]

Resets the collector errors.

Parameters:

job_id (str) – job id

Returns: FLAdminAPIResponse

restart(target_type: TargetType, targets: List[str] | None = None) FLAdminAPIResponse[source]

Issue a command to restart the specified target.

If the target is server, all FL clients will be restarted as well.

Parameters:
  • target_type – server | client

  • targets – if target_type is client, targets can optionally be a list of client names

Returns: FLAdminAPIResponse

set_timeout(timeout: float) FLAdminAPIResponse[source]

Sets the timeout for admin commands on the server in seconds.

This timeout is the maximum amount of time the server will wait for replies from clients. If the timeout is too short, the server may not receive a response because clients may not have a chance to reply.

Parameters:

timeout – timeout in seconds of admin commands to set on the server

Returns: FLAdminAPIResponse

show_errors(job_id: str, target_type: TargetType, targets: List[str] | None = None) FLAdminAPIResponse[source]

Gets and shows errors from the Info Collector.

Parameters:
  • job_id (str) – job id

  • target_type – server | client

  • targets – if target_type is client, targets can optionally be a list of client names

Returns: FLAdminAPIResponse

show_stats(job_id: str, target_type: TargetType, targets: List[str] | None = None) FLAdminAPIResponse[source]

Gets and shows stats from the Info Collector.

Parameters:
  • job_id (str) – job id

  • target_type – server | client

  • targets – if target_type is client, targets can optionally be a list of client names

Returns: FLAdminAPIResponse

shutdown(target_type: TargetType, targets: List[str] | None = None) FLAdminAPIResponse[source]

Issue a command to stop FL entirely for the specified target: the server, or one or more specific FL clients.

Note that the targets cannot be started again with an API command after they shut down.

Parameters:
  • target_type – server | client

  • targets – if target_type is client, targets can optionally be a list of client names

Returns: FLAdminAPIResponse

shutdown_system() FLAdminAPIResponse[source]

submit_job(job_folder: str) FLAdminAPIResponse[source]

Submit a job.

Assumes job folder is in the upload_dir set in API init.

Parameters:

job_folder (str) – name of the job folder in upload_dir to submit

Returns: FLAdminAPIResponse
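Because the job folder is expected to be inside upload_dir, a caller may want to confirm that before submitting. The following is a sketch of such a local check. `resolve_job_folder` is a hypothetical helper, and it does not reproduce any validation the API itself may perform.

```python
import os


def resolve_job_folder(upload_dir: str, job_folder: str) -> str:
    """Return the absolute path of job_folder inside upload_dir, or raise."""
    root = os.path.realpath(upload_dir)
    path = os.path.realpath(os.path.join(root, job_folder))
    # Reject names that resolve to upload_dir itself or escape it (e.g. "../x").
    if path == root or os.path.commonpath([root, path]) != root:
        raise ValueError(f"{job_folder!r} is not a folder inside {upload_dir!r}")
    if not os.path.isdir(path):
        raise FileNotFoundError(f"job folder not found: {path}")
    return path
```

A caller could run this check first and then pass the same job_folder name to submit_job.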

tail_target_log(target: str, options: str = None) FLAdminAPIResponse[source]

Returns the end of target’s log allowing for options that the tail of admin client allows.

The option “-n” can be used to specify the number of lines for example “-n 100”, or “-c” can specify the number of bytes.

Parameters:
  • target (str) – either server or single client’s client name.

  • options (str) – the options string as provided to the tail command for admin client. For this command, “-n” can be used to specify the number of lines for example “-n 100”, or “-c” can specify the number of bytes.

Returns: FLAdminAPIResponse

wait_until_client_status(interval: int = 10, timeout: int = None, callback: Callable[[FLAdminAPIResponse, List | None], bool] = default_client_status_handling_cb, fail_attempts: int = 6, **kwargs) FLAdminAPIResponse[source]

This is similar to wait_until_server_status() and is an example of using other information from a repeated call, in this case check_status(TargetType.CLIENT). Custom code can be written to use any data available from any call to decide how to proceed. Make sure the conditions will be met at some point, or set a timeout and handle potential errors with logic outside this function. Otherwise this may loop indefinitely.

Parameters:
  • interval – in seconds, the time between consecutive checks of the server

  • timeout – if set, the maximum amount of time this function will run before returning a response message

  • callback – the reply from check_status(TargetType.CLIENT) will be passed to the callback, along with any additional kwargs, which can go on to perform additional logic.

  • fail_attempts – number of consecutive failed attempts of getting the status before returning with ERROR_RUNTIME.

Returns: FLAdminAPIResponse

wait_until_server_stats(interval: int = 10, timeout: int = None, callback: Callable[[FLAdminAPIResponse, List | None], bool] = default_stats_handling_cb, fail_attempts: int = 6, **kwargs) FLAdminAPIResponse[source]

This is similar to wait_until_server_status() and is an example of using other information from a repeated call, in this case show_stats(TargetType.SERVER). Custom code can be written to use any data available from any call to decide how to proceed. Make sure the conditions will be met at some point, or set a timeout and handle potential errors with logic outside this function. Otherwise this may loop indefinitely.

Parameters:
  • interval – in seconds, the time between consecutive checks of the server

  • timeout – if set, the maximum amount of time this function will run before returning a response message

  • callback – the reply from show_stats(TargetType.SERVER) will be passed to the callback, along with any additional kwargs, which can go on to perform additional logic.

  • fail_attempts – number of consecutive failed attempts of getting the server status before returning with ERROR_RUNTIME.

Returns: FLAdminAPIResponse

wait_until_server_status(interval: int = 20, timeout: int = None, callback: Callable[[FLAdminAPIResponse, List | None], bool] = default_server_status_handling_cb, fail_attempts: int = 3, **kwargs) FLAdminAPIResponse[source]

Wait until provided callback returns True.

There is the option to specify a timeout and interval to check the server status. If no callback function is provided, the default callback returns True when the server status is “training stopped”. A custom callback can be provided to add logic to handle checking for other conditions. A timeout should be set in case there are any error conditions that result in the system being stuck in a state where the callback never returns True.

Parameters:
  • interval (int) – in seconds, the time between consecutive checks of the server

  • timeout (int) – if set, the maximum amount of time this function will run before returning a response message

  • callback – the reply from check_status(TargetType.SERVER) will be passed to the callback, along with any additional kwargs, which can go on to perform additional logic.

  • fail_attempts (int) – number of consecutive failed attempts of getting the server status before returning with ERROR_RUNTIME.

Returns: FLAdminAPIResponse
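The interaction of interval, timeout, callback, and fail_attempts described above can be illustrated with a simplified polling loop. This is a sketch of the general pattern, not NVFlare's implementation. Replies are modeled as plain dicts, and the "SUCCESS" and "TIMEOUT" status strings, the "server_status" key, and the names `poll_until`, `fake_check`, and `stopped` are assumptions made for illustration. Only ERROR_RUNTIME and the "training stopped" status come from the text above.

```python
import time
from typing import Any, Callable, Dict, Optional


def poll_until(
    check: Callable[[], Dict[str, Any]],
    callback: Callable[..., bool],
    interval: float = 20.0,
    timeout: Optional[float] = None,
    fail_attempts: int = 3,
    **kwargs: Any,
) -> Dict[str, Any]:
    """Call check() every interval seconds until callback(reply, **kwargs) is True."""
    start = time.monotonic()
    failures = 0
    while True:
        reply = check()
        if reply.get("status") == "SUCCESS":
            failures = 0  # only consecutive failures count
            if callback(reply, **kwargs):
                return {"status": "SUCCESS", "details": reply.get("details")}
        else:
            failures += 1
            if failures >= fail_attempts:
                return {"status": "ERROR_RUNTIME", "details": "too many failed checks"}
        if timeout is not None and time.monotonic() - start >= timeout:
            return {"status": "TIMEOUT", "details": "waited until timeout"}
        time.sleep(interval)


# Canned replies standing in for repeated status checks against a server.
_replies = iter([
    {"status": "SUCCESS", "details": {"server_status": "training"}},
    {"status": "ERROR", "details": None},
    {"status": "SUCCESS", "details": {"server_status": "training stopped"}},
])


def fake_check() -> Dict[str, Any]:
    return next(_replies)


def stopped(reply: Dict[str, Any], **kwargs: Any) -> bool:
    # Extra kwargs let the caller parameterize the condition being waited on.
    return reply["details"]["server_status"] == kwargs.get("wanted", "training stopped")


result = poll_until(fake_check, stopped, interval=0.01, timeout=5.0)
```

In this sketch a single failed check does not end the wait; only fail_attempts consecutive failures do, and the timeout bounds the loop when the callback never returns True.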

write_error(error: str) None[source]

Internally used to handle errors from FileTransferModule.

default_client_status_handling_cb(reply: FLAdminAPIResponse) bool[source]

default_server_status_handling_cb(reply: FLAdminAPIResponse, **kwargs) bool[source]

default_stats_handling_cb(reply: FLAdminAPIResponse) bool[source]

wrap_with_return_exception_responses(func)[source]

Decorator on all FLAdminAPI calls to handle any raised exceptions and return the fitting error status.
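The general shape of such a decorator can be sketched as follows. This is an illustration of the pattern only, not the library's code: responses are modeled as plain dicts, every exception is mapped to a single ERROR_RUNTIME status, and `return_exception_responses` and `divide` are hypothetical names.

```python
import functools
from typing import Any, Callable, Dict


def return_exception_responses(func: Callable[..., Dict[str, Any]]) -> Callable[..., Dict[str, Any]]:
    """Wrap func so raised exceptions become error responses instead of propagating."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Dict[str, Any]:
        try:
            return func(*args, **kwargs)
        except Exception as e:
            # A real implementation would likely choose the status per exception type.
            return {"status": "ERROR_RUNTIME", "details": f"{type(e).__name__}: {e}"}

    return wrapper


@return_exception_responses
def divide(a: float, b: float) -> Dict[str, Any]:
    return {"status": "SUCCESS", "details": a / b}


ok = divide(4, 2)
failed = divide(1, 0)
```

With this pattern, callers can inspect the status of every response rather than wrapping each call in try/except.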