NVIDIA FLARE Workspace¶
NVIDIA FLARE maintains a workspace for keeping the FL apps and execution results for different jobs
under folders with the name of the job_id
.
The following is the workspace folder structure when running NVIDIA FLARE for the server and clients.
Server¶
/some_path_on_fl_server/fl_server_workspace_root/
admin_audit.log
log.txt
startup/
authorization.json
fed_server.json
log.config
readme.txt
rootCA.pem
server_context.tenseal
server.crt
server.key
signature.pkl
start.sh
stop_fl.sh
sub_start.sh
transfer/
aefdb0a3-6fbb-4c53-a677-b6951d6845a6/
app_server/
...
config_fed_server.json
fl_app.txt
log.txt
baaf8789-e83f-4863-b085-3ca95303e6bc/
app_server/
...
config/
config_fed_server.json
fl_app.txt
log.txt
In each job_id
folder, there is the app_server
folder that contains the NVIDIA FLARE Application that is running
on the server for this job_id
.
The log.txt
inside each job_id
folder are the loggings of this job.
While the log.txt
under server folder is the log for the server control process.
The startup
folder contains the config and the scripts to start the FL server program.
Accessing server-side workspace¶
When the job is running, each job will have a corresponding workspace under the server
folder.
When the job is finished, the server side workspace will be removed. The workspace will be saved into the JobStorage.
You can issue the download_job [JOB_ID]
in the admin client to download the server side workspace.
The downloaded workspace will be in [DOWNLOAD_DIR]/[JOB_ID]/workspace/
.
Note
If you issue download_job
before the job is finished, the workspace folder will be empty.
Client¶
/some_path_on_fl_client/fl_client_workspace_root/
log.txt
startup/
client_context.tenseal
client.crt
client.key
fed_client.json
log.config
readme.txt
rootCA.pem
signature.pkl
start.sh
stop_fl.sh
sub_start.sh
transfer/
aefdb0a3-6fbb-4c53-a677-b6951d6845a6/
app_clientA/
...
config_fed_client.json
fl_app.txt
log.txt
baaf8789-e83f-4863-b085-3ca95303e6bc/
app_clientA/
...
config/
config_fed_client.json
fl_app.txt
log.txt
In each job_id
folder, there is the app_clientname
folder that contains the NVIDIA FLARE Application that is running
on the client for this job_id
.
The log.txt
inside each job_id
folder are the loggings of this job.
While the log.txt
under client folder is the log for the client control process.
The startup
folder contains the config and the scripts to start the FL client program.
The Workspace
object is available through the FLContext.
From the Workspace, you can access each folder location accordingly
workspace = fl_ctx.get_prop(FLContextKey.WORKSPACE_OBJECT)
...
startup (optional)
provisioned content
fed_client.json
run_1
app
config (required)
configurations
custom (optional)
custom python code
other_folder (app defined)
log.txt
job_meta.json
...
Args:
root_dir: root directory of the workspace
site_name: site name of the workspace
config_folder: where to find required config inside an app
"""
self.root_dir = root_dir
self.site_name = site_name
self.config_folder = config_folder
# check to make sure the workspace is valid
if not os.path.isdir(root_dir):
raise RuntimeError(f"invalid workspace {root_dir}: it does not exist or not a valid dir")
startup_dir = self.get_startup_kit_dir()
if not os.path.isdir(startup_dir):
raise RuntimeError(
f"invalid workspace {root_dir}: missing startup folder '{startup_dir}' or not a valid dir"
)
site_dir = self.get_site_config_dir()
if not os.path.isdir(site_dir):
raise RuntimeError(
f"invalid workspace {root_dir}: missing site config folder '{site_dir}' or not a valid dir"
)
def _fallback_path(self, file_names: [str]):
for n in file_names:
f = self.get_file_path_in_site_config(n)
if os.path.exists(f):
return f
return None
def get_authorization_file_path(self):
return self._fallback_path(
[WorkspaceConstants.AUTHORIZATION_CONFIG, WorkspaceConstants.DEFAULT_AUTHORIZATION_CONFIG]
)
def get_resources_file_path(self):
return self._fallback_path([WorkspaceConstants.RESOURCES_CONFIG, WorkspaceConstants.DEFAULT_RESOURCES_CONFIG])
def get_job_resources_file_path(self):
return self.get_file_path_in_site_config(WorkspaceConstants.JOB_RESOURCES_CONFIG)
def get_log_config_file_path(self):
return self._fallback_path([WorkspaceConstants.LOGGING_CONFIG, WorkspaceConstants.DEFAULT_LOGGING_CONFIG])
def get_file_path_in_site_config(self, file_basename: Union[str, List[str]]):
if isinstance(file_basename, str):
return os.path.join(self.get_site_config_dir(), file_basename)
elif isinstance(file_basename, list):
return self._fallback_path(file_basename)
else:
raise ValueError(f"invalid file_basename '{file_basename}': must be str or List[str]")
def get_file_path_in_startup(self, file_basename: str):
return os.path.join(self.get_startup_kit_dir(), file_basename)
def get_file_path_in_root(self, file_basename: str):
return os.path.join(self.root_dir, file_basename)
def get_server_startup_file_path(self):
# this is to get the full path to "fed_server.json"
return self.get_file_path_in_startup(WorkspaceConstants.SERVER_STARTUP_CONFIG)
def get_server_app_config_file_path(self, job_id):
return os.path.join(self.get_app_config_dir(job_id), WorkspaceConstants.SERVER_APP_CONFIG)
def get_client_app_config_file_path(self, job_id):
return os.path.join(self.get_app_config_dir(job_id), WorkspaceConstants.CLIENT_APP_CONFIG)
def get_client_startup_file_path(self):
# this is to get the full path to "fed_client.json"
return self.get_file_path_in_startup(WorkspaceConstants.CLIENT_STARTUP_CONFIG)
def get_admin_startup_file_path(self):
# this is to get the full path to "fed_admin.json"
return self.get_file_path_in_startup(WorkspaceConstants.ADMIN_STARTUP_CONFIG)
def get_site_config_dir(self) -> str:
return os.path.join(self.root_dir, WorkspaceConstants.SITE_FOLDER_NAME)
def get_site_custom_dir(self) -> str:
return os.path.join(self.get_site_config_dir(), WorkspaceConstants.CUSTOM_FOLDER_NAME)
def get_startup_kit_dir(self) -> str:
return os.path.join(self.root_dir, WorkspaceConstants.STARTUP_FOLDER_NAME)
def get_audit_file_path(self) -> str:
return os.path.join(self.root_dir, WorkspaceConstants.AUDIT_LOG)
def get_log_file_path(self) -> str:
return os.path.join(self.root_dir, WorkspaceConstants.LOG_FILE_NAME)
def get_root_dir(self) -> str:
return self.root_dir
def get_run_dir(self, job_id: str) -> str:
return os.path.join(self.root_dir, WorkspaceConstants.WORKSPACE_PREFIX + str(job_id))
def get_app_dir(self, job_id: str) -> str:
return os.path.join(self.get_run_dir(job_id), WorkspaceConstants.APP_PREFIX + self.site_name)
def get_app_log_file_path(self, job_id: str) -> str:
return os.path.join(self.get_run_dir(job_id), WorkspaceConstants.LOG_FILE_NAME)
def get_app_error_log_file_path(self, job_id: str) -> str:
return os.path.join(self.get_run_dir(job_id), WorkspaceConstants.ERROR_LOG_FILE_NAME)
def get_app_config_dir(self, job_id: str) -> str:
return os.path.join(self.get_app_dir(job_id), self.config_folder)
def get_app_custom_dir(self, job_id: str) -> str:
return os.path.join(self.get_app_dir(job_id), WorkspaceConstants.CUSTOM_FOLDER_NAME)
def get_job_meta_path(self, job_id: str) -> str:
return os.path.join(self.get_run_dir(job_id), WorkspaceConstants.JOB_META_FILE)
def get_site_privacy_file_path(self):
return self.get_file_path_in_site_config(WorkspaceConstants.PRIVACY_CONFIG)
def get_client_custom_dir(self) -> str:
return os.path.join(self.get_site_config_dir(), WorkspaceConstants.CUSTOM_FOLDER_NAME)
def get_stats_pool_summary_path(self, job_id: str, prefix=None) -> str:
file_name = WorkspaceConstants.STATS_POOL_SUMMARY_FILE_NAME
if prefix:
file_name = f"{prefix}.{file_name}"
return os.path.join(self.get_run_dir(job_id), file_name)
def get_stats_pool_records_path(self, job_id: str, prefix=None) -> str:
file_name = WorkspaceConstants.STATS_POOL_RECORDS_FILE_NAME
if prefix:
file_name = f"{prefix}.{file_name}"
return os.path.join(self.get_run_dir(job_id), file_name)
def get_config_files_for_startup(self, is_server: bool, for_job: bool) -> list:
"""Get all config files to be used for startup of the process (SP, SJ, CP, CJ).
We first get required config files:
- the startup file (fed_server.json or fed_client.json) in "startup" folder
- resource file (resources.json.default or resources.json) in "local" folder
We then try to get resources files (usually generated by different builders of the Provision system):
- resources files from the "startup" folder take precedence
- resources files from the "local" folder are next
These extra resource config files must be json and follow the following patterns:
- *__resources.json: these files are for both parent process and job processes
- *__p_resources.json: these files are for parent process only
- *__j_resources.json: these files are for job process only
Args:
is_server: whether this is for server site or client site
for_job: whether this is for job process or parent process
Returns: a list of config file names
"""
if is_server:
startup_file_path = self.get_server_startup_file_path()
else:
startup_file_path = self.get_client_startup_file_path()
resource_config_path = self.get_resources_file_path()
config_files = [startup_file_path, resource_config_path]
if for_job:
# this is for job process
job_resources_file_path = self.get_job_resources_file_path()
if os.path.exists(job_resources_file_path):
config_files.append(job_resources_file_path)
# add other resource config files
patterns = [WorkspaceConstants.RESOURCE_FILE_NAME_PATTERN]
if for_job:
patterns.append(WorkspaceConstants.JOB_RESOURCE_FILE_NAME_PATTERN)
else:
patterns.append(WorkspaceConstants.PARENT_RESOURCE_FILE_NAME_PATTERN)
# add startup files first, then local files
self._add_resource_files(self.get_startup_kit_dir(), config_files, patterns)
self._add_resource_files(self.get_site_config_dir(), config_files, patterns)
return config_files
@staticmethod
def _add_resource_files(from_dir: str, to_list: list, patterns: [str]):
for p in patterns:
files = glob.glob(os.path.join(from_dir, p))
if files:
to_list.extend(files)