nvflare.app_common.resource_managers.gpu_resource_manager module

class GPUResource(gpu_id: int, gpu_memory: int | float)

Bases: object

to_dict()

class GPUResourceManager(num_of_gpus: int, mem_per_gpu_in_GiB: int | float, num_gpu_key: str = 'num_of_gpus', gpu_mem_key: str = 'mem_per_gpu_in_GiB', expiration_period: int | float = 30)

Bases: AutoCleanResourceManager

Resource manager for GPUs.

Parameters:
  • num_of_gpus – Number of GPUs available to the resource manager.

  • mem_per_gpu_in_GiB – Memory available on each GPU, in GiB.

  • num_gpu_key – The key in the resource requirements that specifies the number of GPUs.

  • gpu_mem_key – The key in the resource requirements that specifies the memory per GPU.

  • expiration_period – Number of seconds to hold reserved resources. If check_resources is called but no allocation call follows within expiration_period seconds, the reserved resources are released.
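
The reservation and expiration behavior described by the parameters above can be illustrated with a small, self-contained sketch. This is not NVFlare's implementation and does not import NVFlare: the class ToyGPUReservations, its check and allocate methods, the integer tokens, and the injectable clock are all hypothetical names invented for this example. Only the requirement keys ('num_of_gpus', 'mem_per_gpu_in_GiB') and the default expiration_period of 30 seconds come from the signature documented here.

```python
import time


class ToyGPUReservations:
    """Hypothetical stand-in that mimics reserve-then-expire semantics.

    NOT the NVFlare GPUResourceManager; names and behavior are assumptions.
    """

    def __init__(self, num_of_gpus, mem_per_gpu_in_GiB,
                 expiration_period=30, clock=time.monotonic):
        # Free memory (GiB) per GPU id.
        self.free_mem = {gpu_id: mem_per_gpu_in_GiB for gpu_id in range(num_of_gpus)}
        self.expiration_period = expiration_period
        self.clock = clock
        self.reserved = {}  # token -> (deadline, {gpu_id: reserved GiB})
        self._next_token = 0

    def _expire(self):
        # Return memory from reservations whose deadline has passed.
        now = self.clock()
        expired = [t for t, (deadline, _) in self.reserved.items() if deadline <= now]
        for token in expired:
            _, held = self.reserved.pop(token)
            for gpu_id, mem in held.items():
                self.free_mem[gpu_id] += mem

    def check(self, requirement):
        # Try to reserve; return a token on success, None otherwise.
        self._expire()
        n = requirement["num_of_gpus"]
        mem = requirement["mem_per_gpu_in_GiB"]
        candidates = [g for g, free in self.free_mem.items() if free >= mem][:n]
        if len(candidates) < n:
            return None
        for g in candidates:
            self.free_mem[g] -= mem
        token = self._next_token
        self._next_token += 1
        deadline = self.clock() + self.expiration_period
        self.reserved[token] = (deadline, {g: mem for g in candidates})
        return token

    def allocate(self, token):
        # Turn a still-valid reservation into an allocation.
        self._expire()
        if token not in self.reserved:
            raise KeyError("reservation expired or unknown")
        _, held = self.reserved.pop(token)
        return held


# Usage with a fake clock so the example does not have to sleep.
now = [0.0]
mgr = ToyGPUReservations(2, 16, expiration_period=30, clock=lambda: now[0])
req = {"num_of_gpus": 2, "mem_per_gpu_in_GiB": 10}

first = mgr.check(req)    # reserves 10 GiB on each of the two GPUs
second = mgr.check(req)   # fails: only 6 GiB remain free per GPU
now[0] = 31.0             # the first reservation was never allocated
third = mgr.check(req)    # succeeds because the first reservation expired
allocated = mgr.allocate(third)
```

In this sketch expiration is checked lazily on each call; a real manager could just as well release expired reservations from a background task.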