IP Protection Security Architecture with FLARE and Confidential Computing
CVM on-prem with AMD CPU + NVIDIA GPU
The current architecture is based on Confidential VMs with AMD CPUs and NVIDIA GPUs for on-premise deployment. While the design principles are the same for Intel TDX CPUs with NVIDIA GPUs, the current implementation targets on-premise AMD/NVIDIA systems. TDX support and cloud-based deployment will be available soon.
Introduction
In an era where artificial intelligence drives critical decisions across industries, safeguarding the intellectual property (IP) of machine learning models has become paramount, particularly during inference and federated learning. These models, often the culmination of years of research, proprietary algorithms, and significant data investments, represent highly valuable assets. Both inference, typically executed on edge or client devices, and federated learning, which involves distributing model training across decentralized nodes, expose models to untrusted environments, creating substantial risks of IP theft or reverse engineering. Without robust IP protection, organizations face not only financial losses but also threats to their competitive advantage and compliance. Therefore, ensuring model confidentiality throughout both training and inference is crucial for secure deployment, responsible innovation, and sustained trust in AI systems.
Risks to model IP arise in two phases of the lifecycle: deployment time and runtime.
Deployment-Time Risks
At deployment time, the model IP is particularly vulnerable if introduced into an untrusted or unverified environment. An untrusted host or malicious host owner can intercept the model by modifying the application code, tampering with the execution environment, or delaying the activation of security mechanisms such as attestation and encryption. Without strict controls over when and how the model is decrypted or loaded, attackers can gain early access before protections are in place. This makes the deployment phase a critical point of exposure, especially in environments where hosts are not fully controlled or are operated by third parties.
Runtime Risks
Even after deployment, model IP remains exposed to runtime threats. A host system, whether trusted or compromised, can still leak the model if sufficient safeguards are not maintained. Attackers may exploit vulnerabilities to gain remote access, copy the model from memory, intercept it over the network, or extract it from disk-based checkpoints. Insider threats or physical access to a machine can also lead to data exfiltration. While VM-based Trusted Execution Environments (TEEs) provided by Confidential Computing offer strong isolation guarantees, these mechanisms are not infallible. If an attacker can directly access the CVM TEE or modify the application inside it, the TEE no longer protects the model IP. The model can leak at runtime in several ways:
Compromised participant machines
Unauthorized access to the remote training machine (via direct access or network access)
Remote access or a leak from the network
Leak from storage (such as a model checkpoint)
Design Proposal & Solution Overview
Challenge
Simply deploying applications in a Confidential VM (CVM) is insufficient to protect model IP. A comprehensive security architecture is required.
Proposed Solution
A secure deployment architecture combining:
- Specialized CVM Image
  - Hardware-backed chain of trust from hardware to application
  - Enhanced security controls for network, storage, and access
  - Measured boot and runtime attestation
- Pre-packaged Workload Container
  - FLARE training applications or inference services
  - Model weights and proprietary code
Security Guarantee
Our Minimum Viable Product (MVP) design ensures model IP remains protected throughout the entire lifecycle, from deployment through execution, even in potentially compromised environments.
Security Architecture Components
IP Protection Architecture
The high-level approach for generating a Confidential VM (CVM) image involves embedding the application workload within a secure virtual machine that leverages VM-based Trusted Execution Environment (TEE) architecture. To ensure strong security guarantees, the CVM is fully locked down—no shell access, no open ports except for explicitly whitelisted ones, and all data access restricted to encrypted disk partitions.
To protect against tampering during deployment, the boot process is anchored in Confidential Computing’s chain of trust, extending from hardware up to the application layer. Critical disk partitions are encrypted, and decryption keys are withheld until remote attestations are successfully completed. This attestation verifies both the base system and the application against expected measurements at a remote trustee service. Only after passing this check does the trustee’s key broker service release the decryption key, allowing the CVM to proceed securely.
Attestation is completed in two stages. The first stage runs during boot and covers the CPU only. Once the kernel has booted normally, the attestation service performs the second stage, which covers both the CPU and the GPU. If that attestation is verified, the normal workload starts.
Assumptions
We fully trust the individual who builds the CVM image, as well as the host machine used during the image creation process. This ensures that the CVM is constructed in a secure and controlled environment.
We trust the remote trustee service, including its integrated key broker service, to be secure and reliable. This design relies on the trustee service’s own protection mechanisms.
To verify the integrity and confidentiality of the CVM application’s boot process, we assume that CPU-based attestation at boot time is sufficient. Specifically, we rely on a one-time, hardware-backed attestation during CVM startup to establish trust, without requiring ongoing or continuous runtime verification.
Ongoing, continuous attestation is handled at the application level (for example, by NVFlare) and covers both the CPU and the GPU.
Architecture Design
Key Challenges in Securing Application-Level Integrity
By Default, Chain of Trust Stops at the Kernel: Confidential Computing’s hardware-backed chain of trust typically ends at the kernel. User-level application code is not included in the default measurement and attestation process.
Application Integrity Risk: Without extending the chain of trust to cover the application, malicious modifications can occur at boot time. This risks compromising both the application’s integrity and the overall confidentiality of the system, even if kernel-level attestation is successful.
Necessity of Application Measurement: To ensure end-to-end trust, application-level measurements must be automatically calculated by the kernel and cryptographically signed by CC-enabled hardware. Relying on external or manual hash values creates potential attack vectors.
Use Case Consideration – Disk Content Not Measured: Confidential Computing attestation is designed to measure memory-loaded components during boot. Application binaries and data stored on disk are not covered. This is not a flaw in the architecture but a challenge that must be addressed for use cases requiring full application trust.
Security Implication for Application Deployment: If the application and its associated data are not part of the attested set, the CVM cannot ensure their integrity or confidentiality—posing a significant risk for secure deployment in sensitive scenarios.
Design Approach
This design addresses the above challenges with the following approaches:
Encrypted Storage: The CVM encrypts critical storage partitions to protect sensitive code and data from unauthorized access.
Customer-Specific Key: A unique decryption key is associated with each customer and stored securely in the remote key broker service, along with the expected attestation reference values.
Attestation-Bound Key Release: The decryption key is released only upon successful CPU-based attestation, ensuring it is provided exclusively to trusted environments that match both CVM and application measurements and possess valid cryptographic signatures.
Two-Stage Attestation & Two-Stage Key Release:
- CPU verification → GPU verification (extending the chain of trust from CPU to GPU)
- Two-stage key release with partition dm-verity
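The attestation-bound key release described above can be sketched in a few lines. This is a minimal illustration, not the Trustee implementation: `CustomerRecord`, `release_disk_key`, and the `signature_valid` flag are hypothetical names. In a real deployment the attestation service verifies the hardware signature.

```python
import hmac
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CustomerRecord:
    """Hypothetical per-customer entry held by the key broker."""
    expected_measurement: str  # reference value for the CVM image
    disk_key: bytes            # key that unlocks the encrypted partition


def release_disk_key(record: CustomerRecord,
                     reported_measurement: str,
                     signature_valid: bool) -> Optional[bytes]:
    """Return the disk key only when the evidence is signed and matches."""
    if not signature_valid:
        return None
    # Constant-time comparison of the reported and expected measurements.
    matches = hmac.compare_digest(
        record.expected_measurement.encode(), reported_measurement.encode()
    )
    return record.disk_key if matches else None
```

A caller that receives `None` has no key, so the encrypted partition stays locked.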
Additional Security Hardening
Disk Security: Leverage both dm-crypt for encryption and dm-verity for integrity verification of disk partitions. Disable auto-mount.
Access Control: Disable login mechanisms, including SSH and console access, to prevent unauthorized entry into the CVM.
Network Hardening: Configure strict firewall rules and disable all unnecessary services and ports, allowing only explicitly whitelisted network access.
Reference Value and Key Storage
There are different approaches to store the reference values, leveraging:
Trustee service with remote key broker services
Trusted Platform Module (TPM)
Virtual TPM (vTPM)
For our most common deployment scenarios, we will build a CVM image on one trusted host (Host A), then distribute and deploy it to another untrusted host (Host B). In this design, we choose to use the remote trustee service.
CVM Boot-Up Process Design
Here, we use InitApp in a TEE context to enable application-level attestation, with the kernel acting as an indirect attesting environment.
Kernel as an Attesting Environment – via InitApp in TEE
Concept Overview
In a Confidential Computing environment (e.g., AMD SEV-SNP, Intel TDX), the kernel is already measured at boot time by the hardware-backed chain of trust. Rather than modifying the kernel or injecting measurement logic earlier in the boot flow, we delegate application-level attestation to a lightweight agent called InitApp, which runs in early user space—right after the kernel, but before any application workload or sensitive data is accessed.
Key Design Principles
Trusted Kernel Base
The kernel serves as the base of trust. It is measured by the TEE platform during boot, forming part of the trusted launch.
InitApp as Attesting Agent
InitApp is responsible for:
Performing application-level attestation
Interacting with the trustee service and key broker
InitApp Placement and Measurement
For proper attestation, InitApp must be embedded within the initramfs rather than placed in external locations such as /oem/initapp.
Measurement Scope
The attestation measurement must include:
Kernel
Kernel arguments (command line)
Initramfs
With AMD SEV-SNP, this is configured using the kernel-hashes=on flag.
Design Rationale
Embedding InitApp within initramfs ensures:
InitApp is loaded into kernel memory during boot
InitApp is automatically measured as part of the initramfs by the attestation SDK
No additional measurement mechanisms are required
By contrast, placing InitApp outside initramfs would bypass automatic measurement and create replay attack vulnerabilities
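The InitApp responsibilities listed above (attest, talk to the key broker, unlock the disk) could be organized roughly as follows. This is a sketch under stated assumptions, not the actual InitApp: `run_init_app` and its injected callables are hypothetical, and the real agent is described in this document as a small script inside initramfs.

```python
from typing import Callable, Optional


def run_init_app(
    collect_evidence: Callable[[], dict],
    request_key: Callable[[dict], Optional[bytes]],
    unlock_root: Callable[[bytes], None],
    log: Callable[[str], None] = print,
) -> bool:
    """Hypothetical early-user-space flow: attest, fetch key, unlock disk."""
    evidence = collect_evidence()
    # Log the measurement first so it is visible even if boot later fails.
    log(f"measurement={evidence.get('measurement')}")
    key = request_key(evidence)
    if key is None:
        log("attestation rejected; root filesystem stays locked")
        return False
    unlock_root(key)
    log("root filesystem unlocked; handing over to workload")
    return True
```

Passing the steps in as callables keeps the control flow testable without SEV-SNP hardware or a trustee service.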
QEMU Launch Example
sudo qemu-system-x86_64 \
-bios OVMF.amdsev.fd \
-initrd initrd.img \
-kernel vmlinuz \
-append "root=/dev/mapper/crypt_root rw console=ttyS0 pci=realloc,nocrs vm_id=__cvm_id__" \
-nographic \
-machine memory-encryption=sev0,vmport=off \
-object memory-backend-memfd,id=ram1,size=${MEM}G,share=true,prealloc=false \
-machine memory-backend=ram1 \
-object sev-snp-guest,id=sev0,cbitpos=${CBITPOS},reduced-phys-bits=1,policy=0x30000,kernel-hashes=on \
-vga none \
-enable-kvm -no-reboot \
-cpu EPYC-v4 \
-machine q35 -smp $CORES -m ${MEM}G,slots=2,maxmem=512G \
...
<rest of command>
In this setup:
- initrd.img is loaded into kernel memory and included in the TEE measurement, securing both InitApp and its logic.
- The AMD EPYC CPU model EPYC-v4 is used.
- The firmware image is OVMF.amdsev.fd.
- kernel-hashes=on is set on the sev-snp-guest object.
What Needs to Be Measured
When preparing a Confidential VM (CVM) image, it’s crucial to ensure that key components are measured and cryptographically verified to maintain a trusted boot process.
With TEE platforms like AMD SEV-SNP or Intel TDX, the firmware measures and includes the hashes of the following in the attestation report:
Kernel binary
Initramfs (which includes InitApp)
Kernel command-line parameters
Firmware (UEFI/BIOS)
EFI boot configuration (depending on platform and setup)
These measurements are rooted in hardware and cannot be forged by the host. Any tampering with measured components—such as modifying InitApp—will result in a different TEE measurement hash. Consequently, the Trustee will detect the mismatch and deny key release, preventing decryption of sensitive data.
Note
You do not need to sign or measure the entire CVM disk image. Focusing on these critical boot-time components is sufficient to establish a robust and verifiable chain of trust.
CVM Image Measurement
InitApp measures the CVM image using the snpguest tool. The measurement is always printed in the boot log, even when boot fails.
What does it measure:
| Component | Measured by Default | Measured with kernel-hashes=on |
|---|---|---|
| OVMF | ✅ Yes | ✅ Yes |
| Kernel (vmlinuz) | ❌ No | ✅ Yes |
| initrd/initramfs | ❌ No | ✅ Yes |
| Kernel args | ❌ No | ✅ Yes |
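The SEV-SNP launch measurement is a SHA-384 digest that covers:
OVMF + firmware state
Kernel
Initrd
Kernel command line
etc.
The platform launch policy and the guest-supplied report_data are carried as separate fields of the attestation report; they are not inputs to the launch measurement itself.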
The measurement will match exactly as long as you:
Provide the same inputs to both sev-snp-measure and the runtime SEV-SNP launch process (i.e., QEMU/KVM with SEV-SNP enabled),
Don’t introduce randomness between build and runtime (e.g., dynamic kernel arguments, timestamps, UUIDs).
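The matching condition can be illustrated with a toy digest. The function below is deliberately not the SEV-SNP launch digest algorithm (use sev-snp-measure for that). `toy_launch_digest` is a hypothetical stand-in that only shows why identical inputs reproduce the value and why any changed input breaks the match.

```python
import hashlib


def toy_launch_digest(ovmf: bytes, kernel: bytes, initrd: bytes, cmdline: str) -> str:
    """Toy stand-in for a launch measurement; NOT the SEV-SNP algorithm."""
    digest = hashlib.sha384()
    for part in (ovmf, kernel, initrd, cmdline.encode()):
        # Length-prefix each input so boundaries between parts are unambiguous.
        digest.update(len(part).to_bytes(8, "big"))
        digest.update(part)
    return digest.hexdigest()


build = toy_launch_digest(b"ovmf", b"kernel", b"initrd", "root=/dev/mapper/crypt_root rw")
launch = toy_launch_digest(b"ovmf", b"kernel", b"initrd", "root=/dev/mapper/crypt_root rw")
tampered = toy_launch_digest(b"ovmf", b"kernel", b"initrd", "root=/dev/sda1 rw")

print(build == launch)    # identical inputs give an identical digest
print(build == tampered)  # any change to a measured input changes it
```

The same reasoning explains why a kernel argument that varies per launch, such as a timestamp or UUID, would make the reference value unusable.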
Attestation Stages
Boot-Time Attestation
- Scope: CPU only
- Ensures the integrity of the CVM and the early boot process, including InitApp.
- Performed using the Trustee Service at startup.
Runtime Attestation
- Scope: CPU + GPU
- Required to protect the application workload during runtime execution.
- Likely involves an application-level attestation agent.
- FLARE integrates a Confidential Computing (CC) Manager that performs attestation at multiple stages, including runtime, to maintain trust across the system lifecycle.
Trustee Service Integration
Overview
To protect the model IP, confidential computing hardware alone is not sufficient. Additional infrastructure and services are required—most critically, the Trustee Service, which includes the following components:
Attestation Service
Key Broker Service
The Trustee Service must support CPU-level attestation across AMD, Intel, and ARM architectures during the boot process. For this design, we adopt the CNCF Confidential Containers (CoCo) Project Trustee Service and Guest components: 🔗 https://github.com/confidential-containers/trustee
Any other open-source or proprietary trustee service can also be used. This infrastructure is swappable.
Design Rationale
This design is chosen based on the following key factors:
Our main focus is on protecting the integrity and confidentiality of initApp during boot up.
The initApp is a small script that runs independently of the GPU, so GPU attestation is not required at this stage.
We need an open-source trustee service that provides both a key broker service and attestation, with basic configuration support. The CoCo Trustee Service is the only option we have found so far.
Interactions Between NVFlare and Trustee Key Broker Service (KBS)
The following block diagram shows the interaction among the NVFlare CVM, Attestation Agent (AA), Key Broker Service (KBS), Trustee, and Attestation Service (AS).
Trustee Policies
The “trustee policy” refers to the rules and configurations governing how secrets are released and how the trustworthiness of a confidential workload is verified before granting access to sensitive data. It involves two main types of policies: resource policies and attestation policies.
Resource Policies: These policies determine which secrets are released to a specific workload, typically scoped to the container. They control what secrets are available to the workload, ensuring that only necessary information is provided.
Attestation Policies: These policies define how the claims about the Trusted Computing Base (TCB) are compared to reference values to determine the trustworthiness of the workload. They specify how the attestation process verifies that the workload is running in a trusted environment.
We only need a resource policy, combined with the default attestation policy.
The resource policy can either embed the required measurement (hash value) directly or refer to the stored reference values.
Set Policy
Here is an example policy. This resource policy ensures that only a CVM whose measurement matches the given value can obtain the resource (the LUKS key).
package policy
default allow = false
allow {
    input["submods"]["cpu0"]["ear.veraison.annotated-evidence"]["snp"]["measurement"] == "Cwa8qBJimP2freTTrrpvAZVbEQEyAhPY4fZGgSn9z4qtt0CAGmcS+Otz96qQZ92k"
}
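The measurement string in the policy above is 64 base64 characters, which decodes to 48 bytes, the size of a SHA-384 digest. If your measurement tool prints the value as hex (an assumption here), a small helper can convert it and fill in the policy. `measurement_hex_to_b64` and `render_resource_policy` are hypothetical names for this sketch.

```python
import base64

POLICY_TEMPLATE = '''package policy
default allow = false
allow {{
    input["submods"]["cpu0"]["ear.veraison.annotated-evidence"]["snp"]["measurement"] == "{measurement}"
}}
'''


def measurement_hex_to_b64(measurement_hex: str) -> str:
    """Convert a hex-encoded 48-byte measurement to standard base64."""
    raw = bytes.fromhex(measurement_hex.strip())
    if len(raw) != 48:
        raise ValueError(f"expected 48 bytes (SHA-384), got {len(raw)}")
    return base64.b64encode(raw).decode("ascii")


def render_resource_policy(measurement_hex: str) -> str:
    """Fill the policy shown above with a freshly computed measurement."""
    return POLICY_TEMPLATE.format(measurement=measurement_hex_to_b64(measurement_hex))
```

Generating the policy from the build output avoids copying the hash by hand, which is the manual step the design warns against.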
The following script sets this policy in the Trustee service:
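#!/usr/bin/env bash
# Replace the placeholder with your organization's Trustee service address.
TRUSTEE_ADDRESS="<your organization trustee service address>"
PORT=8999
ROOTCA=keys/rootCA.crt
# Upload the Rego resource policy, authenticating with the admin private key.
sudo kbs-client --url "https://$TRUSTEE_ADDRESS:$PORT" --cert-file "$ROOTCA" \
    config --auth-private-key private.key \
    set-resource-policy --policy-file resource_policy.rego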
Set & Get Resource
Use the KBS client to set and get resources:
kbs-client --url https://$TRUSTEE_ADDRESS:$PORT --cert-file $ROOTCA config --auth-private-key $PRIVATE_KEY set-resource --resource-file $SECRET_FILE --path $URL_PATH
kbs-client --url https://$TRUSTEE_ADDRESS:$PORT --cert-file $ROOTCA get-resource --path $URL_PATH
Note
--path $URL_PATH: This is used for identity namespace isolation for now.
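When these commands are driven from automation rather than typed by hand, building the argument vector in code avoids shell-quoting mistakes. The sketch below only assembles the same kbs-client flags shown above and does not run anything. The address, file names, and resource path are hypothetical placeholders.

```python
import shlex
from typing import List


def kbs_base(trustee_address: str, port: int, root_ca: str) -> List[str]:
    """Common kbs-client arguments: service URL and the CA certificate."""
    return ["kbs-client", "--url", f"https://{trustee_address}:{port}",
            "--cert-file", root_ca]


def set_resource_cmd(base: List[str], private_key: str,
                     secret_file: str, url_path: str) -> List[str]:
    """Argument vector that uploads a secret under the given resource path."""
    return base + ["config", "--auth-private-key", private_key,
                   "set-resource", "--resource-file", secret_file,
                   "--path", url_path]


def get_resource_cmd(base: List[str], url_path: str) -> List[str]:
    """Argument vector that fetches the secret stored at the resource path."""
    return base + ["get-resource", "--path", url_path]


# Hypothetical values for illustration only.
base = kbs_base("trustee.example.org", 8999, "keys/rootCA.crt")
print(shlex.join(set_resource_cmd(base, "private.key", "disk.key", "site-1/cvm/root-key")))
print(shlex.join(get_resource_cmd(base, "site-1/cvm/root-key")))
```

The resulting lists can be passed directly to `subprocess.run(cmd, check=True)`.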
CVM Implementation Details
Disk Layout and Security
Disk Partitions
| Partition | Mount Point or Host Location | Contents | Encryption | Notes |
|---|---|---|---|---|
| Kernel + Initramfs | host | Kernel image, initramfs | ❌ | Tampering causes measurement change and boot failure |
| Boot Log | host | Early boot logs from initramfs and InitApp | ❌ | Allows monitoring boot failures from the host |
| Root Filesystem | /root | Full Ubuntu OS install | dm-crypt | Encrypted root filesystem |
| App Log | /applog | Application logs | ❌ | Separate image; readable after CVM shutdown |
| User Config | /user_config | User configuration directory | ❌ | Modifiable before CVM launch |
| User Data | /user_data | User-provided data | ❌ | Attached as separate image; supports NFS mount |
| Temporary Files | /tmp | Runtime temporary files (RAM) | TEE | RAM disk protected by TEE |
| Swap | N/A | N/A | N/A | Disabled |
Disk Security Measures
Mount Security
Auto-mounting is disabled to prevent unauthorized or accidental mounting of external devices.
Encryption
Root Filesystem: Encrypted using dm-crypt; decryption key released only after successful attestation
Temporary Storage: /tmp is a RAM disk protected by TEE hardware encryption
User Data: Unencrypted by design; users control data encryption externally if needed
Partition Details
Logging
bootlog - File on Host Machine
This log records the boot process and is essential during setup and debugging, especially when diagnosing boot failures. The boot log is stored on the host machine (not inside the CVM) and is writable during the boot process.
/applog - Partition on CVM Disk
This log captures application-level output (e.g., FLARE logs). It is writable to aid debugging—for instance, when investigating connectivity issues between clients and servers. The log is visible to the host and implemented as a separate image file. This allows log analysis to continue even after the CVM is shut down.
Configuration
/user_config - Partition on CVM Disk
The user_config partition is intended for user-specific configurations that could change the workload behavior. This partition is exposed to the host and can be changed outside the CVM.
For example, in FLARE applications, each site will have local configurations specific to the site, such as privacy policies or authentication configurations.
User Data Volume Configuration
User data is provided via an unencrypted drive image (user_data.qcow2) mounted at /user_data. Users can copy required data onto this drive before launching the CVM.
NFS Mount Support
For remote data access, NFS mounts are supported. The CVM will automatically mount an NFS volume if an ext_mount.conf file is present in /user_data with the following format:
$NFS_SERVER_NAME_or_IP:$EXPORT_DIR
Example:
172.31.53.113:/var/tmp/nfs_export
The NFS export will be mounted to /user_data/mnt using:
sudo mount -t nfs -o resvport $NFS_EXPORT /user_data/mnt
Note
If NAT is used in the network path, configure the NFS export as insecure:
/training_data *(rw,sync,no_subtree_check,insecure)
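The ext_mount.conf handling described above could look roughly like this. It is a sketch, not the CVM's actual mount script: `parse_ext_mount` and `mount_command` are hypothetical names, and the parser assumes a hostname or IPv4 address, because an IPv6 literal would contain extra colons.

```python
from typing import List, Tuple


def parse_ext_mount(conf_text: str) -> Tuple[str, str]:
    """Split the first non-empty line of ext_mount.conf into (server, export_dir)."""
    for line in conf_text.splitlines():
        line = line.strip()
        if line:
            server, sep, export_dir = line.partition(":")
            if not sep or not server or not export_dir.startswith("/"):
                raise ValueError(f"expected SERVER:/EXPORT_DIR, got {line!r}")
            return server, export_dir
    raise ValueError("ext_mount.conf is empty")


def mount_command(conf_text: str, target: str = "/user_data/mnt") -> List[str]:
    """Build the NFS mount argument vector for the configured export."""
    server, export_dir = parse_ext_mount(conf_text)
    return ["mount", "-t", "nfs", "-o", "resvport", f"{server}:{export_dir}", target]


print(mount_command("172.31.53.113:/var/tmp/nfs_export\n"))
```

Validating the line before mounting matters here because /user_data is writable from the host, so its contents should be treated as untrusted input.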
Access and Network Security
CVM Lockdown
The CVM is designed with comprehensive access restrictions to prevent unauthorized entry and manipulation:
Administrative Access
The system is configured to be admin-less by removing all users from the sudoers file
OS-level login is disabled entirely
SSH (sshd) is disabled
Serial console access is disabled
Network Restrictions
All network connections are authenticated and encrypted using TLS for secure communication with attestation services and application endpoints.
A strict firewall policy is enforced using iptables with whitelist-based port control for both inbound and outbound traffic:
Default Policy: All inbound and outbound ports are blocked
Inbound Whitelist: Only explicitly allowed ports for:
Application communication (e.g., FLARE server accepting client connections)
Outbound Whitelist: Only explicitly allowed ports for:
DNS resolution
Attestation services communication
Application server connections (e.g., FLARE client to server)
Experiment tracking services (e.g., MLflow)
Management or monitoring services (if configured)
This defense-in-depth approach ensures that even if an attacker gains host-level access, they cannot log in, connect remotely, or communicate through unauthorized network channels.
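A default-deny whitelist of this kind can be generated from a short port list. The following is a minimal sketch, not the rule set shipped in the CVM image. The ports are illustrative only: 8002 for an FL server, 8999 for the trustee service as in the script earlier in this document, and 53 for DNS.

```python
from typing import Iterable, List


def build_firewall_rules(
    inbound_tcp: Iterable[int],
    outbound_tcp: Iterable[int],
    outbound_udp: Iterable[int] = (),
) -> List[str]:
    """Return iptables commands for a default-deny, whitelist-only policy."""
    rules = [
        # Default policy: drop everything that is not explicitly allowed.
        "iptables -P INPUT DROP",
        "iptables -P OUTPUT DROP",
        "iptables -P FORWARD DROP",
        # Allow reply packets for connections that a whitelist rule admitted.
        "iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT",
        "iptables -A OUTPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT",
    ]
    for port in sorted(set(inbound_tcp)):
        rules.append(f"iptables -A INPUT -p tcp --dport {port} -j ACCEPT")
    for port in sorted(set(outbound_tcp)):
        rules.append(f"iptables -A OUTPUT -p tcp --dport {port} -j ACCEPT")
    for port in sorted(set(outbound_udp)):
        rules.append(f"iptables -A OUTPUT -p udp --dport {port} -j ACCEPT")
    return rules


# Illustrative ports only; substitute the ports your deployment whitelists.
server_rules = build_firewall_rules(inbound_tcp=[8002], outbound_tcp=[8999], outbound_udp=[53])
print("\n".join(server_rules))
```

Because the OUTPUT chain also defaults to DROP, the conntrack rules are what let replies to whitelisted connections leave and return.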
Application Level Security
Beyond basic CVM security, additional protection is needed at the application level. The specific measures may differ by application type.
General Security Measure
For all applications, we need the following additional security measures:
- Attestation service agent:
  - Perform self-attestation at startup using both the CPU and GPU attestation services. Boot-level attestation covers only the CPU, so the GPU must be attested as well.
  - Perform periodic self-tests to make sure the system is not compromised.
- Code-level security:
  - No dynamic code changes.
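The periodic self-test could be structured as a simple watchdog loop. This is a hedged sketch, not FLARE's CC manager: `attestation_watchdog` and its callables are hypothetical, and the real CPU and GPU verifiers would call the respective attestation services.

```python
import time
from typing import Callable, Optional


def attestation_watchdog(
    verify_cpu: Callable[[], bool],
    verify_gpu: Callable[[], bool],
    on_failure: Callable[[], None],
    interval_s: float = 300.0,
    max_checks: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Re-run CPU and GPU attestation until one fails; return checks passed."""
    passed = 0
    while max_checks is None or passed < max_checks:
        if not (verify_cpu() and verify_gpu()):
            # Stop the workload (or otherwise react) as soon as trust is lost.
            on_failure()
            break
        passed += 1
        sleep(interval_s)
    return passed
```

The interval and the failure reaction are policy choices. Stopping the workload on the first failed check is the conservative option.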
FLARE-Specific Security
Federated Learning Provision Process
Federated learning provisioning prepares the software packages (FLARE’s startup kits) for each participating organization. Clients and the server receive different startup kits. The packages are prepared on a system owned by the project admin and then distributed to each participant. The FL server must start first; each FL client site then launches its startup kit and connects to the FL server.
There are three distinct phases:
Provision processes – prepare the software artifacts (the startup kits).
Distribution process – software packages are distributed to participants.
Run-time processes – At each participant’s host machine, the participant deploys the package, starts the FL system, and establishes the communication between the FL server and the participant.
Terminology
To simplify discussions, we define the following roles:
Project Admin: The individual responsible for initiating and managing the overall project. This includes approving participants, provisioning resources, and triggering the Confidential VM (CVM) build process.
Model Owner: The entity (person or organization) that owns both the pre-trained model and the final trained model. They are primarily concerned with protecting the intellectual property of the model.
Data Owner: The entity that owns the private data used in training. Data privacy and security are their primary concerns.
Org Admin: An IT administrator from a participating organization. This person is responsible for setting up the local environment and launching the site-specific Federated Learning (FL) system instance (e.g., the FL client).
The Process
Provision process: The generated CVM image is locked down with no access, using the additional security hardening measures described above.
Distribution process: For CLI-based provisioning, customers decide the best way to distribute the CVM image file.
Deploy/start: The participant deploys the CVM image to a CC-enabled host, adds the NFS data volume needed for training, and runs the start scripts to start the system.
Note
FLARE Dashboard Support: In the current release, provisioning CVMs through the FLARE Dashboard is not supported.
FLARE Attestation Verification
FLARE’s CC manager performs three different attestations:
Self-attestation
Cross-verification between clients and server
Periodic cross-verification
FLARE Workload Execution and Access Control Policies
All training and inference code must be pre-reviewed and approved before inclusion in the workload.
The application and its dependencies are pre-installed in the workload Docker container.
Job execution is triggered by submitting a predefined job configuration; no dynamic, custom, or user-supplied code is allowed at runtime.
For IP Protection Use Cases
Only the Project Admin is authorized to download results, including the global model and logs.
Download permissions are disabled for all other users and cannot be overridden at the individual site level.
References
NVIDIA Deployment Guide for SecureAI: https://docs.nvidia.com/cc-deployment-guide-tdx.pdf
RATS architecture: https://www.rfc-editor.org/rfc/rfc9334.html
Google Confidential Space Security Overview: https://cloud.google.com/docs/security/confidential-space
Confidential Containers Trustee attestation services, solution overview and use cases: https://www.redhat.com/en/blog/introducing-confidential-containers-trustee-attestation-services-solution-overview-and-use-cases
Confidential Container Trustee: https://github.com/confidential-containers/trustee
Azure confidential computing: harden the linux image to remove sudo users: https://learn.microsoft.com/en-us/azure/confidential-computing/harden-the-linux-image-to-remove-sudo-users
Microsoft, Secure the Windows boot process: https://learn.microsoft.com/en-us/windows/security/operating-system-security/system-security/secure-the-windows-10-boot-process
Microsoft, Secure Boot (linked from the article above): https://learn.microsoft.com/en-us/windows-hardware/design/device-experiences/oem-secure-boot
SEV-SNP measurement tool: https://github.com/virtee/sev-snp-measure