---
title: Building a Confidential AI Inference Platform on Kubernetes
weight: 9
tags:
 - security
 - ai
---

<!-- {{% button href="https://youtu.be/rkteV6Mzjfs" style="warning" icon="video" %}}Watch talk on YouTube{{% /button %}} -->
<!-- {{% button href="https://docs.google.com/presentation/d/1nEK0CVC_yQgIDqwsdh-PRihB6dc9RyT-" style="tip" icon="person-chalkboard" %}}Slides{{% /button %}} -->

> Felt a bit like a showcase of their product's architecture - not bad, just nothing really to take home

Background: How do we protect the data flowing into and out of our AI models?

## Goals

- Cloud-based inference API
- E2E Encryption
- E2E Attestation

## Encryption Mechanisms

- Idea: Combine encryption of data at rest, data in transit and data in use (encrypted memory)
- Attestation: The CPU holds a private key and uses it to issue certificates
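
To make the attestation idea more concrete, here is a minimal sketch of a verifier checking a signed report. It was not shown in the talk. All names (`HARDWARE_KEY`, `issue_report`, `verify_report`) are hypothetical, and HMAC is only a stdlib stand-in for the asymmetric signature that real hardware would produce with its private key.

```python
import hashlib
import hmac
import json
import secrets

# Hypothetical stand-in for the key held by the CPU. Real hardware uses an
# asymmetric key pair; HMAC is used here only to keep the sketch stdlib-only.
HARDWARE_KEY = secrets.token_bytes(32)


def measure(workload: bytes) -> str:
    """Hash of the launched workload, playing the role of a launch measurement."""
    return hashlib.sha256(workload).hexdigest()


def issue_report(workload: bytes, nonce: str) -> dict:
    """Simulated CPU side: sign the measurement together with the verifier's nonce."""
    body = {"measurement": measure(workload), "nonce": nonce}
    payload = json.dumps(body, sort_keys=True).encode()
    signature = hmac.new(HARDWARE_KEY, payload, hashlib.sha256).hexdigest()
    return {"body": body, "signature": signature}


def verify_report(report: dict, expected_measurement: str, nonce: str) -> bool:
    """Verifier side: check the signature, the freshness nonce and the measurement."""
    payload = json.dumps(report["body"], sort_keys=True).encode()
    expected_sig = hmac.new(HARDWARE_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected_sig, report["signature"]):
        return False  # report was not signed by the expected key
    if report["body"]["nonce"] != nonce:
        return False  # stale or replayed report
    return hmac.compare_digest(report["body"]["measurement"], expected_measurement)
```

The nonce is there so that an old report cannot be replayed. In a real setup the verifier would only need the hardware vendor's public certificate chain, not a shared secret.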

## Confidential Containers

- Traditional: Full VM-based isolation
- Kubernetes: Advanced container isolation using virtual sockets and much more
- Implementation: Frameworks like Contrast

### Threat model

- Isolated: Container
- Shared: Kubernetes, Hypervisor, Cloud Infra, Hardware

### Architecture

```mermaid
graph LR
User
User-->|Accesses with trust|AICode
User-->|Key exchange|SecretService-->|Key exchange|AICode
Manifest-->|Configure|ContrastCoordinator
subgraph Cluster
    ContrastCoordinator(Contrast Coordinator)
    ContrastCoordinator-->|Verify|Worker
    subgraph Worker
        AICode(AI Code)
        AttestationAgent
    end
    AICode-->|Accesses|GPU
    AttestationAgent-->|Verify|GPU
    SecretService
end
ContrastCoordinator-->|Attest|User
```
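
The flow in the diagram can be sketched roughly as follows. This is a toy model, not Contrast's actual API: the classes `Coordinator` and `SecretService`, their methods, and the rule that keys are only released to verified workers are assumptions made for illustration.

```python
import hashlib
import secrets
from dataclasses import dataclass, field


@dataclass
class Coordinator:
    """Hypothetical coordinator, configured with a manifest of allowed measurements."""
    allowed_measurements: set
    verified_workers: set = field(default_factory=set)

    def verify(self, worker_id: str, measurement: str) -> bool:
        # Accept the worker only if its measurement is listed in the manifest.
        if measurement in self.allowed_measurements:
            self.verified_workers.add(worker_id)
            return True
        return False


@dataclass
class SecretService:
    """Hypothetical secret service that hands out keys to verified workers only."""
    coordinator: Coordinator
    model_key: bytes = field(default_factory=lambda: secrets.token_bytes(32))

    def release_key(self, worker_id: str) -> bytes:
        if worker_id not in self.coordinator.verified_workers:
            raise PermissionError(f"worker {worker_id!r} has not been verified")
        return self.model_key


# Usage: the manifest lists the hash of the expected AI code image.
manifest = {hashlib.sha256(b"ai-code-image").hexdigest()}
coordinator = Coordinator(allowed_measurements=manifest)
secret_service = SecretService(coordinator=coordinator)

coordinator.verify("worker-1", hashlib.sha256(b"ai-code-image").hexdigest())
key = secret_service.release_key("worker-1")  # succeeds, worker-1 was verified
```

A worker running anything other than the code listed in the manifest fails `verify` and never receives the key, which is the property the user relies on when trusting the coordinator's attestation.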