---
title: "The Hitchhiker's Guide to Kubernetes Platforms: Don’t Panic, Just Launch!"
weight: 7
tags:
 - platform
 - scaling
 - operators
 - dx
---

This talk covers bootstrapping platforms with KServe, with a focus on AI workflows.

## Scenario

* Deploy AI workloads, which sometimes consist of several parts
* Models get stored in a model registry

## Baseline

* Consistent APIs throughout the platform
* Not the kube API directly, because:
  * Data scientists are a bit overwhelmed by the kube API
  * Not only Kubernetes (also monitoring tools, feedback tools, etc.)
  * Better debugging experience for specific workloads

## The debugging API

* Specific API with enhanced statuses and consistent UX across Code and UI
* Example Endpoints: Pods, Deployments, InferenceServices
* Provides a status summary -> consistent health info across all related resources
  * Example: Deployments have progress/availability, Pods have phases, Containers have readiness -> how should each of these be interpreted?
  * Evaluation: Progressing, available count vs. readiness, ReplicaFailure, Pod phase, Container readiness
* The rules themselves may be fairly complex, but the resulting status stays simple because users don't have to evaluate the rules themselves
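
The evaluation order above could be sketched roughly as follows. This is a minimal illustration, not the platform's actual rule set: the function names, the four summary values (`Failed`, `Healthy`, `Progressing`, `Degraded`) and the order of the rules are assumptions. Only the condition types (`Progressing`, `ReplicaFailure`), the Pod `phase` and the container `ready` flag are standard Kubernetes status fields.

```python
from typing import Any


def condition_status(conditions: list[dict[str, Any]], cond_type: str) -> str:
    """Return the status ("True"/"False"/"Unknown") of one condition type."""
    for cond in conditions:
        if cond.get("type") == cond_type:
            return cond.get("status", "Unknown")
    return "Unknown"


def summarize(deployment: dict[str, Any], pods: list[dict[str, Any]]) -> str:
    """Collapse Deployment, Pod and container state into one status string.

    The rule order is a hypothetical example, not the rules from the talk.
    """
    status = deployment.get("status", {})
    conditions = status.get("conditions", [])
    desired = deployment.get("spec", {}).get("replicas", 1)
    available = status.get("availableReplicas", 0)

    # Rule 1: a ReplicaFailure condition or a failed Pod wins over everything.
    if condition_status(conditions, "ReplicaFailure") == "True":
        return "Failed"
    if any(p.get("status", {}).get("phase") == "Failed" for p in pods):
        return "Failed"

    # Rule 2: enough replicas are available and every container is ready.
    all_ready = all(
        c.get("ready", False)
        for p in pods
        for c in p.get("status", {}).get("containerStatuses", [])
    )
    if available >= desired and all_ready:
        return "Healthy"

    # Rule 3: not healthy yet, but the rollout is still making progress.
    if condition_status(conditions, "Progressing") == "True":
        return "Progressing"

    # Rule 4: neither failed, healthy nor progressing.
    return "Degraded"
```

The caller only ever sees the single returned string, which is what keeps the status simple even when the rules behind it grow.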

### Debugging Metrics

* Dashboards (Utilization, throughput, latency)
* Events
* Logs

## Deployment API

* Launchpad: Just select your model and version -> The DB (dock) stores all manifests (Spaceship)
* Manifests relate to models from a model registry
* Multi-tenancy is implemented using k8s namespaces
* Kine is used to replace/extend etcd with the relational dock DB -> the namespace<->manifest relation is stored here and RBAC can be applied
* Launchpad: Select Namespace and check resource (fuel) availability/utilization

### Cluster maintenance

* Deployments can be launched to multiple clusters (even two clusters at once) -> HA through identical clusters
* The exact same manifests get deployed to two clusters
* Cluster desired state is stored externally to enable effortless upgrades, rescaling, etc.
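
A rough sketch of the "same manifests to two clusters" idea, assuming the desired state lives outside the clusters and is simply applied to each of them. `FakeCluster`, `launch` and `in_sync` are hypothetical names for illustration; a real implementation would call each cluster's API server instead of writing to an in-memory dict.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class FakeCluster:
    """In-memory stand-in for a cluster (hypothetical, for illustration only)."""
    name: str
    applied: dict[str, dict] = field(default_factory=dict)

    def apply(self, manifest: dict) -> None:
        meta = manifest["metadata"]
        self.applied[f'{meta["namespace"]}/{meta["name"]}'] = manifest


def launch(manifest: dict, clusters: list[FakeCluster]) -> dict[str, bool]:
    """Apply an identical copy of the manifest to every cluster."""
    results: dict[str, bool] = {}
    for cluster in clusters:
        try:
            # Deep copy so that no cluster shares mutable state with another.
            cluster.apply(copy.deepcopy(manifest))
            results[cluster.name] = True
        except Exception:
            results[cluster.name] = False
    return results


def in_sync(clusters: list[FakeCluster], key: str) -> bool:
    """Check that every cluster holds the same manifest under the given key."""
    manifests = [c.applied.get(key) for c in clusters]
    return all(m is not None and m == manifests[0] for m in manifests)
```

Because the source of truth is external, a cluster that was rebuilt or upgraded can be brought back by running `launch` again with the stored manifests.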

### Versioning API

* Basically the dock DB
* CRDs are the representations of the inference manifests
* Rollbacks, promotion and history are managed via the CRs
* Why not GitOps: Internal Diffs, deployment overrides, customized features
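
To make rollbacks, promotion and history a bit more concrete, here is a minimal sketch of an append-only revision log. All names (`Revision`, `ManifestHistory`, `promote`) are hypothetical, and the real platform keeps this state in CRs backed by the dock DB rather than in Python objects. The sketch also assumes that a rollback is recorded as a new revision instead of rewriting history.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Revision:
    number: int
    manifest: dict
    note: str


@dataclass
class ManifestHistory:
    """Append-only history of one inference manifest (hypothetical model)."""
    revisions: list[Revision] = field(default_factory=list)

    def record(self, manifest: dict, note: str = "update") -> Revision:
        rev = Revision(len(self.revisions) + 1, copy.deepcopy(manifest), note)
        self.revisions.append(rev)
        return rev

    @property
    def current(self) -> Revision:
        return self.revisions[-1]

    def rollback(self, number: int) -> Revision:
        """Re-record an old revision as the newest one, keeping the full history."""
        for rev in self.revisions:
            if rev.number == number:
                return self.record(rev.manifest, note=f"rollback to {number}")
        raise ValueError(f"unknown revision {number}")


def promote(source: ManifestHistory, target: ManifestHistory) -> Revision:
    """Copy the current revision of one environment into another."""
    return target.record(source.current.manifest, note="promotion")
```

For example, recording two versions and then calling `rollback(1)` yields a third revision whose manifest equals the first, so the history stays complete and auditable.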

### UX

* User-driven API design
* Customized tools
* Everything gets 1:1 replicated for HA
* Large onboarding guide