---
title: "The Hitchhiker's Guide to Kubernetes Platforms: Don’t Panic, Just Launch!"
weight: 7
tags:
  - platform
  - scaling
  - operators
  - dx
---

This talk looks at bootstrapping platforms with KServe, specifically for AI workflows.

## Scenario

* Deploy AI workloads, sometimes consisting of multiple parts
* Models are stored in a model registry

## Baseline

* Consistent APIs throughout the platform
* Not the kube API directly b/c:
  * The kube API is a bit overwhelming for data scientists
  * Not only Kubernetes (also monitoring tools, feedback tools, etc.)
  * Better debugging experience for specific workloads

## The debugging API

* Specific API with enhanced statuses and consistent UX across code and UI
* Example endpoints: Pods, Deployments, InferenceServices
* Provides a status summary -> consistent health info across all related resources
  * Example: Deployments have progress/availability, Pods have phases, containers have readiness -> how should each of these be interpreted?
  * Evaluation: Progressing, available count vs. readiness, ReplicaFailure, Pod phase, container readiness
  * The rules themselves may be pretty complex, but since users don't have to check them themselves, the resulting status is simple

### Debugging Metrics

* Dashboards (utilization, throughput, latency)
* Events
* Logs

## Deployment API

* Launchpad: Just select your model and version -> The DB (dock) stores all manifests (Spaceship)
* Manifests relate to models from a model registry
* Multi-tenancy is implemented using k8s namespaces
* Kine is used to replace/extend etcd with the relational dock DB -> the namespace<->manifest relation is stored here and RBAC can be used
* Launchpad: Select a namespace and check resource (fuel) availability/utilization

### Cluster maintenance

* Deployments can be launched to multiple clusters (even two clusters at once) -> HA through identical clusters
* The exact same manifests get deployed to both clusters
* Cluster desired state is stored externally to enable effortless upgrades, rescaling, etc.

### Versioning API

* Basically the dock DB
* CRDs are the representations of the inference manifests
* Rollbacks, promotion, and history are managed via the CRs
* Why not GitOps: internal diffs, deployment overrides, customized features

### UX

* User-driven API design
* Customized tools
* Everything gets replicated 1:1 for HA
* Large onboarding guide
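
As an illustration of the status summary described under the debugging API, the following is a minimal Python sketch of how Deployment conditions, Pod phases, and container readiness might be collapsed into a single value. The talk did not show the platform's actual rules, so the names `PodView`, `DeploymentView`, and `summarize`, the four summary values, and the precedence between the rules are all assumptions made for illustration. Only the condition types (`Available`, `Progressing`, `ReplicaFailure`) and the Pod phases come from Kubernetes itself.

```python
from dataclasses import dataclass, field


@dataclass
class PodView:
    # Kubernetes pod phase: Pending, Running, Succeeded, Failed, or Unknown.
    phase: str
    # One readiness flag per container in the pod.
    containers_ready: list[bool] = field(default_factory=list)


@dataclass
class DeploymentView:
    desired_replicas: int
    available_replicas: int
    # Deployment condition type -> whether its status is "True".
    conditions: dict[str, bool] = field(default_factory=dict)
    pods: list[PodView] = field(default_factory=list)


def summarize(dep: DeploymentView) -> str:
    """Collapse the individual signals into one (hypothetical) summary status."""
    # Hard failures take precedence over everything else.
    if dep.conditions.get("ReplicaFailure", False):
        return "Failed"
    if any(pod.phase == "Failed" for pod in dep.pods):
        return "Failed"

    # Compare the available count against the desired count, then check
    # that every pod is running with all of its containers ready.
    all_available = dep.available_replicas >= dep.desired_replicas
    all_ready = all(
        pod.phase == "Running" and all(pod.containers_ready)
        for pod in dep.pods
    )
    if all_available and all_ready:
        return "Healthy"

    # Not fully up: distinguish an ongoing rollout from a stalled one.
    if dep.conditions.get("Progressing", False):
        return "Progressing"
    return "Degraded"


# Example: a rollout where one of two replicas is still starting.
rollout = DeploymentView(
    desired_replicas=2,
    available_replicas=1,
    conditions={"Available": False, "Progressing": True},
    pods=[PodView("Running", [True]), PodView("Pending", [False])],
)
print(summarize(rollout))  # prints "Progressing"
```

Keeping the rules in one function means the UI and the code-level API can share the same evaluation, which is one way to get the consistent health information the notes describe.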