---
title: "Performance perseverance: Taming 1000 Kubernetes clusters"
weight: 12
tags:
- platform
- cloudnativecon
---

{{% button href="https://youtu.be/ZTT8M74RD1M" style="warning" icon="video" %}}Watch talk on YouTube{{% /button %}}
{{% button href="https://static.sched.com/hosted_files/colocatedeventseu2025/d5/kubecon_2025_v4.2.pdf" style="tip" icon="person-chalkboard" %}}Slides{{% /button %}}

## History

- They started with upstream Kubernetes - the hard way
- The environment grew to over 200 production apps
- Pains: a single cluster, which meant a single point of failure and growing complexity
- What worked: developer adoption and autonomy, no vendor dependency

## Challenges

> Based on stakeholder expectations

- One tenant per cluster -> over 1000 clusters
- Release management
- Small team (3 engineers)

## Guiding principles

- Platform as a product
- Stability builds trust
- Standardization -> scalability and inter-team collaboration
- Day 2 support
- Dogfooding

## Tenancy

- One cluster per product
- Their own CLI, since devs like CLIs
- Custom operator and CRDs

## Stack

- Keopsctl? Pretty much their own cluster operator
- A simple Cluster CRD
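
To make the idea of a simple Cluster custom resource more concrete, here is a minimal sketch of what such a manifest could look like, rendered from Python. The talk notes do not record the real schema, so the API group `platform.example.com/v1alpha1` and the fields `tenant`, `kubernetesVersion` and `nodeCount` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterSpec:
    """Hypothetical desired state for one tenant cluster (all fields are assumptions)."""
    tenant: str
    kubernetes_version: str
    node_count: int = 3


def cluster_manifest(name: str, spec: ClusterSpec) -> dict:
    """Render a custom resource manifest that a cluster operator could reconcile."""
    if spec.node_count < 1:
        raise ValueError("node_count must be at least 1")
    return {
        "apiVersion": "platform.example.com/v1alpha1",  # hypothetical API group
        "kind": "Cluster",
        "metadata": {"name": name, "labels": {"tenant": spec.tenant}},
        "spec": {
            "kubernetesVersion": spec.kubernetes_version,
            "nodeCount": spec.node_count,
        },
    }


# Example: one cluster for a hypothetical "shop" product.
manifest = cluster_manifest(
    "shop-prod", ClusterSpec(tenant="shop", kubernetes_version="1.32")
)
```

Keeping the spec this small is one way to let a team of three manage many clusters: everything not listed in the resource would be decided by the operator's defaults.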

## Migration

1. Build trust in the platform
2. Support with docs, onboarding, Q&A
3. Co-create with devs while keeping an eye on day 2 -> feature-flag-based rollout
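
The notes do not say how the feature flags were implemented. One common approach, sketched here as an assumption, is a deterministic percentage gate: each cluster is hashed into a stable bucket, and a flag is enabled for clusters whose bucket falls below the current rollout percentage. The flag name `new-ingress` and the cluster names are hypothetical.

```python
import hashlib


def bucket(flag: str, cluster: str) -> int:
    """Map a (flag, cluster) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag}:{cluster}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag: str, cluster: str, percent: int) -> bool:
    """Enable the flag for clusters whose bucket falls below the rollout percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return bucket(flag, cluster) < percent


# Hypothetical fleet of 1000 tenant clusters.
clusters = [f"tenant-{i}" for i in range(1000)]
enabled_at_10 = {c for c in clusters if is_enabled("new-ingress", c, 10)}
enabled_at_50 = {c for c in clusters if is_enabled("new-ingress", c, 50)}
```

Because the bucket depends only on the flag and cluster name, raising the percentage only ever adds clusters: every cluster enabled at 10% is still enabled at 50%, which keeps a staged rollout predictable.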