kubecon25/12_many-clusters.md at 0e24bf4fd695cc0865a0af60f18806e9b80814af - kubecon25 - ODIT.Services

Nicolai Ort 0e24bf4fd6

docs: Added youtube links

2025-05-07 07:07:48 +02:00

title: Performance perseverance: Taming 1000 Kubernetes clusters
weight: 12
tags: platform, cloudnativecon

{{% button href="https://youtu.be/ZTT8M74RD1M" style="warning" icon="video" %}}Watch talk on YouTube{{% /button %}} {{% button href="https://static.sched.com/hosted_files/colocatedeventseu2025/d5/kubecon_2025_v4.2.pdf" style="tip" icon="person-chalkboard" %}}Slides{{% /button %}}

History

They started with upstream Kubernetes - the hard way
The environment grew to over 200 production apps
Pains: a single cluster, which meant a single point of failure and a lot of complexity
What worked: Dev adoption and autonomy, no vendor

Challenges

Based on stakeholder expectations

One tenant per cluster -> Over 1000 Clusters
Release management
Small team (3 Engineers)

Guiding principles

Platform as a product
Stability: trust
Standardization -> Scalability and inter-team collaboration
Day 2 support
Dogfooding

Tenancy

One cluster per product
Own CLI, because devs like CLIs
Custom operator and CRDs
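
The talk notes do not describe how the custom operator works internally. As a rough illustration of the general operator pattern only, the sketch below compares a desired cluster state with the observed one and returns the actions needed to converge. `ClusterState`, `reconcile`, and the action strings are hypothetical names, not taken from the talk.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ClusterState:
    """Hypothetical, simplified view of one product's cluster."""
    kubernetes_version: str
    node_count: int


def reconcile(desired: ClusterState, observed: Optional[ClusterState]) -> List[str]:
    """Return the actions that move the observed state towards the desired state."""
    if observed is None:
        # Nothing exists yet for this product, so the only action is to create it.
        return ["create cluster"]

    actions: List[str] = []
    if observed.kubernetes_version != desired.kubernetes_version:
        actions.append(f"upgrade to {desired.kubernetes_version}")
    if observed.node_count != desired.node_count:
        actions.append(f"scale nodes {observed.node_count} -> {desired.node_count}")
    return actions
```

An empty list means the cluster already matches its specification, which is the steady state a reconcile loop aims for.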

Stack

Keopsctl? Pretty much their own cluster operator
A Simple Cluster CRD
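
The notes do not capture the schema of the Cluster CRD. To give a feel for what a deliberately small cluster resource could look like, here is a sketch that builds such a manifest as a plain dictionary. The API group `platform.example.com/v1alpha1` and the spec fields `product`, `kubernetesVersion`, and `nodeCount` are assumptions made up for illustration.

```python
import json
from typing import Any, Dict


def cluster_manifest(product: str, kubernetes_version: str, node_count: int) -> Dict[str, Any]:
    """Build a hypothetical Cluster custom resource for one product."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return {
        # Hypothetical API group and version, not the one used in the talk.
        "apiVersion": "platform.example.com/v1alpha1",
        "kind": "Cluster",
        "metadata": {"name": product},
        "spec": {
            "product": product,
            "kubernetesVersion": kubernetes_version,
            "nodeCount": node_count,
        },
    }


if __name__ == "__main__":
    # JSON is valid YAML, so this output could be applied as a manifest.
    print(json.dumps(cluster_manifest("checkout", "1.31", 3), indent=2))
```

Keeping the spec this small pushes the defaults into the operator, which fits the standardization principle mentioned earlier.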

Migration

Build trust in platform
Support with docs, onboarding, Q&A
Co-create with devs while keeping an eye on day 2 -> Feature-flag-based rollout
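
How the feature flags were implemented is not covered in the notes. One common way to roll a feature out gradually across many clusters is deterministic percentage bucketing, sketched below. The function names, flag name, and cluster names are hypothetical, and the hashing scheme is an assumption rather than the speakers' method.

```python
import hashlib


def rollout_bucket(flag: str, cluster: str) -> int:
    """Map a (flag, cluster) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag}:{cluster}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def is_enabled(flag: str, cluster: str, percent: int) -> bool:
    """Enable the flag for clusters whose bucket falls below the rollout percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return rollout_bucket(flag, cluster) < percent


if __name__ == "__main__":
    clusters = [f"product-{i}" for i in range(1000)]
    enabled = [c for c in clusters if is_enabled("new-ingress", c, 10)]
    print(f"{len(enabled)} of {len(clusters)} clusters have the flag enabled")
```

Because the bucket depends only on the flag and cluster name, raising the percentage only ever adds clusters, so a cluster that already received a feature keeps it as the rollout widens.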