2024-03-20 15:58:50 +00:00
---
title: Building a large scale multi-cloud multi-region SaaS platform with kubernetes controllers
weight: 8
tags:
  - platform
  - operator
  - scaling
---

{{% button href="https://youtu.be/VhloarnpxVo" style="warning" icon="video" %}}Watch talk on YouTube{{% /button %}}

> Interchangeable wording in this talk: controller == operator

A talk by Elastic.

## About Elastic

* Elastic Cloud as a managed service
* Deployed across AWS/GCP/Azure in over 50 regions
* 600,000+ containers

### Elastic and Kube

* They offer Elastic Observability
* They offer the ECK operator for simplified deployments

## The baseline

* Goal: A large-scale (1M+ containers), resilient platform on k8s
* Architecture
  * Global Control: The control plane (API) for users, with controllers
  * Regional Apps: The "shitload" of Kubernetes clusters where the actual customer services live

## Scalability

* Challenge: How large can our cluster be, how many clusters do we need
* Problem: Only basic guidelines exist for that
* Decision: Horizontally scale the number of clusters (500-1K nodes each)
* Decision: Disposable clusters
  * Throw away without data loss
  * Single source of truth is not the cluster etcd but external -> No etcd backups needed
  * Everything can be recreated at any time

## Controllers

{{% notice style="note" %}}
I won't copy the explanations of operators/controllers in these notes
{{% /notice %}}

* Many controllers, including (but not limited to)
  * Cluster controller: Register cluster to controller
  * Project controller: Schedule user's project to cluster
  * Product controllers (Elasticsearch, Kibana, etc.)
  * Ingress/Cert manager
* Sometimes controllers depend on controllers -> potential complexity
* Pro:
  * Resilient (Self-healing)
  * Level triggered (desired state vs procedure triggered)
  * Simple reasoning when comparing desired state vs state machine
  * Official controller runtime lib
  * Workqueue: Automatic dedup, retry back-off and so on
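
The "level triggered" point can be illustrated without any Kubernetes dependency. This is only a minimal sketch, not code from the talk: `State`, `reconcile` and the replica-count model are hypothetical names made up for the example. The function looks only at desired vs observed state and never at the event that triggered it, so a missed or duplicated event does not change the outcome.

```go
package main

import "fmt"

// State is a hypothetical, heavily simplified resource state:
// just a replica count (not a real Kubernetes type).
type State struct {
	Replicas int
}

// reconcile compares desired and observed state and returns the actions
// needed to converge. It does not care which event triggered the call.
func reconcile(desired, observed State) []string {
	var actions []string
	// Scale up: create the missing replicas (indices are zero-based).
	for i := observed.Replicas; i < desired.Replicas; i++ {
		actions = append(actions, fmt.Sprintf("create replica %d", i))
	}
	// Scale down: delete surplus replicas, highest index first.
	for i := observed.Replicas; i > desired.Replicas; i-- {
		actions = append(actions, fmt.Sprintf("delete replica %d", i-1))
	}
	return actions
}

func main() {
	desired := State{Replicas: 3}
	// One replica exists, two are missing.
	fmt.Println(reconcile(desired, State{Replicas: 1}))
	// Already converged: running reconcile again is a no-op.
	fmt.Println(reconcile(desired, State{Replicas: 3}))
}
```

Calling `reconcile` repeatedly on a converged state returns no actions, which is what makes blind retries and full resyncs safe.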

## Global Controllers

* Basic operation
  * Uses project config from Elastic Cloud as the desired state
  * The actual state is a k8s resource in another cluster
* Challenge: Where is the source of truth if the data is not stored in etcd
* Solution: External data store (Postgres)
* Challenge: How do we sync the db sources to Kubernetes
* Potential solutions: Replace etcd with the external db
* Chosen solution:
  * The controllers don't use CRDs for storage, but they expose a web-API
  * Reconciliation still happens, but it now interacts with the external db and Go channels (queue) instead
  * Then the CRs for the operators get created by the global controller
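
A rough sketch of how the chosen solution could look, under several assumptions: a plain map stands in for the external database (Postgres in the talk), another map stands in for the regional clusters' Kubernetes APIs, and all type names, field names and sample values (`Project`, `Store`, `Regional`, `aws-eu-1`, `8.12`) are hypothetical. The talk did not show the real implementation; this only shows the shape of "channel as queue, db as desired state, CR as output".

```go
package main

import "fmt"

// Project is a hypothetical desired-state record as it might live in the
// external database.
type Project struct {
	ID      string
	Cluster string // regional cluster the project is scheduled to
	Version string // desired product version
}

// Store stands in for the external source of truth (a map instead of a db).
type Store map[string]Project

// Regional stands in for the regional Kubernetes APIs:
// cluster name -> project ID -> spec of the created custom resource.
type Regional map[string]map[string]string

// reconcile reads the desired state from the store and makes sure a matching
// custom resource exists in the target cluster.
func reconcile(store Store, clusters Regional, id string) error {
	p, ok := store[id]
	if !ok {
		return fmt.Errorf("project %q not found in store", id)
	}
	if clusters[p.Cluster] == nil {
		clusters[p.Cluster] = map[string]string{}
	}
	clusters[p.Cluster][p.ID] = p.Version // create or update the CR
	return nil
}

func main() {
	store := Store{"p1": {ID: "p1", Cluster: "aws-eu-1", Version: "8.12"}}
	clusters := Regional{}

	// A buffered Go channel plays the role of the work queue.
	queue := make(chan string, 16)
	queue <- "p1"
	close(queue)

	for id := range queue {
		if err := reconcile(store, clusters, id); err != nil {
			fmt.Println("reconcile failed:", err)
		}
	}
	fmt.Println(clusters["aws-eu-1"]["p1"])
}
```

Because the regional cluster only ever receives derived state, it can be thrown away and refilled from the store, which matches the disposable-cluster decision above.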

### Large scale

* Problem: Reconcile gets triggered for all objects on restart -> This makes sure nothing gets missed and everything is processed by the latest controller version
* Idea: Just create more workers for 100K+ objects
* Problem: CPU go brrr and db gets overloaded
* Problem: If you create an item during restart, suddenly it is at the end of a 100K+ item work-queue

### Reconcile

* User-driven events are processed asap
* Reconcile of everything should happen, but with low priority, slowly in the background
* Solution: Status: LastReconciledRevision (timestamp) gets compared to the revision, if the revision is larger -> User change
* Prioritization: Just a custom event handler with the normal queue and a low priority one
* Queue: Just a queue that adds items to the normal work-queue with a rate limit

```mermaid
flowchart LR
    low-->rl(ratelimit)
    rl-->wq(work queue)
    wq-->controller
    high-->wq
```

## Related

* Argo for CI/CD
* Crossplane for cluster auto-provisioning