kubecon25/content/day1/03_operator-mistakes.md
Nicolai Ort 0e24bf4fd6
docs: Added youtube links
2025-05-07 07:07:48 +02:00


---
title: "Don't write controllers like Charlie Don't Does: Avoiding common Kubernetes controller mistakes"
weight: 3
tags:
- kubecon
- operator
---
{{% button href="https://youtu.be/tnSraS9JqZ8" style="warning" icon="video" %}}Watch talk on YouTube{{% /button %}}
{{% button href="https://static.sched.com/hosted_files/kccnceu2025/53/Don%27t%20write%20controllers%20like%20Charlie%20Don%27t%20does_%20avoiding%20common%20Kubernetes%20controller%20mistakes.pptx.pdf" style="tip" icon="person-chalkboard" %}}Slides{{% /button %}}
## Common mistakes
### Talking directly to the API server with a simple (uncached) client
- Problem: A
- Problem: Updates send the whole object, so no-op updates waste API server resources
- Fix: Use a cached client
- Problem: Cache invalidation
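
To illustrate why a cached client takes load off the API server, here is a minimal sketch in plain Go. `fakeAPIServer`, `cachedClient` and all of their methods are hypothetical stand-ins made up for this example; they are not the client-go or controller-runtime API. The sketch also assumes a manual `Sync` call, whereas a real cached client keeps itself current through watches.

```go
package main

import "sync"

// fakeAPIServer is a hypothetical stand-in for the API server that counts requests.
type fakeAPIServer struct {
	mu       sync.Mutex
	objects  map[string]string
	requests int
}

// Get simulates a direct read: every call costs one request.
func (s *fakeAPIServer) Get(name string) (string, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.requests++
	v, ok := s.objects[name]
	return v, ok
}

// List simulates a single list call that returns a copy of all objects.
func (s *fakeAPIServer) List() map[string]string {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.requests++
	out := make(map[string]string, len(s.objects))
	for k, v := range s.objects {
		out[k] = v
	}
	return out
}

// cachedClient serves reads from a local copy instead of the server.
type cachedClient struct {
	mu    sync.RWMutex
	store map[string]string
}

// Sync fills the cache with one list call (assumption: a real cache would
// follow up with a watch instead of being synced by hand).
func (c *cachedClient) Sync(s *fakeAPIServer) {
	snapshot := s.List()
	c.mu.Lock()
	defer c.mu.Unlock()
	c.store = snapshot
}

// Get reads from the local store and never touches the server.
func (c *cachedClient) Get(name string) (string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.store[name]
	return v, ok
}
```

With this sketch, one hundred direct `Get` calls cost one hundred requests, while one hundred reads through the cache cost a single list call.
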
### Don't use custom caching
- Problem: Good luck dealing with concurrency
- Hard: Controllers must maintain a per-kind cache
- Problem: Eventual consistency makes everything more complicated
- Fix: Use a framework
### Predicates only apply to the current watch
- A predicate passed to `For(...)` only applies to that call, not to the other watches
- Also check whether you should reconcile the low-level object itself or whether reconciling the higher-level objects that reference it is the better choice
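
The per-watch scope of predicates can be modeled in a few lines of plain Go. This is only a sketch of the idea: `event`, `watch`, `generationChanged` and `shouldEnqueue` are hypothetical names invented here, not controller-runtime types, and the generation check is only loosely modeled on what a generation-based predicate does.

```go
package main

// event is a hypothetical, simplified change notification.
type event struct {
	Kind          string
	OldGeneration int64
	NewGeneration int64
}

// predicate decides whether an event should trigger a reconcile.
type predicate func(event) bool

// watch pairs one watched kind with the predicates registered for that watch only.
type watch struct {
	Kind       string
	Predicates []predicate
}

// generationChanged drops events where the generation did not change.
func generationChanged(e event) bool {
	return e.NewGeneration != e.OldGeneration
}

// shouldEnqueue applies only the predicates of the watch that matches the
// event's kind; predicates of other watches are never consulted.
func shouldEnqueue(watches []watch, e event) bool {
	for _, w := range watches {
		if w.Kind != e.Kind {
			continue
		}
		for _, p := range w.Predicates {
			if !p(e) {
				return false
			}
		}
		return true
	}
	return false // no watch registered for this kind
}
```

If the primary kind is registered with `generationChanged` and a secondary kind is registered without predicates, a no-op event on the primary kind is dropped while the same no-op event on the secondary kind still triggers a reconcile. In controller-runtime the corresponding choice is, as far as I know, between passing `builder.WithPredicates(...)` to a single `For`/`Owns`/`Watches` call and using `WithEventFilter(...)` on the whole builder.
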
## Tools
### KRT
> Still under development
- Operations on collections (Kubernetes objects with state tracking)
- Fetch function that handles transformation
### StateDB
- In-memory database for Go with watch channels
- You can set up a table that stores all objects of a kind (provided by the client)
- Triggers hooks when changes happen in the database that you can react to
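
The idea of a table with watch channels can be sketched with the standard library alone. The `table` type and its methods below are hypothetical and are not the StateDB API; the sketch only assumes the general pattern of handing readers a channel that is closed when the data changes.

```go
package main

import "sync"

// table is a hypothetical in-memory table; it is NOT the StateDB API.
type table struct {
	mu      sync.Mutex
	rows    map[string]string
	changed chan struct{} // closed on the next write
}

func newTable() *table {
	return &table{rows: map[string]string{}, changed: make(chan struct{})}
}

// Get returns the row plus a channel that is closed on the next write to the table.
func (t *table) Get(key string) (string, bool, <-chan struct{}) {
	t.mu.Lock()
	defer t.mu.Unlock()
	v, ok := t.rows[key]
	return v, ok, t.changed
}

// Insert stores the row and wakes up every reader holding the previous watch channel.
func (t *table) Insert(key, value string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.rows[key] = value
	close(t.changed)
	t.changed = make(chan struct{})
}
```

A reader calls `Get`, acts on the result, then blocks on the returned channel until something changes and queries again. Closing a channel wakes all waiting readers at once, which is why no per-subscriber bookkeeping is needed in this sketch.
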
### Controller-Runtime
> The kubebuilder one
- Includes a cached client
- Works on the reconciler pattern -> Makes triggers simple
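
One reason the reconciler pattern keeps triggers simple is that events are reduced to object keys and duplicate keys are coalesced before the reconciler runs. The following is a minimal, single-goroutine sketch of such a deduplicating queue. `request` and `dedupQueue` are hypothetical names for this example; the real work queue used by controller-runtime additionally handles concurrency, rate limiting and retries.

```go
package main

// request identifies an object to reconcile; it carries no event details.
type request struct {
	Namespace string
	Name      string
}

// dedupQueue is a hypothetical sketch of a coalescing work queue.
// It is not safe for concurrent use.
type dedupQueue struct {
	pending map[request]bool
	order   []request
}

func newDedupQueue() *dedupQueue {
	return &dedupQueue{pending: map[request]bool{}}
}

// Add enqueues a request unless the same key is already waiting.
func (q *dedupQueue) Add(r request) {
	if q.pending[r] {
		return
	}
	q.pending[r] = true
	q.order = append(q.order, r)
}

// Pop returns the oldest waiting request, or false if the queue is empty.
func (q *dedupQueue) Pop() (request, bool) {
	if len(q.order) == 0 {
		return request{}, false
	}
	r := q.order[0]
	q.order = q.order[1:]
	delete(q.pending, r)
	return r, true
}
```

Three quick events for the same object result in one reconcile in this sketch. The reconciler therefore cannot rely on seeing every event and has to read the current state each time it runs.
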
## Tips
- Limit the number of API server updates
- Check for a diff yourself and don't send an update if nothing changed
- Use patch instead of update and send only the changed fields -> Especially for `.status`
- Use a framework that handles watching, coalescing and caching (krt, statedb, controller-runtime)
- Use predicates if you're using controller-runtime; they help you filter out no-op events by checking them against the cache and your filters
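
The "check for a diff, then patch only what changed" tip can be sketched as follows. `statusPatch` is a hypothetical helper written for this example and works on plain maps rather than typed Kubernetes objects. It assumes a JSON merge patch (RFC 7396) and compares only top-level status keys, so a changed nested value is sent as a whole.

```go
package main

import (
	"encoding/json"
	"reflect"
)

// statusPatch returns a JSON merge patch containing only the status fields
// that differ between current and desired, or nil if nothing changed.
func statusPatch(current, desired map[string]any) ([]byte, error) {
	changed := map[string]any{}
	for k, want := range desired {
		if have, ok := current[k]; !ok || !reflect.DeepEqual(have, want) {
			changed[k] = want
		}
	}
	// In a JSON merge patch, a null value removes the key.
	for k := range current {
		if _, ok := desired[k]; !ok {
			changed[k] = nil
		}
	}
	if len(changed) == 0 {
		return nil, nil // no-op: the caller should skip the API call entirely
	}
	return json.Marshal(map[string]any{"status": changed})
}
```

A caller would send the patch only when the returned slice is non-nil. With controller-runtime, a similar result can usually be achieved by building the patch with `client.MergeFrom` from a copy of the unmodified object, though the skip-if-empty check in this sketch is my own addition.
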
## Q&A
- Do you know where your reconciliations are coming from?
  - Counts: Yes, the frameworks provide metrics and you can implement your own
  - But controller-runtime abstracts away the patch source, so you would have to compare the before and after state yourself, which you should not do
- What about state sharing across multiple threads?
  - controller-runtime treats each reconcile as idempotent, so you can just multithread
  - But consistency can still be hard, because you have to design all of your operations to be idempotent by rebuilding the state each time
- What are your thoughts on controllers that do stuff in the real world (especially because it takes longer and there are no native observers)?
  - Do something like the krt project and keep the state separately
- What if someone changes things at the cloud provider?
  - A question of philosophy -> Usually just treat the operator as the source of truth
- How do you test your operators?
  - Depends on your output (Kubernetes objects make stuff simple)
  - For Cilium: Simple because it's just creating Kubernetes objects
  - With outside interaction: In-memory state representation or mocking
  - For complex controllers, split the operator into: ingestion, data model and transformation
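
The split into ingestion, data model and transformation mentioned in the last answer can be sketched like this. All names (`endpoint`, `model`, `ingest`, `transform`) and the rule format are hypothetical and chosen only for illustration. The point is that the transformation becomes a pure function that can be tested without a cluster, and that rebuilding the state from scratch on every run yields the same output regardless of event order.

```go
package main

import (
	"sort"
	"strings"
)

// endpoint is a hypothetical raw observation, as it would arrive from a watch.
type endpoint struct {
	Service string
	IP      string
}

// model is the internal data model, decoupled from any Kubernetes types.
type model struct {
	Backends map[string][]string // service name -> backend IPs
}

// ingest rebuilds the whole model from the observed inputs on every run.
func ingest(observed []endpoint) model {
	m := model{Backends: map[string][]string{}}
	for _, e := range observed {
		m.Backends[e.Service] = append(m.Backends[e.Service], e.IP)
	}
	return m
}

// transform is a pure function from the model to the desired output.
// Sorting makes the result independent of input and map iteration order.
func transform(m model) []string {
	var rules []string
	for svc, ips := range m.Backends {
		sorted := append([]string(nil), ips...)
		sort.Strings(sorted)
		rules = append(rules, svc+" -> "+strings.Join(sorted, ","))
	}
	sort.Strings(rules)
	return rules
}
```

A test can feed the same endpoints in different orders and assert that the resulting rules are identical, which is the idempotency property discussed above.
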