docs(day-2): Chaos Engineering talk notes

2026-03-21 15:30:57 +01:00
parent 8ce0ccda0d
commit cdf7163a27
2 changed files with 51 additions and 0 deletions

@@ -0,0 +1,50 @@
---
title: "How Chaos-Engineering works: Implementing Failure Injection on Kubernetes with Rust"
weight: 7
tags:
- rejekts
- chaos
- rust
- operator
---
<!-- {{% button href="https://youtu.be/rkteV6Mzjfs" style="warning" icon="video" %}}Watch talk on YouTube{{% /button %}} -->
<!-- {{% button href="https://docs.google.com/presentation/d/1nEK0CVC_yQgIDqwsdh-PRihB6dc9RyT-" style="tip" icon="person-chalkboard" %}}Slides{{% /button %}} -->
{{% button href="https://github.com/ioboi/kerris" style="info" icon="code" %}}Code/Demo{{% /button %}}
A general introduction to chaos engineering that specifically shows the implementations by Chaos Mesh and Litmus.
The talk then moves on to the implementation of a custom chaos operator written in Rust.
## Chaos Engineering
> Chaos Engineering is the discipline of experimenting on a system in order to build confidence in the system's capability to withstand turbulent conditions in production.
> ~ Chaos Community (2015)
- CNCF-Projects: **Chaos Mesh**, **Litmus**, Chaos Blade, Krkn
- Types: **Pod Chaos** (delete/terminate), **Pod Network Chaos** (faults/packet loss), Node Chaos, JVM Chaos, Infra Chaos (reboot VMs), ...
## APIs
- Chaos Mesh: a specific CRD per chaos type (PodChaos, NetworkChaos, ...)
- Litmus: a Chaos Engine config that defines the type in its spec
TODO: Steal sample CRDs
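
To make the difference between the two API styles concrete, here is a minimal Rust sketch of the two shapes. It is not a sample CRD from either project: all struct and field names (`Selector`, `loss_percent`, `experiment_names`, ...) are hypothetical and heavily simplified, and only the general idea (one kind per chaos type vs. one engine kind that names the type in its spec) comes from the notes above.

```rust
/// Which pods to target (hypothetical, heavily simplified).
#[derive(Debug, Clone, PartialEq)]
pub struct Selector {
    pub namespace: String,
    pub label: String,
}

// Style 1 (Chaos Mesh-like): one resource kind per chaos type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PodAction {
    Kill,
    Failure,
}

#[derive(Debug, Clone, PartialEq)]
pub struct PodChaos {
    pub selector: Selector,
    pub action: PodAction,
}

#[derive(Debug, Clone, PartialEq)]
pub struct NetworkChaos {
    pub selector: Selector,
    pub loss_percent: u8,
}

// Style 2 (Litmus-like): a single engine kind, the chaos type lives in the spec.
#[derive(Debug, Clone, PartialEq)]
pub enum Experiment {
    PodDelete,
    NetworkLoss { loss_percent: u8 },
}

#[derive(Debug, Clone, PartialEq)]
pub struct ChaosEngine {
    pub selector: Selector,
    pub experiments: Vec<Experiment>,
}

impl ChaosEngine {
    /// Names of the experiments this engine would run, in declaration order.
    pub fn experiment_names(&self) -> Vec<&'static str> {
        self.experiments
            .iter()
            .map(|e| match e {
                Experiment::PodDelete => "pod-delete",
                Experiment::NetworkLoss { .. } => "pod-network-loss",
            })
            .collect()
    }
}
```

With style 1 a controller watches several kinds and each kind gets its own schema. With style 2 it watches one kind and dispatches on the experiment listed inside the spec.
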
## DIY
- Baseline: Written as a controller in Rust (out of curiosity) with support for Pod Chaos and Network Chaos
- The entrypoint is a reconcile function that returns an action (requeue, etc.) and an error
- Network Chaos uses traffic control (`tc`, part of `iproute2`) to do the limiting and loss
- If you're interested in the Rust implementation: look at the code linked above
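
The reconcile entrypoint described above can be sketched roughly like this. This is a std-only illustration, not the code from the linked repository: `Action`, `ReconcileError`, `PodChaos` and its fields are hypothetical stand-ins, loosely modeled on the "action or error" shape that Rust controller runtimes such as the `kube` crate use (whether the talk's code uses that crate is an assumption).

```rust
use std::time::Duration;

/// What the controller runtime should do after a reconcile pass.
#[derive(Debug, PartialEq)]
pub enum Action {
    /// Run reconcile again after the given delay.
    Requeue(Duration),
    /// Do nothing until the watched resource changes.
    AwaitChange,
}

/// Errors a reconcile pass can report (hypothetical).
#[derive(Debug, PartialEq)]
pub enum ReconcileError {
    InvalidSpec(String),
}

/// Hypothetical, simplified Pod Chaos resource.
pub struct PodChaos {
    /// Label selector for the pods to disturb, e.g. "app=demo".
    pub target_label: String,
    /// If set, repeat the experiment at this interval.
    pub interval: Option<Duration>,
}

/// One reconcile pass: validate the spec, act, then tell the runtime what to do next.
pub fn reconcile(chaos: &PodChaos) -> Result<Action, ReconcileError> {
    if chaos.target_label.is_empty() {
        return Err(ReconcileError::InvalidSpec(
            "target_label must not be empty".to_string(),
        ));
    }

    // A real controller would delete or terminate a matching pod here.

    match chaos.interval {
        Some(every) => Ok(Action::Requeue(every)),
        None => Ok(Action::AwaitChange),
    }
}
```

Returning the next action instead of looping inside the function keeps reconcile stateless: the runtime decides when to call it again, and an `Err` can be retried by the runtime's error policy.
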
```mermaid
graph LR
controller
subgraph node
daemon
containerd
daemon-->|grpc|containerd
end
controller-->daemon
```
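
For the Network Chaos part, the `tc` usage mentioned in the list above could look roughly like the following sketch. It only builds the argument list for a `netem` qdisc. `NetworkFault`, its fields, and both helper functions are hypothetical, and using `netem` for both loss and rate limiting is an assumption, since the notes only say that `tc` is used.

```rust
/// Hypothetical description of a network fault to inject on one interface.
pub struct NetworkFault {
    pub interface: String,
    /// Packet loss in percent (values above 100 are clamped to 100).
    pub loss_percent: Option<u8>,
    /// Bandwidth limit in kbit/s.
    pub rate_kbit: Option<u32>,
}

/// Arguments for `tc` that attach a netem qdisc with the requested faults.
pub fn netem_add_args(fault: &NetworkFault) -> Vec<String> {
    let mut args: Vec<String> = vec![
        "qdisc".to_string(),
        "add".to_string(),
        "dev".to_string(),
        fault.interface.clone(),
        "root".to_string(),
        "netem".to_string(),
    ];
    if let Some(loss) = fault.loss_percent {
        args.push("loss".to_string());
        args.push(format!("{}%", loss.min(100)));
    }
    if let Some(rate) = fault.rate_kbit {
        args.push("rate".to_string());
        args.push(format!("{}kbit", rate));
    }
    args
}

/// Arguments for `tc` that remove the root qdisc again (cleanup after the experiment).
pub fn netem_del_args(interface: &str) -> Vec<String> {
    vec![
        "qdisc".to_string(),
        "del".to_string(),
        "dev".to_string(),
        interface.to_string(),
        "root".to_string(),
    ]
}
```

The resulting vector can be passed to `std::process::Command::new("tc").args(...)`. Running it needs `CAP_NET_ADMIN` and has to happen inside the network namespace of the target pod, which is presumably why the diagram puts a daemon on every node.
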

@@ -17,6 +17,7 @@ I have to admit that I'm very bad with names and don't always recognize people b
## Talk recommendations
- If you're building operators: [Solving Operator Extensibility: A gRPC Plugin Framework for Kubernetes](./04_operator-estensibility)
- [Intro to both chaos engineering and building operators that interact with containerd in Rust](./07-chaosengineering)
- The idea behind [The self-improving platform: Closing the Loop Between Telemetry and Tuning](./05_selvimproving) is very interesting, but the first half of the talk is kinda confusing because it discusses a study that could have been shortened drastically. The way they automatically create PRs for resource utilization is cool, though.
## Other stuff I learned or people I talked to