From cdf7163a270f3bbf3739b42afe724f005f24b033 Mon Sep 17 00:00:00 2001
From: Nicolai Ort
Date: Sat, 21 Mar 2026 15:30:57 +0100
Subject: [PATCH] docs(day-2): Chaos Engineering talk notes

---
 content/day-2/07-chaosengineering.md | 50 ++++++++++++++++++++++++++++
 content/day-2/_index.md              |  1 +
 2 files changed, 51 insertions(+)
 create mode 100644 content/day-2/07-chaosengineering.md

diff --git a/content/day-2/07-chaosengineering.md b/content/day-2/07-chaosengineering.md
new file mode 100644
index 0000000..f6baf55
--- /dev/null
+++ b/content/day-2/07-chaosengineering.md
@@ -0,0 +1,50 @@
+---
+title: "How Chaos-Engineering works: Implementing Failure Injection on Kubernetes with Rust"
+weight: 7
+tags:
+  - rejekts
+  - chaos
+  - rust
+  - operator
+---
+
+
+{{% button href="https://github.com/ioboi/kerris" style="info" icon="code" %}}Code/Demo{{% /button %}}
+
+A general introduction to chaos engineering, looking specifically at the implementations by Chaos Mesh and Litmus.
+After that, the talk moves on to the implementation of a custom chaos operator written in Rust.
+
+## Chaos Engineering
+
+> Chaos Engineering is the discipline of experimenting on a system in order to build confidence in the system's capability to withstand turbulent conditions in production.
+> ~ Chaos Community (2015)
+
+- CNCF projects: **Chaos Mesh**, **Litmus**, ChaosBlade, Krkn
+- Types: **Pod Chaos** (delete/terminate), **Pod Network Chaos** (faults/packet loss), Node Chaos, JVM Chaos, Infra Chaos (reboot VMs), ...
+
+## APIs
+
+- Chaos Mesh: A specific CRD per chaos type (`PodChaos`, `NetworkChaos`, ...)
+- Litmus: A Chaos Engine config that defines the type in its spec
+
+TODO: Steal sample CRDs
+
+## DIY
+
+- Baseline: Written as a controller in Rust (out of curiosity) with support for Pod Chaos and Network Chaos
+- The entrypoint is a reconcile function that returns an action (requeue, etc.) and an error
+- Network Chaos uses traffic control (`tc`, part of `iproute2`) to do the rate limiting and packet loss
+- If you're interested in the Rust implementation: Look at the code linked above
+
+```mermaid
+graph LR
+    controller
+    subgraph node
+        daemon
+        containerd
+
+        daemon-->|grpc|containerd
+    end
+    controller-->daemon
+```
\ No newline at end of file
diff --git a/content/day-2/_index.md b/content/day-2/_index.md
index 3e9a5f1..754e834 100644
--- a/content/day-2/_index.md
+++ b/content/day-2/_index.md
@@ -17,6 +17,7 @@ I have to admit that I'm very bad with names and don't always regocnize people b
 ## Talk recommendations
 
 - If you're building operators: [Solving Operator Extensibility: A gRPC Plugin Framework for kubernetes](./04_operator-estensibility)
+- [Intro to both chaos engineering and building operators that interact with containerd in Rust](./07-chaosengineering)
 - The idea behind [The self-improving platform: Closing the Loop Between Telemetry and Tuning](./05_selvimproving) is very interesting but the first half of the talk is kinda confusing as it discusses a study that could have been shortened drasticly. But the way they automaticly create PRs for resource utilizations is cool
 
 ## Other stuff I learned or people i talk to
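
Note on the DIY section of the patch: the notes describe a reconcile entrypoint that returns an action or an error, and network chaos built on `tc`. The following is a minimal, std-only sketch of how those two ideas could fit together. Everything in it is an assumption made for illustration: `NetworkChaosSpec`, `Action`, `ReconcileError`, `netem_args` and the `already_applied` flag are hypothetical names, not types from the linked repository, and the choice of the `netem` qdisc is a guess (the notes only mention `tc`).

```rust
use std::time::Duration;

/// Hypothetical spec of a network chaos experiment (illustrative only).
#[derive(Debug, Clone, PartialEq)]
pub struct NetworkChaosSpec {
    pub interface: String,
    pub loss_percent: Option<u8>,
    pub rate_kbit: Option<u32>,
    pub duration: Duration,
}

/// Stand-in for the "action" a reconcile pass hands back to the controller.
#[derive(Debug, PartialEq)]
pub enum Action {
    /// Run reconcile again after the given delay.
    Requeue(Duration),
    /// Do nothing until the resource changes.
    AwaitChange,
}

#[derive(Debug, PartialEq)]
pub enum ReconcileError {
    InvalidSpec(String),
}

/// Builds the arguments for a `tc qdisc add ... netem` call.
/// Assumption: the faults are expressed through the netem qdisc.
pub fn netem_args(spec: &NetworkChaosSpec) -> Result<Vec<String>, ReconcileError> {
    if spec.loss_percent.is_none() && spec.rate_kbit.is_none() {
        return Err(ReconcileError::InvalidSpec("no fault configured".into()));
    }
    let mut args: Vec<String> = ["qdisc", "add", "dev", spec.interface.as_str(), "root", "netem"]
        .iter()
        .map(|s| s.to_string())
        .collect();
    if let Some(loss) = spec.loss_percent {
        if loss > 100 {
            return Err(ReconcileError::InvalidSpec("loss must be 0-100".into()));
        }
        args.push("loss".into());
        args.push(format!("{}%", loss));
    }
    if let Some(rate) = spec.rate_kbit {
        args.push("rate".into());
        args.push(format!("{}kbit", rate));
    }
    Ok(args)
}

/// One reconcile pass: validate the spec, then decide what happens next.
/// A real controller would send the arguments to the node daemon here.
pub fn reconcile(spec: &NetworkChaosSpec, already_applied: bool) -> Result<Action, ReconcileError> {
    let _args = netem_args(spec)?;
    if already_applied {
        Ok(Action::AwaitChange)
    } else {
        // Come back once the experiment duration has elapsed to clean up.
        Ok(Action::Requeue(spec.duration))
    }
}

fn main() {
    let spec = NetworkChaosSpec {
        interface: "eth0".to_string(),
        loss_percent: Some(10),
        rate_kbit: Some(1000),
        duration: Duration::from_secs(30),
    };
    match reconcile(&spec, false) {
        Ok(action) => println!("tc {} -> {:?}", netem_args(&spec).unwrap().join(" "), action),
        Err(err) => eprintln!("reconcile failed: {:?}", err),
    }
}
```

Keeping the argument builder separate from `reconcile` means the `tc` invocation can be checked without a cluster or root privileges; only the step that actually executes the command needs to run on the node.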