---
title: "How Chaos-Engineering works: Implementing Failure Injection on Kubernetes with Rust"
weight: 7
tags:
- rejekts
- chaos
- rust
- operator
---
<!-- {{% button href="https://youtu.be/rkteV6Mzjfs" style="warning" icon="video" %}}Watch talk on YouTube{{% /button %}} -->
<!-- {{% button href="https://docs.google.com/presentation/d/1nEK0CVC_yQgIDqwsdh-PRihB6dc9RyT-" style="tip" icon="person-chalkboard" %}}Slides{{% /button %}} -->
{{% button href="https://github.com/ioboi/kerris" style="info" icon="code" %}}Code/Demo{{% /button %}}
A general introduction to chaos engineering, looking specifically at the implementations in Chaos Mesh and Litmus.
The talk then walks through the implementation of a custom chaos operator written in Rust.
## Chaos Engineering
> Chaos Engineering is the discipline of experimenting on a system in order to build confidence in the system's capability to withstand turbulent conditions in production.
> ~ Chaos Community (2015)
- CNCF-Projects: **Chaos Mesh**, **Litmus**, Chaos Blade, Krkn
- Types: **Pod Chaos** (Delete/Terminate), **Pod Network Chaos** (Faults/Packet loss), Node Chaos, JVM Chaos, Infra Chaos (Reboot VMs), ...
## APIs
- Chaos Mesh: a specific CRD per chaos type (`PodChaos`, `NetworkChaos`, ...)
- Litmus: a single `ChaosEngine` config that defines the chaos type in its spec
TODO: Steal sample CRDs
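
The difference between the two API styles can be sketched with plain Rust types. This is a simplified, hypothetical model for illustration only, not the real CRD schemas: the names `ChaosMeshResource`, `describe_mesh` and `describe_engine` are made up, and the action and experiment strings are example values.

```rust
// Hypothetical, simplified model of the two API styles.
// These types are NOT the real Chaos Mesh or Litmus schemas.

/// Chaos Mesh style: the resource kind itself names the chaos type.
#[derive(Debug)]
enum ChaosMeshResource {
    PodChaos { action: String },
    NetworkChaos { action: String },
}

/// Litmus style: one engine resource, the chaos type is named inside the spec.
#[derive(Debug)]
struct ChaosEngine {
    experiments: Vec<String>,
}

/// Summarise a Chaos Mesh style resource: kind and action are separate.
fn describe_mesh(resource: &ChaosMeshResource) -> String {
    match resource {
        ChaosMeshResource::PodChaos { action } => {
            format!("kind=PodChaos action={action}")
        }
        ChaosMeshResource::NetworkChaos { action } => {
            format!("kind=NetworkChaos action={action}")
        }
    }
}

/// Summarise a Litmus style resource: the kind is always the engine.
fn describe_engine(engine: &ChaosEngine) -> String {
    format!("kind=ChaosEngine experiments={}", engine.experiments.join(","))
}

fn main() {
    let mesh = ChaosMeshResource::NetworkChaos { action: "loss".to_string() };
    let engine = ChaosEngine { experiments: vec!["pod-delete".to_string()] };
    println!("{}", describe_mesh(&mesh));
    println!("{}", describe_engine(&engine));
}
```

With one kind per chaos type, each kind can carry its own typed fields. With a single engine kind, the controller has to dispatch on a value inside the spec.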
## DIY
- Baseline: written as a controller in Rust (out of curiosity) with support for Pod Chaos and Network Chaos
- The entrypoint is a reconcile function that returns either an action (requeue, etc.) or an error
- Network Chaos uses traffic control (`tc`, part of `iproute2`) to apply rate limiting and packet loss
- If you're interested in the Rust implementation, look at the code linked above
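
The shape of such a reconcile entrypoint can be sketched using only the standard library. Everything here is an assumption for illustration: `Action`, `ReconcileError` and `PodChaos` are hypothetical stand-ins for what a controller library would provide, and this is not taken from the linked code.

```rust
use std::time::Duration;

// Hypothetical stand-ins for the types a controller library would provide.
#[derive(Debug, PartialEq)]
enum Action {
    /// Run reconcile again after the given delay.
    Requeue(Duration),
    /// Do nothing until the resource changes.
    AwaitChange,
}

#[derive(Debug, PartialEq)]
enum ReconcileError {
    /// The selector matched no pods.
    NoTargets,
}

/// Hypothetical, heavily reduced view of a Pod Chaos resource.
struct PodChaos {
    targets: Vec<String>,
    done: bool,
}

/// Decide what to do for one resource: an action on success, an error otherwise.
fn reconcile(chaos: &PodChaos) -> Result<Action, ReconcileError> {
    if chaos.done {
        // Experiment already finished, nothing left to schedule.
        return Ok(Action::AwaitChange);
    }
    if chaos.targets.is_empty() {
        return Err(ReconcileError::NoTargets);
    }
    // A real controller would delete or terminate the target pods here.
    Ok(Action::Requeue(Duration::from_secs(30)))
}

fn main() {
    let chaos = PodChaos { targets: vec!["demo-0".to_string()], done: false };
    println!("{:?}", reconcile(&chaos));
}
```

Returning a `Result` keeps the two outcomes apart: the caller can apply its retry policy to errors and treat actions as normal scheduling hints.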
```mermaid
graph LR
controller
subgraph node
daemon
containerd
daemon-->|grpc|containerd
end
controller-->daemon
```
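
For the network chaos part, the notes mention `tc` from `iproute2`. A minimal sketch of how a component might assemble such a command line, assuming the `netem` queueing discipline is used for loss and rate limiting (the notes do not say which qdisc the operator uses). The helper `netem_args` and the interface name `eth0` are hypothetical, and the sketch only prints the command instead of running it, since `tc` needs elevated privileges.

```rust
/// Build the arguments for a `tc qdisc add ... netem` call (hypothetical helper).
/// `loss_percent` is a packet loss percentage, `rate` a tc rate string like "1mbit".
fn netem_args(iface: &str, loss_percent: Option<u8>, rate: Option<&str>) -> Vec<String> {
    let mut args: Vec<String> = ["qdisc", "add", "dev", iface, "root", "netem"]
        .iter()
        .map(|s| s.to_string())
        .collect();
    if let Some(loss) = loss_percent {
        args.push("loss".to_string());
        args.push(format!("{loss}%"));
    }
    if let Some(rate) = rate {
        args.push("rate".to_string());
        args.push(rate.to_string());
    }
    args
}

fn main() {
    // Assumed example values: 10% packet loss, limited to 1 Mbit/s on eth0.
    let args = netem_args("eth0", Some(10), Some("1mbit"));
    // Print instead of executing; running it would go through std::process::Command.
    println!("tc {}", args.join(" "));
}
```

Building the argument list as a separate pure function keeps the fault parameters testable without touching the network stack of the node.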