From 6931da118cbe1d8d2572fb39e5d95951c0384246 Mon Sep 17 00:00:00 2001
From: Nicolai Ort
Date: Sun, 30 Mar 2025 15:35:08 +0200
Subject: [PATCH] docs(day-2): New talk
---
 content/day-2/07_pushing-limits.md | 80 ++++++++++++++++++++++++++++++
 1 file changed, 80 insertions(+)
 create mode 100644 content/day-2/07_pushing-limits.md

diff --git a/content/day-2/07_pushing-limits.md b/content/day-2/07_pushing-limits.md
new file mode 100644
index 0000000..3a90203
--- /dev/null
+++ b/content/day-2/07_pushing-limits.md

---
title: "From us to ms: Pushing Kubernetes Workloads to the Limit"
weight: 7
tags:
  - rejekts
  - performance
---

There were more details in the talk than I copied into these notes.
Most of them were either too much to write down or application-specific.

## Why?

- We need it (product requirements)
- Cost efficiency

## Cross-Provider Networking

- Throughput:
  - Same-zone: 200 GB/s
  - Cross-zone: 5-10% penalty
- Latency:
  - Same-zone P99: 0.95 ms
  - Cross-zone P99: 1.95 ms
- Result: encourage services to always route within the same zone if possible
- How:
  - Topology-Aware Routing (older, a bit buggy)
  - `trafficDistribution: PreferClose`: routes to the same zone if possible (needs CNI support)
  - Set up the stack once in each zone
- Measurements: Kubezonnet can detect cross-zone traffic

## Disk latency

- Baseline: 660 MiB/s per SSD, i.e. roughly one SSD per 5 Gbit/s of networking
- Example: 100 Gbit/s needs a RAID0 across a bunch of SSDs

```mermaid
graph LR
  Querier-->|125ms|Cache
  Cache-->|200ms|S3
  Cache<-->SSD
```

## Memory management

- Garbage collection takes time and is a throughput-for-latency trade-off
- Idea: avoid allocations
  - Preallocate (e.g. arenas)
  - Allocation reuse (e.g.
in gRPC)
  - "Allocation schemes" (e.g. thread-per-core)
- Avoid memory pressure by:
  - Using GC-friendly types
  - Tuning your GC
- Idea: implement your own optimized data structures

## Optimization in Kubernetes

### Defaults

- Best effort
- No protection from consuming all node memory
- Critical services could get scheduled on the same node

### Requests and limits

- Requests: needed to be scheduled
- Limits: the pod is killed if it exceeds them
- Problem: enforcement is reactive; the kubelet only checks pods on a periodic interval (configurable via an API flag, but with a minimum)
- Downward API: you can reference the limits in your application (e.g. to let the app trigger GC before the pod gets killed)

### Taints and tolerations

- Pin your workload based on labels and annotations

### Static CPU manager

- Request a whole number of CPUs -> you get those cores guaranteed
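The `trafficDistribution: PreferClose` setting above lives on the Service spec. A minimal sketch (the `query-cache` name, selector, and port are placeholders invented for illustration):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: query-cache   # hypothetical service name
spec:
  selector:
    app: query-cache
  ports:
    - port: 6379
  # Prefer endpoints in the client's own zone when healthy ones exist,
  # falling back to other zones otherwise.
  trafficDistribution: PreferClose
```

Unlike the older Topology-Aware Routing annotation, this is a first-class field, but the dataplane (kube-proxy or a replacement) still has to support it.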
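The Downward API point can be sketched as a pod that feeds its own memory limit to the application, here via Go's `GOMEMLIMIT` so the runtime collects more aggressively before the kernel OOM-kills the container (pod name and image are placeholders; `resourceFieldRef` with the default divisor yields the limit in bytes, which `GOMEMLIMIT` accepts):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gc-aware-app   # hypothetical pod name
spec:
  containers:
    - name: app
      image: example.com/app:latest   # placeholder image
      resources:
        limits:
          memory: "512Mi"
      env:
        # Expose the container's own memory limit to the app,
        # so the Go runtime stays below it instead of getting killed.
        - name: GOMEMLIMIT
          valueFrom:
            resourceFieldRef:
              resource: limits.memory
```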
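For the static CPU manager, the kubelet must run with `--cpu-manager-policy=static`, and the pod must be in the Guaranteed QoS class (requests equal to limits) with an integer CPU count. A sketch of the container resources (values are examples):

```yaml
resources:
  requests:
    cpu: "4"        # whole number -> eligible for exclusive cores
    memory: "8Gi"
  limits:
    cpu: "4"        # must equal the request for Guaranteed QoS
    memory: "8Gi"
```

A fractional request like `cpu: "3.5"` would keep the pod on the shared pool even under the static policy.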
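The "allocation reuse" idea from the memory-management section is what Go's standard-library `sync.Pool` provides: objects are returned to a pool instead of becoming garbage. A minimal sketch (the `formatRecord` helper is invented for illustration):

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable *bytes.Buffer values, so a hot path
// does not allocate a fresh buffer (and GC work) per call.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// formatRecord renders a key/value pair using a pooled buffer.
func formatRecord(key string, value int) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()            // pooled buffers may still hold old data
	defer bufPool.Put(buf) // return the buffer for reuse
	fmt.Fprintf(buf, "%s=%d", key, value)
	return buf.String()
}

func main() {
	fmt.Println(formatRecord("latency_ms", 2))
}
```

Note the trade-off the talk mentions: pooling cuts allocation pressure but adds lifecycle rules (always `Reset` before use, never keep a reference after `Put`).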