---
title: "From us to ms: Pushing Kubernetes Workloads to the Limit"
weight: 7
tags:
  - rejekts
  - performance
---

{{% button href="https://www.youtube.com/watch?v=EYipC5y-8rM" style="warning" icon="video" %}}Watch talk on YouTube{{% /button %}}

The talk contained more details than I copied into these notes; most of them were either too much to write down or too application-specific.

## Why?

- We need it (product requirements)
- Cost efficiency

## Cross Provider Networking

- Throughput:
  - Same-zone: 200 GB/s
  - Cross-zone: 5-10% penalty
- Latency:
  - Same-zone P99: 0.95 ms
  - Cross-zone P99: 1.95 ms
- Result: encourage services to always route within the same zone if possible
- How:
  - Topology-Aware Routing (older, a bit buggy)
  - `trafficDistribution: PreferClose`: routes to the same zone if possible (needs CNI support); see the Service sketch at the end of these notes
  - Set up one full stack per zone
- Measurements: Kubezonnet can detect cross-zone traffic

## Disk latency

- Baseline: ~660 MiB/s per SSD, i.e. roughly one SSD per 5 Gbit/s of networking
- Example: saturating 100 Gbit/s needs a RAID 0 across a bunch of SSDs (roughly 20 at the baseline rate above)

```mermaid
graph LR
    Querier-->|125ms|Cache
    Cache-->|200ms|S3
    Cache<-->SSD
```

## Memory management

- Garbage collection takes time and is a throughput-for-latency trade-off
- Idea: avoid allocations
  - Preallocate (e.g. arenas)
  - Allocation reuse (e.g. in gRPC)
  - "Allocation schemes" (thread per core)
- Avoid memory pressure by
  - Using GC-friendly types
  - Tuning your GC
- Idea: implement your own optimized data structures

## Optimization in Kubernetes

### Defaults

- Best effort
- No protection from consuming all node memory
- Critical services could get scheduled on the same node

### Requests and limits

- Requests: needed to be scheduled
- Limits: the container gets killed if exceeded
- Problem: enforcement is reactive; pods are only checked on a periodic interval (the interval can be set via a flag, but it has a minimum)
- Downward API: you can reference the limits in your application (to let the app trigger GC before the pod gets killed); see the sketch at the end of these notes

### Taints and tolerations

- Pin your workload to specific nodes based on labels and taints/tolerations (sketch below)

### Static CPU manager

- Request a whole number of CPUs -> you get those cores guaranteed (exclusively pinned); see the sketch below
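
A minimal sketch of such a pod, assuming the kubelet on the target node runs with `--cpu-manager-policy=static`; the pod name and image are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pinned-worker                    # placeholder name
spec:
  containers:
    - name: worker
      image: example.com/worker:latest   # placeholder image
      resources:
        requests:
          cpu: "4"                       # whole number of CPUs
          memory: 8Gi
        limits:
          cpu: "4"                       # equal to requests -> Guaranteed QoS, eligible for exclusive cores
          memory: 8Gi
```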
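
For the taints and tolerations section, a sketch of pinning a workload onto dedicated nodes; the taint key/value and label are made up for illustration:

```yaml
# Assumes the node was tainted and labelled beforehand, e.g.:
#   kubectl taint nodes node-1 dedicated=latency-critical:NoSchedule
#   kubectl label nodes node-1 dedicated=latency-critical
apiVersion: v1
kind: Pod
metadata:
  name: latency-critical-app             # placeholder name
spec:
  nodeSelector:
    dedicated: latency-critical          # only schedule onto the labelled nodes
  tolerations:
    - key: dedicated                     # allow scheduling despite the taint
      operator: Equal
      value: latency-critical
      effect: NoSchedule
  containers:
    - name: app
      image: example.com/app:latest      # placeholder image
```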
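
For the requests/limits and Downward API points, a sketch that exposes the container's memory limit to the application, here as `GOMEMLIMIT` so a Go runtime can collect garbage before the container gets OOM-killed. Names and values are placeholders, and in practice the soft limit is often set somewhat below the hard limit:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gc-aware-app                     # placeholder name
spec:
  containers:
    - name: app
      image: example.com/app:latest      # placeholder image
      resources:
        requests:
          cpu: 500m
          memory: 1Gi
        limits:
          memory: 2Gi                    # exceeding this gets the container OOM-killed
      env:
        - name: GOMEMLIMIT               # Go runtime soft memory limit (Go 1.19+)
          valueFrom:
            resourceFieldRef:
              containerName: app
              resource: limits.memory    # injected as a plain byte count
```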
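
And for the cross-provider networking section, a sketch of a Service that prefers same-zone endpoints via `trafficDistribution` (beta since Kubernetes 1.31); the Service name, selector, and port are placeholders:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: cache                            # placeholder name
spec:
  selector:
    app: cache                           # placeholder selector
  ports:
    - port: 8080
      targetPort: 8080
  trafficDistribution: PreferClose       # prefer endpoints in the client's zone
```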