---
title: "From us to ms: Pushing Kubernetes Workloads to the Limit"
weight: 7
tags:
- rejekts
- performance
---

{{% button href="https://www.youtube.com/watch?v=EYipC5y-8rM" style="warning" icon="video" %}}Watch talk on YouTube{{% /button %}}

<!-- {{% button href="https://docs.google.com/presentation/d/1nEK0CVC_yQgIDqwsdh-PRihB6dc9RyT-" style="tip" icon="person-chalkboard" %}}Slides{{% /button %}} -->

There were more details in the talk than I copied into these notes; most of them were too much to write down or too application-specific.

## Why?

- We need it (product requirements)
- Cost efficiency

## Cross-Provider Networking

- Throughput:
  - Same zone: 200 GB/s
  - Cross zone: 5-10% penalty
- Latency:
  - Same zone P99: 0.95 ms
  - Cross zone P99: 1.95 ms
- Result: encourage services to always route within the same zone if possible
- How:
  - Topology Aware Routing (older, a bit buggy)
  - `trafficDistribution: PreferClose`: routes to the same zone if possible (needs CNI support); see the Service sketch after this list
  - Set up the stack once in each zone
- Measurements: Kubezonnet can detect cross-zone traffic

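As a hedged sketch of the `trafficDistribution: PreferClose` option (the Service name, selector, and port below are made up, not from the talk), the setting lives directly on the Service spec:

```yaml
# Illustrative only: assumes Kubernetes v1.31+ where spec.trafficDistribution is available.
apiVersion: v1
kind: Service
metadata:
  name: querier            # hypothetical service name
spec:
  selector:
    app: querier
  ports:
    - port: 9090
      targetPort: 9090
  trafficDistribution: PreferClose   # prefer endpoints in the client's own zone when healthy ones exist
```
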
## Disk latency

- Baseline: 660 MiB/s per SSD, i.e. roughly one SSD per 5 Gbit/s of networking
- Example: 100 Gbit/s needs a RAID 0 across a bunch of SSDs (rough math below)

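Back-of-the-envelope version of that example, using the baseline figure above:

$$
100\ \text{Gbit/s} = 12.5\ \text{GB/s} \approx 11{,}920\ \text{MiB/s},
\qquad
\frac{11{,}920\ \text{MiB/s}}{660\ \text{MiB/s per SSD}} \approx 18\ \text{SSDs in RAID 0}
$$
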
```mermaid
graph LR
    Querier -->|125ms| Cache
    Cache -->|200ms| S3
    Cache <--> SSD
```

## Memory management

- Garbage collection takes time and is a throughput-versus-latency trade-off
- Idea: avoid allocations
  - Preallocate (e.g. arenas)
  - Allocation reuse (e.g. in gRPC)
  - "Allocation schemes" (thread per core)
- Avoid memory pressure by
  - using GC-friendly types
  - tuning your GC (see the sketch after this list)
- Idea: implement your own optimized data structure

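The talk did not show concrete settings here; as a hedged sketch, for a Go-based workload the GC can be tuned from the pod spec via the Go runtime's `GOGC` and `GOMEMLIMIT` environment variables (the names and values below are illustrative, not from the talk):

```yaml
# Sketch under the assumption of a Go service; names and values are illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: querier              # hypothetical workload
spec:
  containers:
    - name: querier
      image: example.org/querier:latest   # placeholder image
      env:
        - name: GOGC
          value: "200"       # run GC less often: more memory used, fewer pauses per unit of work
        - name: GOMEMLIMIT
          value: "3500MiB"   # soft limit for the Go runtime, kept below the container memory limit
```
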
## Optimization in Kubernetes

### Defaults

- Best effort
- No protection from consuming all node memory
- Critical services could get scheduled on the same node

### Requests and limits

- Requests: needed to be scheduled
- Limits: the pod gets killed if it exceeds them
- Problem: reactive; pod usage is only checked on a periodic interval (configurable via an API flag, but with a minimum)
- Downward API: you can reference the limits in your application, e.g. to let the app trigger GC before the pod gets killed (see the sketch after this list)

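A minimal sketch of both points: requests/limits on the container, plus the Downward API exposing the memory limit to the process (all names and values are made up):

```yaml
# Sketch only: resource values, names, and image are illustrative, not from the talk.
apiVersion: v1
kind: Pod
metadata:
  name: querier
spec:
  containers:
    - name: querier
      image: example.org/querier:latest   # placeholder image
      resources:
        requests:
          cpu: "2"
          memory: 4Gi
        limits:
          memory: 4Gi
      env:
        # Downward API: hand the container's memory limit to the application,
        # e.g. so it can trigger GC or shed load before the limit is hit.
        - name: MEMORY_LIMIT_BYTES
          valueFrom:
            resourceFieldRef:
              containerName: querier
              resource: limits.memory
```
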
### Taints and tolerations

- Pin your workload based on labels and annotations (see the sketch after this list)

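As a hedged sketch (the taint key, node label, and names are invented for illustration), a workload can be pinned to dedicated nodes by combining a node selector with a toleration for the matching taint:

```yaml
# Sketch only: label/taint names are invented; adapt to your own node setup.
apiVersion: v1
kind: Pod
metadata:
  name: querier
spec:
  nodeSelector:
    workload: latency-critical      # only schedule on nodes carrying this label
  tolerations:
    - key: workload                 # tolerate the taint that keeps other pods off these nodes
      operator: Equal
      value: latency-critical
      effect: NoSchedule
  containers:
    - name: querier
      image: example.org/querier:latest   # placeholder image
```

The node side would then carry the matching label plus a taint such as `workload=latency-critical:NoSchedule`.
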
### Static CPU manager

- Request a whole number of CPUs -> you get these cores guaranteed (see the sketch below)
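
A hedged sketch of the pod side (names and sizes are made up; the kubelet must additionally run with the static CPU manager policy): exclusive cores require Guaranteed QoS, i.e. requests equal to limits, with an integer CPU count:

```yaml
# Sketch only: assumes the kubelet is configured with cpuManagerPolicy: static.
apiVersion: v1
kind: Pod
metadata:
  name: querier
spec:
  containers:
    - name: querier
      image: example.org/querier:latest   # placeholder image
      resources:
        # Guaranteed QoS (requests == limits) with a whole number of CPUs
        # lets the static CPU manager pin dedicated cores to this container.
        requests:
          cpu: "4"
          memory: 8Gi
        limits:
          cpu: "4"
          memory: 8Gi
```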