---
title: "The auto-scaling part: VPA, HPA, KEDA, Nodes, How do they dance"
weight: 10
tags:
 - rejekts
---

{{% button href="https://www.youtube.com/watch?v=1US_-3udMDo" style="warning" icon="video" %}}Watch talk on YouTube{{% /button %}}
<!-- {{% button href="https://docs.google.com/presentation/d/1nEK0CVC_yQgIDqwsdh-PRihB6dc9RyT-" style="tip" icon="person-chalkboard" %}}Slides{{% /button %}} -->

## Hypothesis

- In 2024, 27% of cloud spend was wasted
- 100ms delay => decrease in sales

## Pod resources

- Requests: Inform the scheduler's decision
    - Too low: Pods get scheduled on strained nodes
    - Too high: Wasted resources
- Limits: Throttle (CPU) or kill (memory) when reached
- QoS: Determines the eviction order under resource pressure
    - Guaranteed (requests = limits)
    - Burstable (limits > requests)
    - Best effort (nothing defined)
- Gotcha: CPU throttling can kick in before the scaling triggers fire if requests and limits are very close
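
The QoS rules above can be sketched in a few lines of Python. This is a simplified illustration for a pod with a single container, not the kubelet's actual implementation. The function name `qos_class` and the dict shape are made up for this example, and quantities are compared as plain strings (so `100m` and `0.1` would not count as equal here).

```python
from typing import Optional

Resources = dict[str, str]  # e.g. {"cpu": "100m", "memory": "256Mi"}


def qos_class(requests: Optional[Resources], limits: Optional[Resources]) -> str:
    """Classify a single-container pod (simplified sketch)."""
    requests = dict(requests or {})
    limits = dict(limits or {})

    # Nothing defined at all: first in line for eviction.
    if not requests and not limits:
        return "BestEffort"

    # Kubernetes defaults a missing request to the limit of the same resource.
    for name, value in limits.items():
        requests.setdefault(name, value)

    # Guaranteed needs CPU and memory set, with requests equal to limits.
    guaranteed = all(
        name in limits and requests.get(name) == limits[name]
        for name in ("cpu", "memory")
    )
    return "Guaranteed" if guaranteed else "Burstable"


same = {"cpu": "100m", "memory": "256Mi"}
print(qos_class(same, same))   # Guaranteed
print(qos_class(same, None))   # Burstable
print(qos_class(None, None))   # BestEffort
```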

TODO: Steal table from Slides

| QoS | Guaranteed | Burstable | Best effort |
| --- | --- | --- | --- |
| Requests | 100m, 256Mi | 100m, 256Mi | None |
| Limits | 100m, 256Mi | None or higher than requests | None |

## Scalers

- VPA: Moar power aka recommend requests
- HPA: Moar moar aka more replicas
- KEDA: Proxy over HPA

### VPA

- Modes:
    - Off: Dry run (recommendations only, nothing is applied)
    - Initial: Applies recommendations to new pods only (useful for finding suitable values)
    - Auto/Recreate: Evicts and restarts pods to update resources
- Trigger: Usually memory
- Tip: Set `maxAllowed` so that recommendations cannot exhaust the available resources


### HPA

- Trigger: Usually CPU (percent of requests)
- Formula: $desired = \lceil current \cdot \frac{usage}{target} \rceil$
- Fun fact: Cannot scale to 0 by default
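
As a rough worked example of the HPA scaling rule (desired replicas = current replicas times usage over target, rounded up), here is a minimal Python sketch. The function name and the default bounds are assumptions for illustration. The real controller also applies a tolerance band and stabilization windows, which are left out here.

```python
import math


def desired_replicas(
    current_replicas: int,
    usage_percent: int,     # average utilization, in percent of requests
    target_percent: int,    # target utilization from the HPA spec
    min_replicas: int = 1,  # illustrative default
    max_replicas: int = 10, # illustrative default
) -> int:
    """Return the replica count the scaling rule asks for, clamped to the bounds."""
    raw = math.ceil(current_replicas * usage_percent / target_percent)
    return max(min_replicas, min(max_replicas, raw))


print(desired_replicas(4, 90, 60))  # 6: scale up, pods are above target
print(desired_replicas(4, 30, 60))  # 2: scale down, pods are below target
print(desired_replicas(1, 0, 60))   # 1: clamped, never below min_replicas
```

The last call shows why an idle workload still keeps one pod around: the result is clamped to `min_replicas`.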

### KEDA

- Basically automates HPA with flexible metrics (from different sources)
- Can scale Jobs
- Can scale to 0

## Anti patterns

TODO: Steal from slides

| Pattern | Bad | Better |
| --- | --- | --- |
| CPU limit = requests | Throttling before scaling | Set requests only |
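
One way to reason about the CPU limit = requests pattern: a container cannot use more CPU than its limit, so the utilization a CPU-based HPA sees is capped at limit divided by requests. The sketch below is an illustration with a made-up helper name and made-up numbers, not something taken from the talk. It computes the largest usage/target ratio the HPA can ever observe for one pod.

```python
import math
from typing import Optional


def max_scale_factor(
    request_millicores: int,
    limit_millicores: Optional[int],
    target_percent: int,
) -> float:
    """Largest usage/target ratio observable before the limit throttles the pod."""
    if limit_millicores is None:
        # No CPU limit: usage is only bounded by what the node has to offer.
        return math.inf
    max_utilization_percent = 100 * limit_millicores / request_millicores
    return max_utilization_percent / target_percent


print(max_scale_factor(100, 100, 80))   # 1.25: limit = requests, little headroom
print(max_scale_factor(100, 400, 80))   # 5.0: limit well above requests
print(max_scale_factor(100, None, 80))  # inf: requests only
```

With limit = requests and an 80% target, the pod is already throttled while the scaler sees at most 1.25 times the target, so scale-up steps stay small while latency suffers.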


## Demo

Auto scaling meme generator (see slides/video)