---
title: "Your Cluster Isn't Flat: A First-Class API for Real-World Infrastructure Topology"
weight: 9
tags:
  - rejekts
---

<!-- {{% button href="https://youtu.be/rkteV6Mzjfs" style="warning" icon="video" %}}Watch talk on YouTube{{% /button %}} -->
<!-- {{% button href="https://docs.google.com/presentation/d/1nEK0CVC_yQgIDqwsdh-PRihB6dc9RyT-" style="tip" icon="person-chalkboard" %}}Slides{{% /button %}} -->
<!-- {{% button href="https://github.com/JesseStutler" style="info" icon="code" %}}Code/Demo{{% /button %}} -->
<!-- {{% button href="https://cloudnativeplatforms.com" style="info" icon="link" %}}Website/Homepage{{% /button %}} -->

By a Volcano maintainer from Huawei, a very wholesome guy.

I don't know why organizers tend to schedule these very technical talks by speakers with a somewhat stronger accent (totally understandable, but very quiet) near the end of the conference or day. I thank the Sakura Edition Red Bull for keeping my attention up for the last two sessions of the day.

## History of Volcano

- 2017: Kube-Batch open source
- 2019: Volcano open source
- 2020: CNCF Sandbox
- 2022: CNCF Incubation
- 2026: Road to graduation

## Volcano feature overview

- Unified Scheduler
- Queue Management
- Workload Colocation
- Multi-cluster scheduling
- Heterogeneous Device Support
- Multiple scheduling policies

## Why topology awareness?

- Scenario 1: Bottlenecks in LLM training when jobs are not placed on GPUs that are close to each other
- Scenario 2: Inference runs as separate prefill and decode jobs on different hardware -> short network hops needed
- Node labels can be used but are very limited
- Datacenter network architectures are heterogeneous -> everyone can build in their own style

## Scheduler notation mechanisms

- Label: Kueue, Koordinator, KAI Scheduler
  - Vendor-specific syntax
  - No hierarchy
  - Needs to be set manually
  - No health checks
  - Cloud-specific
- CRD (long term): Volcano
  - Standardized API (HyperNode)
  - Hierarchical (trees/zones)
  - Auto-discovery, plugin-ready (e.g. NVIDIA)
  - Health checks
  - Unified across clouds and on-prem
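
To make the hierarchy point concrete, here is a minimal Python sketch (not Volcano code) of what a tree gives you that flat labels do not: you can ask for the lowest tier at which two nodes meet, while a plain label comparison only tells you "same" or "different". The node names, group names, and tier numbers are all made up for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical topology: child -> parent. Leaves are nodes, inner entries are
# HyperNode-like groups (names and layout are invented for this example).
PARENT: Dict[str, Optional[str]] = {
    "node-0": "s0", "node-1": "s0",
    "node-2": "s1", "node-3": "s1",
    "s0": "s2", "s1": "s2",
    "s2": None,
}

# Tier of each group: tier 1 sits directly above the nodes.
TIER: Dict[str, int] = {"s0": 1, "s1": 1, "s2": 2}


def ancestors(name: str) -> List[str]:
    """Return the chain of groups above `name`, closest first."""
    chain: List[str] = []
    parent = PARENT[name]
    while parent is not None:
        chain.append(parent)
        parent = PARENT[parent]
    return chain


def lowest_common_tier(a: str, b: str) -> int:
    """Tier of the closest group containing both nodes (lower means closer)."""
    above_b = set(ancestors(b))
    for group in ancestors(a):
        if group in above_b:
            return TIER[group]
    raise ValueError(f"{a} and {b} share no common group")
```

With this layout, `lowest_common_tier("node-0", "node-1")` is 1 (same leaf group) and `lowest_common_tier("node-0", "node-3")` is 2 (they only meet at the top group). A scheduler could use such a number to prefer or require close placements; how Volcano actually scores this is not covered in these notes.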

## Architecture CRD Sample

TODO: Steal Leaf sample from slides
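
Until the slide sample lands here, this is a rough Python sketch of what a leaf (tier 1) HyperNode manifest might look like. It is not the sample from the talk. The API group/version (`topology.volcano.sh/v1alpha1`) and the field names (`tier`, `members`, `selector.exactMatch`) are assumptions based on the public Volcano network-topology docs and should be checked against the installed Volcano version. The helper `leaf_hypernode` and the node names are hypothetical.

```python
import json
from typing import Any, Dict, List


def leaf_hypernode(name: str, node_names: List[str]) -> Dict[str, Any]:
    """Build a tier-1 (leaf) HyperNode manifest that groups plain nodes."""
    return {
        # Assumed group/version and field names; verify against the Volcano docs.
        "apiVersion": "topology.volcano.sh/v1alpha1",
        "kind": "HyperNode",
        "metadata": {"name": name},
        "spec": {
            # Tier 1 is the level directly above the physical nodes.
            "tier": 1,
            "members": [
                {"type": "Node", "selector": {"exactMatch": {"name": n}}}
                for n in node_names
            ],
        },
    }


# Hypothetical leaf group "s0" containing two nodes.
manifest = leaf_hypernode("s0", ["node-0", "node-1"])
print(json.dumps(manifest, indent=2))
```

The output is plain JSON, which `kubectl apply -f -` accepts just like YAML. Higher tiers would presumably list other HyperNodes as members instead of nodes, which is what makes the structure a tree.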

## What's next

- GPU 3D architectures (internal interconnects, NUMA, external interconnects)
- DRA integration/collaboration
- Promotion of HyperNode to a first-class citizen -> extraction from Volcano to be truly generic