day 2 keynotes

This commit is contained in:
Nicolai Ort 2024-03-20 10:42:54 +01:00
parent a03f0c0da2
commit 3eb8eacd1d
---
title: Opening Keynote
weight: 1
---
The opening keynote started - as is the tradition with keynotes - with a "motivational" opening video.
The keynote itself was presented by the CEO of the CNCF.
## The numbers
* Over 2000 attendees
* 10 Years of Kubernetes
* 60% of large organizations expect rapid cost increases due to AI/ML (FinOps Survey)
## The highlights
* Everyone uses cloud native
* AI uses Kubernetes because the UX is way better than classic tools
  * Especially when moving from dev to prod
* We need standardization
* Open source is cool
## Live demo
* KIND cluster on the desktop
* Prototype stack (develop on the client)
  * Kubernetes with the LLM
  * Host with LLaVA (the image-description model), Moondream and Ollama (the model manager/registry)
* Prod stack (everything in Kubernetes)
  * Kubernetes with the LLM, LLaVA, Ollama and Moondream
* Available models: llava, mistral, bakllava (llava x mistral)
* The host takes a picture, the AI describes what is pictured (in our case: the conference audience)
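The demo's manifests were not published in these notes, but a minimal Ollama deployment for the "prod stack" might look roughly like this (the image tag, port and labels are assumptions, not the demo's actual config):

```yaml
# Hypothetical sketch: Ollama serving models inside Kubernetes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ollama
  template:
    metadata:
      labels:
        app: ollama
    spec:
      containers:
      - name: ollama
        image: ollama/ollama  # public Ollama image; pin a tag in practice
        ports:
        - containerPort: 11434  # Ollama's default API port
---
apiVersion: v1
kind: Service
metadata:
  name: ollama
spec:
  selector:
    app: ollama
  ports:
  - port: 11434
    targetPort: 11434
```

Models like llava or bakllava could then be pulled through the Ollama API and queried from the host app.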

---
title: AI Keynote discussion
weight: 2
---
A podium discussion (somewhat scripted) led by Priyanka.
## Guests
* Tim from Mistral
* Paige from Google AI
* Jeff, founder of Ollama
## Discussion
* What do you use as the base of development for Ollama?
  * Jeff: The concepts from Docker, Git and Kubernetes
* How is the balance between AI engineering and AI ops?
  * Jeff: The classic dev vs. ops divide - many ML engineers don't think about ops
  * Paige: Yessir
* How does infra keep up with the fast-moving research?
  * Paige: Well, they don't - but they do their best, and cloud native is cool
  * Jeff: Well, we're not Google, but Kubernetes is the saviour
* What are the scaling constraints?
  * Jeff: Sizing of models is still in its infancy
  * Jeff: There will be more specialized hardware, and someone will have to support it
  * Paige: Sizing also depends on latency needs (code autocompletion vs. performance optimization)
  * Paige: Optimization of smaller models
* Which technologies need to be open-source licensed?
  * Jeff: The model, because of access and trust
  * Tim: The models and the base execution environment -> vendor agnosticism
  * Paige: Yes, and remixes are really important for development
* Anything else?
  * Jeff: How do we bring our awesome tools (monitoring, logging, security) to the new AI world?
  * Paige: Currently many people just use paid APIs to abstract away the infra, but we need this stuff to be self-hostable
  * Tim: I don't want to know about the hardware - the whole infra side should be handled by the cloud native teams so ML engineers can just be ML engineers

---
title: Accelerating AI workloads with GPUs in kubernetes
weight: 3
---
Kevin and Sanjay from NVIDIA
## Enabling GPUs in Kubernetes today
* Host level components: Toolkit, drivers
* Kubernetes components: Device plugin, feature discovery, node selector
* NVIDIA humbly brings you a GPU operator
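With the device plugin (or the GPU operator) in place, a pod requests a GPU via the extended resource `nvidia.com/gpu`. A minimal smoke-test sketch (the image and tag are assumptions):

```yaml
# Hypothetical GPU smoke test: requests one GPU and prints nvidia-smi output.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test
spec:
  restartPolicy: Never
  containers:
  - name: cuda
    image: nvcr.io/nvidia/cuda:12.3.1-base-ubuntu22.04
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1  # extended resource advertised by the NVIDIA device plugin
```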
## GPU sharing
* Time slicing: switch between workloads over time
* Multi-Process Service (MPS): always run on the GPU, but share it (space-wise)
* Multi-Instance GPU (MIG): space-separated sharing in hardware
* Virtual GPU (vGPU): virtualizes time slicing or MIG
* CUDA streams: run multiple kernels in a single app
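As a sketch of what time slicing looks like in practice, the NVIDIA device plugin accepts a sharing config roughly like the following (the ConfigMap name and replica count are assumptions; the exact wiring into the GPU operator is documented upstream):

```yaml
# Hypothetical time-slicing config: each physical GPU is advertised
# as 4 schedulable nvidia.com/gpu resources.
apiVersion: v1
kind: ConfigMap
metadata:
  name: time-slicing-config
data:
  any: |-
    version: v1
    sharing:
      timeSlicing:
        resources:
        - name: nvidia.com/gpu
          replicas: 4
```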
## Dynamic resource allocation
* A new alpha feature since Kubernetes 1.26 for dynamic resource requesting
* You just request a resource via the API and have fun
* The sharing itself is an implementation detail
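The request flow can be sketched with the alpha DRA API (heavily hedged: the API group/version and field shapes have changed between releases, and the resource class name here is hypothetical, provided by a vendor driver):

```yaml
# Hypothetical DRA sketch: claim a resource, then reference the claim from a pod.
apiVersion: resource.k8s.io/v1alpha2
kind: ResourceClaim
metadata:
  name: single-gpu
spec:
  resourceClassName: gpu.example.com  # installed by a vendor DRA driver
---
apiVersion: v1
kind: Pod
metadata:
  name: dra-demo
spec:
  resourceClaims:
  - name: gpu
    source:
      resourceClaimName: single-gpu
  containers:
  - name: app
    image: ubuntu:22.04
    command: ["sleep", "infinity"]
    resources:
      claims:
      - name: gpu  # references the claim instead of a counted resource
```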
## GPU scale out challenges
* NVIDIA Picasso is a foundry for model creation powered by Kubernetes
* The workload is the training workload split into batches
* Challenge: Schedule multiple training jobs by different users that are prioritized
### Topology-aware placement
* You need thousands of GPUs; a typical node has 8 GPUs with fast NVLink communication - beyond that, switching
* Target: optimize related jobs based on GPU node distance and NUMA placement
### Fault tolerance and resiliency
* Stuff can break, resulting in slowdowns or errors
* Challenge: Detect faults and handle them
* Observability, both in-band and out-of-band, that exposes node conditions in Kubernetes
* Needed: Automated fault-tolerant scheduling
### Multi-dimensional optimization
* There are different KPIs: starvation, priority, occupancy, fairness
* Challenge: what to choose (a multi-dimensional decision problem)
* Needed: A scheduler that can balance the dimensions

---
title: "Sponsored: Build an open source platform for AI/ML"
weight: 4
---
Jorge Palma from Microsoft with a quick introduction.
## Baseline
* Kubernetes is cool and all
* Challenges:
* Containerized models
* GPUs in the cluster (install, management)
## Kubernetes AI Toolchain (KAITO)
* Kubernetes operator that interacts with
  * the node provisioner
  * the deployment
* A simple CRD that describes the model and infra - then have fun
* Creates an inference endpoint
* Currently about 10 models are available (Hugging Face, LLaMA, etc.)
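A KAITO workspace roughly follows this shape (a sketch from memory, not the authoritative CRD; the field names, preset and Azure instance type are assumptions to check against the KAITO docs):

```yaml
# Hypothetical KAITO workspace: provisions GPU nodes and serves a preset model.
apiVersion: kaito.sh/v1alpha1
kind: Workspace
metadata:
  name: workspace-falcon-7b
resource:
  instanceType: Standard_NC12s_v3  # GPU VM size to provision (Azure-specific)
  labelSelector:
    matchLabels:
      apps: falcon-7b
inference:
  preset:
    name: falcon-7b  # one of the ~10 preset models
```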

---
title: Optimizing performance and sustainability for ai
weight: 5
---
A panel discussion moderated by Google, with participants from Google, Alluxio, Ampere and CERN.
It was pretty scripted with prepared (sponsor specific) slides for each question answered.
## Takeaways
* Deploying an ML model should become the new "deploy a web app"
* The hardware should be fully utilized -> better resource sharing and scheduling
* Smaller LLMs on CPU only are pretty cost-efficient
* Better scheduling by splitting into storage + CPU (prepare) and GPU (run) nodes to create a just-in-time flow
* Software acceleration is cool, but we should use more specialized hardware, and models optimized to run on CPUs
* We should be flexible regarding hardware, multi-cluster workloads and hybrid (on-prem, burst to cloud) workloads

---
title: Cloudnative news show (AI edition)
weight: 6
---
Nikhita presented projects that merge cloud native and AI.
Patrick Ohly joined for DRA.
### The "news"
* A new Working Group AI
* More tools are including AI features
* An updated "CNCF for Children", now featuring AI
* One decade of Kubernetes
* DRA is in alpha
### DRA
* A new API for resources (node-local and node-attached)
* Sharing of resources between pods and containers
* Vendor-specific details are abstracted away by a vendor driver controller
* The kube-scheduler can interact with the vendor parameters for scheduling and autoscaling
### Cloudnative AI ecosystem
* Kubernetes is the seed for the AI infra plant
* Kubeflow users wanted AI registries
* LLMs on the edge
* OpenTelemetry brings semantics
* All of these tools form a symbiosis and remain topics of discussion
### The working group AI
* It was formed in October 2023
* They worked on the whitepaper (cloud native and AI), which was published on 19.03.2024
* The "cloud native and AI" landscape is WIP and will be merged into the main CNCF landscape
* The future focus will be on security and cost efficiency (with a hint of sustainability)
### LFAI and CNCF
* The director of the AI foundation talked about AI and cloud native
* They are looking forward to more collaboration

---
archetype: chapter
title: Day 2
---
Day two is also the official day one of KubeCon (day one was just CloudNativeCon).
This is where all of the people joined (over 2000 attendees).
The opening keynotes were a mix of talks and panel discussions.
The main topic was - who could have guessed - AI and ML.