docs(day2): Added edge gpu talk notes

2025-07-22 11:19:22 +02:00
parent 11e3866f01
commit 84abb7e1b9
2 changed files with 63 additions and 2 deletions
--- a/content/day2/03_k3s-gpu.md
+++ b/content/day2/03_k3s-gpu.md
@@ -0,0 +1,56 @@
+---
+title: "Brains on the edge - running AI workloads with k3s and GPU nodes"
+weight: 3
+tags:
+ - ai
+ - gpu
+---
+
+<!-- {{% button href="https://youtu.be/rkteV6Mzjfs" style="warning" icon="video" %}}Watch talk on YouTube{{% /button %}} -->
+<!-- {{% button href="https://docs.google.com/presentation/d/1nEK0CVC_yQgIDqwsdh-PRihB6dc9RyT-" style="tip" icon="person-chalkboard" %}}Slides{{% /button %}} -->
+
+I decided not to note down the usual "typical challenges on the edge" slides (about 10 minutes of the talk).
+
+## Baseline
+
+- Edge can be split up: Near Edge, Far Edge, Device Edge
+- They use k3s for all edge clusters
+
+## Prerequisites
+
+- Software: GPU Driver, Container Toolkit, Device Plugin
+- Hardware: NVIDIA GPU (on a Linux distro supported by the driver)
+- Runtime: Not all runtimes support GPUs (containerd and CRI-O do)
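+
+To make the prerequisites concrete, here is a minimal sketch (not shown in the talk) of how a workload typically asks for a GPU once driver, container toolkit and device plugin are in place. Assumptions: containerd exposes a runtime handler called `nvidia`, the device plugin advertises the `nvidia.com/gpu` resource, and the pod name and image tag are placeholders. The GPU Operator or k3s itself may already create the `RuntimeClass`, in which case only the pod is needed.
+
+```yaml
+# RuntimeClass that routes pods to the NVIDIA container runtime.
+# Assumption: the containerd handler is named "nvidia".
+apiVersion: node.k8s.io/v1
+kind: RuntimeClass
+metadata:
+  name: nvidia
+handler: nvidia
+---
+# Smoke-test pod: prints the GPUs visible inside the container, then exits.
+apiVersion: v1
+kind: Pod
+metadata:
+  name: gpu-smoke-test # hypothetical name
+spec:
+  restartPolicy: Never
+  runtimeClassName: nvidia
+  containers:
+    - name: cuda
+      # Placeholder image: any CUDA base image should do.
+      image: nvidia/cuda:12.4.1-base-ubuntu22.04
+      command: ["nvidia-smi"]
+      resources:
+        limits:
+          # Extended resource advertised by the NVIDIA device plugin.
+          nvidia.com/gpu: 1
+```
+
+If the pod stays `Pending`, the node is most likely not advertising `nvidia.com/gpu`, which usually points at the device plugin or the driver.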
+
+## Architecture
+
+```mermaid
+graph LR
+subgraph Edge
+    MQTT
+    Kafka
+    Analytics
+
+    MQTT-->|Publish collected sensor data|Kafka
+    Kafka-->|Provide data to run|Analytics
+end
+subgraph Azure
+    Storage
+    Monitoring
+    MLFlow
+
+    Storage-->|Provide long term analytics|MLFlow
+end
+
+Analytics<-->|Sync models|MLFlow
+Kafka-->|Save to long term|Storage
+Monitoring-.->|Observe|Storage
+Monitoring-.->|Observe|MLFlow
+```
+
+## Q&A
+
+- Did you use the NVIDIA GPU Operator: Yes
+- Which runtime did you use: containerd via k3s
+- Why k3s over k0s: Because we used it
+- Were you power limited: Nope, the edge was on a large ship