Compare commits: 8b1edb32c3...8685fc71b9 (9 commits)

| Author | SHA1 | Date | |
|---|---|---|---|
| | 8685fc71b9 | | |
| | 25aa419cc5 | | |
| | 9f9371bd71 | | |
| | 974f9f941d | | |
| | cdf7163a27 | | |
| | 8ce0ccda0d | | |
| | 078e397fa7 | | |
| | 06be05a410 | | |
| | 6e9c7c728b | | |
@@ -7,5 +7,7 @@ tags:
|
||||
|
||||
<!-- {{% button href="https://youtu.be/rkteV6Mzjfs" style="warning" icon="video" %}}Watch talk on YouTube{{% /button %}} -->
|
||||
<!-- {{% button href="https://docs.google.com/presentation/d/1nEK0CVC_yQgIDqwsdh-PRihB6dc9RyT-" style="tip" icon="person-chalkboard" %}}Slides{{% /button %}} -->
|
||||
<!-- {{% button href="https://github.com/graz-dev/automatic-reosurce-optimization-loop" style="info" icon="code" %}}Code/Demo{{% /button %}} -->
|
||||
|
||||
|
||||
TODO:
|
||||
@@ -9,6 +9,8 @@ This "blog" certainly contains a bunch of tyops.
|
||||
This is what typing the notes blindly in real time gets you.
|
||||
Every year I tell myself that I will fix them afterwards. To be fair, I fix most of them, but not all, and that's fine.
|
||||
|
||||
Also the notes tend to start out strong early in the week (aka Rejekts + CloudNativeCon) and fall off in terms of density and depth.
|
||||
|
||||
## How did I get there?
|
||||
|
||||
I attended Cloud Native Rejekts and KubeCon + CloudNativeCon Europe 2026 in Amsterdam.
|
||||
|
||||
@@ -4,6 +4,11 @@ title: Day -1
|
||||
weight: 3
|
||||
---
|
||||
|
||||
This year there was only one day of Cloud Native Rejekts. So this was a down day. Well, if you define finishing two talks downtime. But certainly no conference today.
|
||||
This year there was only one day of Cloud Native Rejekts. So this was a down day.
|
||||
Well, if you define finishing two talks as downtime. But certainly no conference today.
|
||||
|
||||
Last year Rejekts happened on Sunday and Monday, with the co-located events on Tuesday and KubeCon from Wednesday to Friday.
|
||||
It was very cool having two full days of Rejekts last year but the day of preparation is certainly appreciated.
|
||||
|
||||
Also, this is the day that most of my friends (that are attending KubeCon) arrived.
|
||||
No one from back home attends Rejekts, but as mentioned in yesterday's notes, I met some awesome people I get to see every year at these events alongside some new - but nevertheless cool - humans.
|
||||
|
||||
@@ -28,7 +28,7 @@ A talk by EDERA - one of the sponsors of Cloud Native Rejekts.
|
||||
## Kubernetes joins the game
|
||||
|
||||
- Background: Kubernetes is built for containers and not for deep isolation
|
||||
- Existing solutions: KubeVirt (manage KVM through KubeAPI)m kada Containers (Deeper Sandbox), GVisor (emulated syscalls)
|
||||
- Existing solutions: KubeVirt (manage KVM through KubeAPI), Kata Containers (Deeper Sandbox), gVisor (emulated syscalls)
|
||||
- EDERA's idea: Their own CRI (container runtime interface) that makes VM management transparent and can run VMs alongside containers
|
||||
- Potential Problems:
  - Kubernetes assumes that cgroups exist
|
||||
|
||||
@@ -8,7 +8,7 @@ tags:
|
||||
|
||||
<!-- {{% button href="https://youtu.be/rkteV6Mzjfs" style="warning" icon="video" %}}Watch talk on YouTube{{% /button %}} -->
|
||||
<!-- {{% button href="https://docs.google.com/presentation/d/1nEK0CVC_yQgIDqwsdh-PRihB6dc9RyT-" style="tip" icon="person-chalkboard" %}}Slides{{% /button %}} -->
|
||||
TODO: Copy repo link for samples
|
||||
{{% button href="https://github.com/graz-dev/automatic-reosurce-optimization-loop" style="info" icon="code" %}}Code/Demo{{% /button %}}
|
||||
|
||||
The statistics of these talks are based on a survey including multiple companies, focused on ones that build and run applications
|
||||
|
||||
|
||||
39
content/day-2/06-kubeproxy.md
Normal file
@@ -0,0 +1,39 @@
|
||||
---
|
||||
title: "Unleashing the tides of kubernetes networking by removing kube-proxy"
|
||||
weight: 6
|
||||
tags:
|
||||
- rejekts
|
||||
- isovalent
|
||||
- cilium
|
||||
---
|
||||
|
||||
<!-- {{% button href="https://youtu.be/rkteV6Mzjfs" style="warning" icon="video" %}}Watch talk on YouTube{{% /button %}} -->
|
||||
<!-- {{% button href="https://docs.google.com/presentation/d/1nEK0CVC_yQgIDqwsdh-PRihB6dc9RyT-" style="tip" icon="person-chalkboard" %}}Slides{{% /button %}} -->
|
||||
{{% button href="https://github.com/ttarczynski/cilium-kpr-demo" style="info" icon="code" %}}Code/Demo{{% /button %}}
|
||||
|
||||
A talk by Isovalent (now part of Cisco - god, I love that they have to say this every time).
It's a good baseline introduction to how Kubernetes service routing works, but also a bit dry (in terms of the presentation itself).
I skipped the introduction to Cilium in these notes. The docs exist for a reason.
|
||||
|
||||
## Kubernetes Services - a baseline
|
||||
|
||||
- East-West: ClusterIP -> App2App inside the cluster
|
||||
- North-South: NodePort -> External Client to app in Cluster
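
Not from the talk, just a minimal sketch to make the two service types concrete. The names (`app2`, `app2-external`), labels and port numbers are made up for illustration:

```yaml
# East-West: only reachable from inside the cluster via the ClusterIP
apiVersion: v1
kind: Service
metadata:
  name: app2                # hypothetical name
spec:
  type: ClusterIP
  selector:
    app: app2               # hypothetical pod label
  ports:
    - port: 80              # port exposed on the ClusterIP
      targetPort: 8080      # port the pods listen on
---
# North-South: additionally reachable on every node's IP at the nodePort
apiVersion: v1
kind: Service
metadata:
  name: app2-external       # hypothetical name
spec:
  type: NodePort
  selector:
    app: app2
  ports:
    - port: 80
      targetPort: 8080
      nodePort: 30080       # has to be inside the NodePort range (default 30000-32767)
```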
|
||||
|
||||
## Kube-Proxy - IPTables Mode
|
||||
|
||||
- IPTables: Traffic flows through different tables/chains - most importantly the NAT table
- Every node has its own kube-proxy next to the kubelet
- ClusterIP: Scales to a huge number of rules when exposing multiple services
- NodePort: Masquerades sources if routing cross-node (Source-IP is lost)
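
A side note that was not part of the talk: plain Kubernetes has a knob for the lost source IP. A minimal sketch, reusing the same made-up service name and ports as an assumption:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: app2-external            # hypothetical name
spec:
  type: NodePort
  # Local = only route to pods on the node that received the packet.
  # No cross-node hop means no masquerading, so the client IP survives.
  # Trade-off: nodes without a matching pod drop the traffic.
  externalTrafficPolicy: Local
  selector:
    app: app2                    # hypothetical pod label
  ports:
    - port: 80
      targetPort: 8080
      nodePort: 30080
```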
|
||||
|
||||
TODO: Steal iptables visualizer
TODO: Steal lifecycle of a packet (ClusterIP)
TODO: Steal lifecycle of a packet (NodePort)
|
||||
|
||||
## Kube-Proxy free
|
||||
|
||||
- Cilium deploys one agent pod per node that manages the eBPF programs in the kernel
- ClusterIP: Load balancing happens on the socket level
- NodePort: Also does SNAT
- Adds Hubble for observability
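
For reference, a hedged sketch of the Helm values I would expect for a kube-proxy-free install. This is not taken from the talk or the linked demo; the API server address and port are placeholders, and `kubeProxyReplacement: true` assumes a recent Cilium release (older releases used string modes such as `strict`):

```yaml
# values.yaml for the Cilium Helm chart (sketch, not the demo's actual config)
kubeProxyReplacement: true   # Cilium's eBPF datapath takes over service handling
# Without kube-proxy the agent can't reach the API server through the
# kubernetes Service, so it needs a direct address:
k8sServiceHost: 10.0.0.10    # placeholder: your API server address
k8sServicePort: 6443         # placeholder: your API server port
hubble:
  enabled: true              # flow observability
  relay:
    enabled: true
  ui:
    enabled: true
```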
|
||||
50
content/day-2/07-chaosengineering.md
Normal file
@@ -0,0 +1,50 @@
|
||||
---
|
||||
title: "How Chaos-Engineering works: Implementing Failure Injection on Kubernetes with Rust"
|
||||
weight: 7
|
||||
tags:
|
||||
- rejekts
|
||||
- chaos
|
||||
- rust
|
||||
- operator
|
||||
---
|
||||
|
||||
<!-- {{% button href="https://youtu.be/rkteV6Mzjfs" style="warning" icon="video" %}}Watch talk on YouTube{{% /button %}} -->
|
||||
<!-- {{% button href="https://docs.google.com/presentation/d/1nEK0CVC_yQgIDqwsdh-PRihB6dc9RyT-" style="tip" icon="person-chalkboard" %}}Slides{{% /button %}} -->
|
||||
{{% button href="https://github.com/ioboi/kerris" style="info" icon="code" %}}Code/Demo{{% /button %}}
|
||||
|
||||
A general introduction to chaos engineering that specifically shows the implementations by Chaos Mesh and Litmus.
After that, the talk continues with the implementation of a custom chaos operator written in Rust.
|
||||
|
||||
## Chaos Engineering
|
||||
|
||||
> Chaos Engineering is the discipline of experimenting on a system in order to build confidence in the system's capability to withstand turbulent conditions in production
|
||||
> ~ Chaos Community (2015)
|
||||
|
||||
- CNCF-Projects: **Chaos Mesh**, **Litmus**, Chaos Blade, Krkn
|
||||
- Types: **Pod Chaos** (Delete/Terminate), **Pod Network Chaos** (Faults/Packet loss), Node Chaos, JVM Chaos, Infra Chaos (Reboot VMs), ...
|
||||
|
||||
## APIs
|
||||
|
||||
- Chaos Mesh: A Specific CRD per Chaos Type (PodChaos, NetworkChaos, ...)
|
||||
- Litmus: ChaosEngine config that defines the type in its spec
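
These are not the samples from the slides, just a minimal sketch of the two API styles based on the public docs of both projects. All names, namespaces, labels and the service account are made up:

```yaml
# Chaos Mesh: the chaos type is the kind itself
apiVersion: chaos-mesh.org/v1alpha1
kind: PodChaos
metadata:
  name: pod-kill-demo            # hypothetical name
  namespace: default
spec:
  action: pod-kill
  mode: one                      # pick one matching pod
  selector:
    namespaces:
      - default
    labelSelectors:
      app: demo                  # hypothetical label
---
# Litmus: one generic ChaosEngine, the chaos type is an experiment in the spec
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: demo-chaos               # hypothetical name
  namespace: default
spec:
  engineState: active
  appinfo:
    appns: default
    applabel: app=demo           # hypothetical label
    appkind: deployment
  chaosServiceAccount: pod-delete-sa   # hypothetical service account
  experiments:
    - name: pod-delete
```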
|
||||
|
||||
TODO: Steal sample CRDs
|
||||
|
||||
## DIY
|
||||
|
||||
- Baseline: Written as a controller in Rust (out of curiosity) with support for Pod Chaos and Network Chaos
|
||||
- Entrypoint is a reconcile function that returns an action (requeue, etc) and an error
|
||||
- Network Chaos uses traffic control (`tc`, part of `iproute2`) to do the limiting and loss
- If you're interested in the Rust implementation: Look at the code linked above
|
||||
|
||||
```mermaid
|
||||
graph LR
|
||||
controller
|
||||
subgraph node
|
||||
daemon
|
||||
containerd
|
||||
|
||||
daemon-->|grpc|containerd
|
||||
end
|
||||
controller-->daemon
|
||||
```
|
||||
33
content/day-2/08-multitenancy.md
Normal file
@@ -0,0 +1,33 @@
|
||||
---
|
||||
title: "Push the boundaries of kubernetes multi-tenancy with containerruntimeclasses"
|
||||
weight: 8
|
||||
tags:
|
||||
- rejekts
|
||||
- runtime
|
||||
---
|
||||
|
||||
<!-- {{% button href="https://youtu.be/rkteV6Mzjfs" style="warning" icon="video" %}}Watch talk on YouTube{{% /button %}} -->
|
||||
<!-- {{% button href="https://docs.google.com/presentation/d/1nEK0CVC_yQgIDqwsdh-PRihB6dc9RyT-" style="tip" icon="person-chalkboard" %}}Slides{{% /button %}} -->
|
||||
<!-- {{% button href="https://github.com/graz-dev/automatic-reosurce-optimization-loop" style="info" icon="code" %}}Code/Demo{{% /button %}} -->
|
||||
|
||||
I missed the first 3 minutes of this talk because they started early, so the notes are currently missing the first levels of multi-tenancy.
This was a really interesting introduction to the world of runtime classes and how you can use them to choose the right level of isolation for each of your pods/deployments by utilizing different runtimes/shims: running everything from normal containers to hardened/emulated processes and VMs side by side in Kubernetes.
|
||||
|
||||
## Levels of multi-tenancy
|
||||
|
||||
- God-Level: A physical cluster separated into multiple virtual clusters, which can be isolated into even more nested virtual clusters (for )
|
||||
- Problem: We're using the same container runtime
|
||||
|
||||
## Runtimes
|
||||
|
||||
- There are different runtimes since TODO -> They replaced dockershim as the runtime in 1.24
|
||||
- Choice can range from CRI-O (performant) to Kata Containers (secure)
- In the past there was no plugin architecture (the node had to be reinstalled and restarted to switch the CRI); now you just have to update the container config through a new RuntimeClass
- Can be targeted for each Pod/Deployment spec
- You can still use containerd as the default class with shims (Shim v2 project) for specialized runtimes like Kata or Windows
- Expansion: KubeVirt - VMs as a runtime class (also implemented by others like Kata with QEMU isolation)
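
To make the per-pod targeting concrete, a minimal sketch (not from the slides). The handler name `kata` is an assumption and has to match whatever handler is configured in the node's container runtime; the pod name and image are made up:

```yaml
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kata
handler: kata                  # assumption: handler name configured in containerd/CRI-O on the node
---
apiVersion: v1
kind: Pod
metadata:
  name: isolated-workload      # hypothetical name
spec:
  runtimeClassName: kata       # pods without this field use the default runtime
  containers:
    - name: app
      image: nginx
```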
|
||||
|
||||
## Pro/Con
|
||||
|
||||
- Pro: Security, Cost optimization, Performance optimization, diversity/flexibility
|
||||
- Con: Day 2 complexity, complex debugging (did anyone say networking?), additional costs of using VMs
|
||||
63
content/day-2/09_clusternotflat.md
Normal file
@@ -0,0 +1,63 @@
|
||||
---
|
||||
title: "Your Cluster Isn't Flat: A First-Class API for Real-World Infrastructure Topology"
|
||||
weight: 9
|
||||
tags:
|
||||
- rejekts
|
||||
---
|
||||
|
||||
<!-- {{% button href="https://youtu.be/rkteV6Mzjfs" style="warning" icon="video" %}}Watch talk on YouTube{{% /button %}} -->
|
||||
<!-- {{% button href="https://docs.google.com/presentation/d/1nEK0CVC_yQgIDqwsdh-PRihB6dc9RyT-" style="tip" icon="person-chalkboard" %}}Slides{{% /button %}} -->
|
||||
<!-- {{% button href="https://github.com/JesseStutler" style="info" icon="code" %}}Code/Demo{{% /button %}} -->
|
||||
|
||||
By a Volcano maintainer from Huawei - a very wholesome guy.
I don't know why the organizers always tend to schedule these very technical topics, presented by people with a somewhat harder accent (totally understandable but very quiet), near the end of the conference or day. I thank the Sakura Edition Red Bull for keeping my attention span up and running for the last two sessions of the day.
|
||||
|
||||
## History of Volcano
|
||||
|
||||
- 2017: Kube-Batch open source
|
||||
- 2019: Volcano Open Source
|
||||
- 2020: CNCF Sandbox
|
||||
- 2022: CNCF Incubation
|
||||
- 2026: Road to graduation
|
||||
|
||||
## Volcano feature overview
|
||||
|
||||
- Unified Scheduler
|
||||
- Queue Management
|
||||
- Workload Colocation
|
||||
- Multi cluster scheduling
|
||||
- Heterogeneous Device Support
|
||||
- Multiple Scheduling policies
|
||||
|
||||
## Why topology awareness?
|
||||
|
||||
- Scenario 1: Bottlenecks in LLM training when jobs are not placed on GPUs that are close to each other
- Scenario 2: Inference runs as separate Prefill and Decode jobs on different hardware -> Short network hops needed
- Node labels can be used but are very limited
- Datacenter network architectures are heterogeneous -> Everyone can build in their own style
|
||||
|
||||
## Scheduler notation mechanisms
|
||||
|
||||
- Label: Kueue, Koordinator, KAI Scheduler
  - Vendor-specific syntax
  - No hierarchy
  - Needs to be set manually
  - No healthchecks
  - Cloud-specific
- CRD (long term): Volcano
  - Standardized API (HyperNode)
  - Hierarchical (Trees/Zones)
  - Auto-discovery - plugin-ready (e.g. NVIDIA)
  - Healthchecks
  - Unified across clouds and on-prem
|
||||
|
||||
## Architecture CRD Sample
|
||||
|
||||
TODO: Steal Leaf sample from slides
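
Until the slide sample lands here, a hedged sketch of what a small two-tier tree looks like, based on my reading of the public Volcano docs and not on the talk. The API version and field names may differ between Volcano releases, and the node and HyperNode names are made up:

```yaml
# Leaf: groups real nodes that sit behind the same switch (tier 1)
apiVersion: topology.volcano.sh/v1alpha1   # assumption: may change between releases
kind: HyperNode
metadata:
  name: s0                 # hypothetical name
spec:
  tier: 1                  # lower tier = closer together
  members:
    - type: Node
      selector:
        exactMatch:
          name: node-0     # hypothetical node name
    - type: Node
      selector:
        exactMatch:
          name: node-1
---
# Parent: groups leaf HyperNodes (tier 2)
apiVersion: topology.volcano.sh/v1alpha1
kind: HyperNode
metadata:
  name: s2
spec:
  tier: 2
  members:
    - type: HyperNode
      selector:
        exactMatch:
          name: s0
```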
|
||||
|
||||
|
||||
## What's next
|
||||
|
||||
- GPU 3D Architectures (Internal interconnects, NUMA, external interconnects)
|
||||
- DRA integration/collaboration
|
||||
- Promotion of HyperNode to a first-class citizen -> Extraction from Volcano to be truly generic
|
||||
79
content/day-2/10_kcpcrossplane.md
Normal file
@@ -0,0 +1,79 @@
|
||||
---
|
||||
title: "Achieving Platform Engineering Multi-Tenancy with kcp and Crossplane"
|
||||
weight: 2
|
||||
tags:
|
||||
- rejekts
|
||||
- kcp
|
||||
- crossplane
|
||||
- kubermatic
|
||||
- upbound
|
||||
---
|
||||
|
||||
<!-- {{% button href="https://youtu.be/rkteV6Mzjfs" style="warning" icon="video" %}}Watch talk on YouTube{{% /button %}} -->
|
||||
<!-- {{% button href="https://docs.google.com/presentation/d/1nEK0CVC_yQgIDqwsdh-PRihB6dc9RyT-" style="tip" icon="person-chalkboard" %}}Slides{{% /button %}} -->
|
||||
{{% button href="https://github.com/SimonTheLeg/crossplane-and-kcp-demo" style="info" icon="code" %}}Code/Demo{{% /button %}}
|
||||
|
||||
An introductory talk on kcp and Crossplane by the companies maintaining them.
|
||||
|
||||
## The basics
|
||||
|
||||
- A platform should be automated and self-service driven to count as platform engineering
|
||||
- Provider teams: Certificates, databases, ...
|
||||
- Consumer teams: Want to use a provided Service
|
||||
- IDP: Sits in the middle -> The real hard part
|
||||
|
||||
## KCP
|
||||
|
||||
- Idea: Why not use Kubernetes as our API layer? It tracks API ownership, versioning and resource management and has built-in extensibility (CRDs)
- Problems:
  - APIs are always cluster-scoped (you advertise them to everyone) -> You could give everyone a cluster
  - Ramping up a new cluster takes time and resources -> Let's just create a lightweight hosted control plane with its own datastore
  - Sharing APIs across multiple clusters is hard -> Lightweight control planes with a shared datastore
- Solution: Workspaces that are organized in a tree; each workspace contains its own CRDs and RBAC -> All resources (e.g. namespaces) exist only in their own workspace
- API sharing: APIExport CRD and APIBinding CRD (references the APIExport via its workspace path)
- Running the operators that work on the APIs: Virtual Workspaces (virtually connect your operator to all of its resources across kcp via a magic kubeconfig) -> The controller needs to be built with multicluster-runtime (a drop-in replacement for controller-runtime)
- The kcp api-syncagent allows you to use an existing operator without modifying it for use with multicluster-runtime
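
A minimal sketch of the export/bind pair, written from the public kcp docs and not copied from the demo. The workspace path `root:db-team`, the export name and the schema name are made up, and newer kcp releases may use a different API version with a slightly different export spec:

```yaml
# Lives in the provider's workspace (e.g. root:db-team)
apiVersion: apis.kcp.io/v1alpha1        # assumption: v1alpha1 shape
kind: APIExport
metadata:
  name: databases.example.io            # hypothetical export name
spec:
  latestResourceSchemas:
    - v1.databases.example.io           # hypothetical APIResourceSchema name
---
# Lives in the consumer's workspace and pulls the API in
apiVersion: apis.kcp.io/v1alpha1
kind: APIBinding
metadata:
  name: databases
spec:
  reference:
    export:
      path: root:db-team                # workspace path of the APIExport
      name: databases.example.io
```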
|
||||
|
||||
```mermaid
|
||||
graph
|
||||
KCP
|
||||
Datastore
|
||||
User
|
||||
subgraph Workspace
|
||||
APIs[API/CRD]
|
||||
RBAC
|
||||
end
|
||||
KCP-->|interact with|Datastore
|
||||
User-->|Create tenant|KCP
|
||||
KCP-->|Manages|Workspace
|
||||
KCP-->|Return kubeconfig|User
|
||||
User-->|Uses KCP like the apiserver|KCP
|
||||
```
|
||||
|
||||
## Crossplane
|
||||
|
||||
- Providers for all kinds of resources (Kubernetes or infra/cloud)
- Compositions for higher-level abstractions across one or multiple providers
- Uses the Kubernetes API (aka CRDs) as its API to enable integration with standardized tooling (like GitOps)
|
||||
|
||||
```yaml
apiVersion: ...
kind: Composition
spec:
  # which composite resource type this Composition serves
  compositeTypeRef:
    apiVersion: my.example/v1alpha1
    kind: Test
  # run a pipeline of composition functions
  mode: Pipeline
  pipeline:
    - ...
```
|
||||
|
||||
## The demo
|
||||
|
||||
I recommend watching the recording, but this shall serve as an overview of the scenario.
|
||||
Or run it locally (code linked above).
|
||||
|
||||
- User wants to order a new database in their workspace a
|
||||
- Database team offers their API through their database workspace
|
||||
- Database team runs their operator in their own cluster
|
||||
- kcp api-syncagent syncs the database CRD from workspace a into the db team's cluster and the connection secrets back to the workspace
|
||||
@@ -17,8 +17,17 @@ I have to admit that I'm very bad with names and don't always recognize people b
|
||||
## Talk recommendations
|
||||
|
||||
- If you're building operators: [Solving Operator Extensibility: A gRPC Plugin Framework for kubernetes](./04_operator-estensibility)
|
||||
- [Intro to both chaos engineering and building operators that interact with containerd in rust](./07-chaosengineering)
|
||||
- The idea behind [The self-improving platform: Closing the Loop Between Telemetry and Tuning](./05_selvimproving) is very interesting, but the first half of the talk is kind of confusing as it discusses a study that could have been shortened drastically. But the way they automatically create PRs for resource utilization is cool
|
||||
- [A good introduction to kcp and crossplane](./10_kcpcrossplane)
|
||||
|
||||
## Other stuff I learned or people I talked to
|
||||
|
||||
- TODO:
|
||||
- Arik about deprecation of CNCF projects
|
||||
- Simon and Koray about demo prep for talks
|
||||
- Arik and Simon about the review process for conference talks
|
||||
- Nico
|
||||
- Stephan
|
||||
- A nice guy whose name I forgot (did I mention that I'm bad with names yet?) about the process of bleaching/dyeing my hair (he asked for a friend)
|
||||
- A group of random people in the elevator about Neon Genesis Evangelion (not a tech topic but hey)
|
||||
- And a bunch of smalltalk and deeptalk with the awesome attendees
|
||||
@@ -8,14 +8,4 @@ TODO:
|
||||
|
||||
## Other stuff I learned or people I talked to
|
||||
|
||||
- Isovalent
|
||||
- Kubermatic
|
||||
- Portworx
|
||||
- Fastly
|
||||
- Syseleven
|
||||
- Netbird
|
||||
- VMware
|
||||
- Stackit
|
||||
- Harness
|
||||
- Mia Platform
|
||||
- and many, many more...
|
||||
- TODO:
|
||||