Day 2 typos

Nicolai Ort 2024-03-26 15:00:48 +01:00
parent e2e3b2fdf3
commit 7b1203c7a3
Signed by: niggl
GPG Key ID: 13AFA55AF62F269F
14 changed files with 187 additions and 141 deletions


@ -36,3 +36,49 @@ multicluster
Statefulset
eBPF
Parca
KubeCon
FinOps
moondream
OLLAMA
LLVA
LLAVA
bokllava
NVLink
CUDA
Space-seperated
KAITO
Hugginface
LLMA
Alluxio
LLMs
onprem
Kube
Kubeflow
Ohly
distroless
init
Distroless
Buildkit
busybox
ECK
Kibana
Dedup
Crossplane
autoprovision
RBAC
Serviceaccount
CVEs
Podman
LinkerD
sidecarless
Kubeproxy
Daemonset
zTunnel
HBONE
Paketo
KORFI
Traefik
traefik
Vercel
Isovalent
CNIs


@ -6,7 +6,7 @@ tags:
- opening
---
The opening keynote started - as is the tradition with keynotes - with a "motivational" opening video.
The keynote itself was presented by the CEO of the CNCF.
## The numbers
@ -17,7 +17,7 @@ The keynote itself was presented by the CEO of the CNCF.
## The highlights
* Everyone uses cloud native
* AI uses Kubernetes b/c the UX is way better than classic tools
* Especially when transferring from dev to prod
* We need standardization
@ -26,10 +26,10 @@ The keynote itself was presented by the CEO of the CNCF.
## Live demo
* KIND cluster on desktop
* Prototype Stack (develop on client)
* Kubernetes with the LLM
* Host with LLAVA (image describe model), moondream and OLLAMA (the model manager/registry)
* Prod Stack (All in kube)
* Kubernetes with LLM, LLAVA, OLLAMA, moondream
* Available Models: LLAVA, mistral, bokllava (LLAVA*mistral)
* Host takes picture, AI describes what is pictured (in our case the conference audience) - a sketch of such an OLLAMA call follows below
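Purely as an illustration of the demo setup (not code from the talk): a minimal Go sketch of how the host app could ask a locally running OLLAMA instance for a description, assuming the default port and Ollama's `/api/generate` endpoint; model name and prompt are made up.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	// Assumed request: the llava model, a hardcoded prompt and no streaming.
	// A real image request would additionally attach base64 data via the "images" field.
	body, _ := json.Marshal(map[string]interface{}{
		"model":  "llava",
		"prompt": "Describe the audience in this picture.",
		"stream": false,
	})

	// OLLAMA listens on port 11434 by default.
	resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// With stream=false the answer arrives as a single JSON object.
	var out struct {
		Response string `json:"response"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	fmt.Println(out.Response)
}
```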


@ -7,7 +7,7 @@ tags:
- panel
---
A podium discussion (somewhat scripted) led by Priyanka
## Guests
@ -17,24 +17,24 @@ A podium discussion (somewhat scripted) lead by Pryanka
## Discussion
* What do you use as the base of dev for OLLAMA
* Jeff: The concepts from Docker, Git, Kubernetes
* How is the balance between AI engineer and AI ops
* Jeff: The classic dev vs ops divide, many ML engineers don't think about it
* Paige: Yessir
* How does infra keep up with the fast research
* Paige: Well, they don't - but they do their best and cloud native is cool
* Jeff: Well we're not Google, but Kubernetes is the savior
* What are scaling constraints
* Jeff: Currently sizing of models is still in its infancy
* Jeff: There will be more specific hardware and someone will have to support it
* Paige: Sizing also depends on latency needs (code autocompletion vs performance optimization)
* Paige: Optimization of smaller models
* What technologies need to be open source licensed
* Jeff: The model b/c access and trust
* Tim: The models and base execution environment -> Vendor agnosticism
* Paige: Yes and remixes are really important for development
* Anything else
* Jeff: How do we bring our awesome tools (monitoring, logging, security) to the new AI world
* Paige: Currently many people just use paid APIs to abstract the infra, but we need this stuff self-hostable
* Tim: I don't want to know about the hardware, the whole infra side should be done by the cloud native teams to let ML engineers just be ML engineers


@ -9,7 +9,7 @@ tags:
Kevin and Sanjay from NVIDIA
## Enabling GPUs in Kubernetes today
* Host level components: Toolkit, drivers
* Kubernetes components: Device plugin, feature discovery, node selector (see the sketch below)
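A minimal sketch of what this looks like from the workload side, assuming the NVIDIA device plugin and GPU feature discovery are installed; the node label key and the image tag are assumptions, not from the talk.

```go
package gpudemo

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// gpuPod builds a pod that asks the device plugin for one GPU and pins itself
// to GPU nodes via a feature-discovery label (label key assumed).
func gpuPod() *corev1.Pod {
	return &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "cuda-smoke-test"},
		Spec: corev1.PodSpec{
			// Node selector based on a label set by GPU feature discovery.
			NodeSelector: map[string]string{"nvidia.com/gpu.present": "true"},
			Containers: []corev1.Container{{
				Name:  "cuda",
				Image: "nvcr.io/nvidia/cuda:12.3.1-base-ubuntu22.04", // assumed image tag
				Resources: corev1.ResourceRequirements{
					// The extended resource exposed by the device plugin.
					Limits: corev1.ResourceList{
						"nvidia.com/gpu": resource.MustParse("1"),
					},
				},
			}},
		},
	}
}
```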
@ -18,24 +18,24 @@ Kevin and Sanjay from NVIDIA
## GPU sharing
* Time slicing: Switch around by time
* Multi Process Service: Always run on the GPU but share it (space-wise)
* Multi Instance GPU: Space-separated sharing on the hardware
* Virtual GPU: Virtualizes time slicing or MIG
* CUDA Streams: Run multiple kernels in a single app
## Dynamic resource allocation
* A new alpha feature since Kube 1.26 for dynamic resource requesting
* You just request a resource via the API and have fun
* The sharing itself is an implementation detail
## GPU scale-out challenges
* NVIDIA Picasso is a foundry for model creation powered by Kubernetes
* The workload is the training workload split into batches
* Challenge: Schedule multiple training jobs by different users that are prioritized
### Topology aware placements
* You need thousands of GPUs, a typical node has 8 GPUs with fast NVLink communication - beyond that switching
* Target: optimize related jobs based on GPU node distance and NUMA placement
@ -44,11 +44,11 @@ Kevin and Sanjay from NVIDIA
* Stuff can break, resulting in slowdowns or errors
* Challenge: Detect faults and handle them
* Observability both in-band and out of band that exposes node conditions in Kubernetes
* Needed: Automated fault-tolerant scheduling
### Multidimensional optimization
* There are different KPIs: starvation, priority, occupancy, fairness
* Challenge: What to choose (the multidimensional decision problem)
* Needed: A scheduler that can balance the dimensions


@ -15,11 +15,11 @@ Jorge Palma from Microsoft with a quick introduction.
* Containerized models
* GPUs in the cluster (install, management)
## Kubernetes AI Toolchain (KAITO)
* Kubernetes operator that interacts with
* Node provisioner
* Deployment
* Simple CRD that describes a model and infra - and have fun
* Creates inference endpoint
* Currently 10 models are supported (Hugging Face, LLaMA, etc.)


@ -6,14 +6,14 @@ tags:
- panel
---
A panel discussion with moderation by Google and participants from Google, Alluxio, Ampere and CERN.
It was pretty scripted with prepared (sponsor specific) slides for each question answered.
## Takeaways
* Deploying an ML model should become the new "deploy a web app"
* The hardware should be fully utilized -> Better resource sharing and scheduling
* Smaller LLMs on CPU only are pretty cost-efficient
* Better scheduling by splitting into storage + CPU (prepare) and GPU (run) nodes to create a just-in-time flow
* Software acceleration is cool, but we should use more specialized hardware and models to run on CPUs
* We should be flexible regarding hardware, multi-cluster workloads and hybrid (onprem, burst to cloud) workloads


@ -5,41 +5,41 @@ tags:
- keynote
---
Nikhita presented projects that merge cloud native and AI.
Patrick Ohly joined for DRA.
### The "news"
* New work group AI
* More tools are including AI features
* New updated CNCF for children feat. AI
* One decade of Kubernetes
* DRA is in alpha
### DRA
* A new API for resources (node-local and node-attached)
* Sharing of resources between pods and containers
* Vendor specific stuff is abstracted by a vendor driver controller
* The kube scheduler can interact with the vendor parameters for scheduling and autoscaling
### Cloud native AI ecosystem
* Kube is the seed for the AI infra plant
* Kubeflow users wanted AI registries
* LLM on the edge
* OpenTelemetry brings semantics
* All of these tools form a symbiosis between
* Topics of discussion
### The working group AI
* It was formed in October 2023
* They are working on the white paper (cloud native and AI) which was published on 19.03.2024
* The landscape "cloud native and AI" is WIP and will be merged into the main CNCF landscape
* The future focus will be on security and cost efficiency (with a hint of sustainability)
### LFAI and CNCF
* The director of the AI foundation talks about AI and cloud native
* They are looking forward to more collaboration


@ -14,7 +14,7 @@ The entire talk was very short, but it was a nice demo of init containers
* Security is hard - distroless sounds like a nice helper
* Basic Challenge: Usability-Security Dilemma -> But more usability doesn't mean less secure, but more updating
* Distro: Kernel + Software Packages + Package manager (optional) -> In Containers just without the kernel
* Distroless: No package manager, no shell, no web client (curl/wget) - only minimal software bundles
## Tools for distroless image creation
@ -29,13 +29,13 @@ The entire talk was very short, but it was a nice demo of init containers
## Demo
* A (rough) distroless Postgres with Alpine build step and scratch final step
* A basic pg:alpine container used for init with a shared data volume
* The init uses the pg admin user to initialize the pg server (you don't need the admin credentials after this)
### Kube
* kubectl apply failed b/c no internet, but was fixed by connecting to Wi-Fi
* Without the init container the pod just crashes, with the init container the correct config gets created
### Docker compose


@ -13,63 +13,63 @@ A talk by elastic.
## About Elastic
* Elastic Cloud as a managed service
* Deployed across AWS/GCP/Azure in over 50 regions
* 600,000+ containers
### Elastic and Kube
* They offer Elastic observability
* They offer the ECK operator for simplified deployments
## The baseline
* Goal: A large-scale (1M+ containers) resilient platform on k8s
* Architecture
* Global Control: The control plane (API) for users with controllers
* Regional Apps: The "shitload" of Kubernetes clusters where the actual customer services live
## Scalability
* Challenge: How large can our cluster be, how many clusters do we need
* Problem: Only basic guidelines exist for that
* Decision: Horizontally scale the number of clusters (500-1K nodes each)
* Decision: Disposable clusters
* Throw away without data loss
* Single source of truth is not cluster etcd but external -> No etcd backups needed
* Everything can be recreated any time
## Controllers
{{% notice style="note" %}}
I won't copy the explanations of operators/controllers in these notes
{{% /notice %}}
* Many controllers, including (but not limited to)
* Cluster controller: Register cluster to controller
* Project controller: Schedule user's project to cluster
* Product controllers (Elasticsearch, Kibana, etc.)
* Ingress/Cert manager
* Sometimes controllers depend on controllers -> potential complexity
* Pro:
* Resilient (Self-healing)
* Level triggered (desired state vs procedure triggered)
* Simple reasoning when comparing desired state vs state machine
* Official controller runtime lib
* Workqueue: Automatic dedup, retry backoff and so on (see the sketch below)
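A minimal sketch of that work-queue behaviour using client-go (illustrative, not Elastic's code): identical keys are deduplicated while queued, and failed items are retried with per-item backoff.

```go
package main

import (
	"fmt"

	"k8s.io/client-go/util/workqueue"
)

func main() {
	q := workqueue.NewRateLimitingQueue(workqueue.DefaultControllerRateLimiter())
	defer q.ShutDown()

	// Adding the same key twice while it is still queued yields a single item.
	q.Add("project/foo")
	q.Add("project/foo")
	fmt.Println("queue length:", q.Len()) // 1

	key, _ := q.Get()
	if err := reconcile(key.(string)); err != nil {
		// Re-queue with exponential backoff instead of retrying immediately.
		q.AddRateLimited(key)
	} else {
		// Success: reset the backoff counter for this key.
		q.Forget(key)
	}
	q.Done(key)
}

func reconcile(key string) error {
	fmt.Println("reconciling", key)
	return nil
}
```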
## Global Controllers
* Basic operation
* Uses project config from Elastic cloud as the desired state
* The actual state is a k8s resource in another cluster
* Challenge: Where is the source of truth if the data is not stored in etcd
* Solution: External data store (Postgres)
* Challenge: How do we sync the db sources to Kubernetes
* Potential solutions: Replace etcd with the external db
* Chosen solution:
* The controllers don't use CRDs for storage, but they expose a web API
* Reconciliation now interacts with the external db and Go channels (queue) instead
* Then the CRs for the operators get created by the global controller
### Large scale
@ -82,10 +82,10 @@ I won't copy the explanations of operators/controllers in this notes
### Reconcile
* User-driven events are processed asap
* Reconcile of everything should happen, but with low priority slowly in the background
* Solution: Status: LastReconciledRevision (timestamp) gets compared to revision, if larger -> User change
* Prioritization: Just a custom event handler with the normal queue and a low priority
* Low-priority queue: Just a queue that adds items to the normal work-queue with a rate limit (see the sketch below)
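A rough sketch of that prioritization idea (assumed names and rate, not their actual code): user-driven keys go straight into the normal work-queue, while a background resync trickles keys in through a rate limiter so it never starves user changes.

```go
package controller

import (
	"context"

	"golang.org/x/time/rate"
	"k8s.io/client-go/util/workqueue"
)

// feedBackgroundResync slowly enqueues every known key for the periodic
// "reconcile everything" pass; the limit of 5 keys/s is an assumed value.
func feedBackgroundResync(ctx context.Context, q workqueue.Interface, allKeys []string) {
	limiter := rate.NewLimiter(rate.Limit(5), 1)
	for _, key := range allKeys {
		if err := limiter.Wait(ctx); err != nil {
			return // context cancelled, stop feeding
		}
		q.Add(key) // dedup in the work-queue keeps repeated adds cheap
	}
}

// onUserChange enqueues user-driven events without any rate limit.
func onUserChange(q workqueue.Interface, key string) {
	q.Add(key)
}
```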
```mermaid
flowchart LR


@ -6,39 +6,39 @@ tags:
- security
---
A talk by Google and Microsoft with the premise of better auth in k8s.
## Baselines
* Most access controllers have read access to all secrets -> They are not really designed for keeping these secrets
* Result: CVEs
* Example: Just use ingress, nginx, put in some Lua code in the config and et voilà: Service account token
* Fix: No more fun
## Basic solutions
* Separate control (the controller) from data (the ingress)
* Namespace limited ingress
## Current state of cross namespace stuff
* Why: Reference TLS cert for gateway API in the cert team's namespace
* Why: Move all ingress configs to one namespace
* Classic Solution: Annotations in Contour that reference a namespace that contains all certs (rewrites secret to certs/secret)
* Gateway Solution:
* Gateway TLS secret ref includes a namespace
* ReferenceGrant pretty much allows referencing from X (Gateway) to Y (Secret) - see the sketch below
* Limits:
* Has to be implemented via controllers
* The controllers still have read-all - they just check if they are supposed to do this
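For reference, a hedged sketch of such a ReferenceGrant using the Gateway API Go types: the namespace holding the certs explicitly allows Gateways from another namespace to reference its Secrets. The namespace names are made up.

```go
package refgrants

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	gatewayv1beta1 "sigs.k8s.io/gateway-api/apis/v1beta1"
)

// allowGatewayCertRefs lives in the cert team's namespace and grants Gateways
// in "edge-team" permission to reference Secrets stored here.
func allowGatewayCertRefs() *gatewayv1beta1.ReferenceGrant {
	return &gatewayv1beta1.ReferenceGrant{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "allow-gateway-cert-refs",
			Namespace: "cert-team", // namespace that owns the Secrets
		},
		Spec: gatewayv1beta1.ReferenceGrantSpec{
			From: []gatewayv1beta1.ReferenceGrantFrom{{
				Group:     "gateway.networking.k8s.io",
				Kind:      "Gateway",
				Namespace: "edge-team", // where the Gateways live
			}},
			To: []gatewayv1beta1.ReferenceGrantTo{{
				Group: "", // core group
				Kind:  "Secret",
			}},
		},
	}
}
```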
## Goals
### Global
* Grant controllers access to only the resources relevant for them (using references and maybe class segmentation)
* Allow for safe cross namespace references
* Make it easy for API devs to adopt it
### Personas
@ -50,20 +50,20 @@ A talk by Google and Microsoft with the premise of bether auth in k8s.
* Alex: Define relationships via ReferencePatterns
* Kai: Specify controller identity (ServiceAccount), define relationship API
* Rohan: Define cross namespace references (aka resource grants that allow access to their resources)
## Result of the paper
### Architecture
* ReferencePattern: Where do I find the references -> example: GatewayClass in the gateway API
* ReferenceConsumer: Who (Identity) has access under which conditions?
* ReferenceGrant: Allow specific references
### POC
* Minimum access: You only get access if the grant is there AND the reference actually exists
* Their basic implementation works with the kube API
### Open questions
@ -74,9 +74,9 @@ A talk by Google and Microsoft with the premise of bether auth in k8s.
## Alternative
* Idea: Just extend RBAC Roles with a selector (match labels, etc.)
* Problems:
* Requires changes to Kubernetes core auth
* Everything but list and watch is a pain
* How do you handle AND vs OR selection
* Field selectors: They exist
@ -84,5 +84,5 @@ A talk by Google and Microsoft with the premise of bether auth in k8s.
## Meanwhile
* Prefer tools that support isolation between controller and data plane
* Disable all non-needed features -> Especially scripting


@ -6,32 +6,32 @@ tags:
- dx
---
A talk by UX and software people at Red Hat (Podman team).
The talk mainly followed the academic study process (aka this is the survey I did for my bachelor's/master's thesis).
## Research
* User research study including 11 devs and platform engineers over three months
* Focus was on a new Podman Desktop feature
* Experience range: 2-3 years on average (from no experience to old-school kube)
* 16 questions regarding environment, workflow, debugging and pain points
* Analysis: Affinity mapping
## Findings
* Where do I start when things are broken? -> There may be solutions, but devs don't know about them
* Network debugging is hard b/c many layers and problems occurring in between CNI and infra are really hard -> Network topology issues are rare but hard
* YAML indentation -> Tool support is needed for visualization
* YAML validation -> Just use validation in dev and GitOps
* YAML Cleanup -> Normalize YAML (order, anchors, etc.) for easy diff (see the sketch after this list)
* Inadequate security analysis (too verbose, non-issues are warnings) -> Real-time insights (and during dev)
* Crash Loop -> Identify stuck containers, simple debug containers
* CLI vs GUI -> Enable experience-level-oriented GUI, enhance in-time troubleshooting
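A small sketch of the YAML-normalization idea (my own illustration, not from the talk): round-tripping a manifest through sigs.k8s.io/yaml yields alphabetically sorted keys and consistent formatting, which makes plain text diffs usable. Anchors and comments are intentionally dropped, and this only handles a single-document mapping.

```go
package normalize

import "sigs.k8s.io/yaml"

// Normalize re-serializes a single YAML document with stable, sorted keys so
// that two manifests can be compared with a plain diff.
func Normalize(manifest []byte) ([]byte, error) {
	var doc map[string]interface{}
	if err := yaml.Unmarshal(manifest, &doc); err != nil {
		return nil, err
	}
	// sigs.k8s.io/yaml round-trips through encoding/json, which sorts map keys,
	// so the output ordering no longer depends on the input ordering.
	return yaml.Marshal(doc)
}
```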
## General issues
* No direct fs access
* Multiple kubeconfigs
* SaaS is sometimes only provided on kube, which sounds like complexity
* Where do I begin my troubleshooting
* Interoperability/Fragility with updates


@ -6,11 +6,11 @@ tags:
- network
---
Global field CTO at Solo.io with a hint of service mesh background.
## History
* LinkerD 1.X was the first modern service mesh and basically an opt-in service proxy
* Challenges: JVM (size), latencies, ...
### Why not node-proxy?
@ -23,8 +23,8 @@ Global field CTO at Solo.io with a hint of servicemesh background.
### Why sidecar?
* Transparent (ish)
* Part of app lifecycle (up/down)
* Single tenant
* No noisy neighbor
### Sidecar drawbacks
@ -46,7 +46,7 @@ Global field CTO at Solo.io with a hint of servicemesh background.
* Full transparency
* Optimized networking
* Lower resource allocation
* No race conditions
* No manual pod injection
* No credentials in the app
@ -68,12 +68,12 @@ Global field CTO at Solo.io with a hint of servicemesh background.
* Kubeproxy replacement
* Ingress (via Gateway API)
* Mutual Authentication
* Specialized CiliumNetworkPolicy
* Configure Envoy through Cilium
### Control Plane
* Cilium-Agent on each node that reacts to scheduled workloads by programming the local data plane
* API via Gateway API and CiliumNetworkPolicy
```mermaid
@ -98,29 +98,29 @@ flowchart TD
### Data plane
* Configured by control plane
* Does all the eBPF things in L4
* Does all the Envoy things in L7
* In-kernel WireGuard for optional transparent encryption
### mTLS
* Network Policies get applied at the eBPF layer (check if ID 1 can talk to ID 2)
* When mTLS is enabled there is an auth check in advance -> If it fails, proceed with agents
* Agents talk to each other for mTLS auth and save the result to a cache -> Now eBPF can say yes
* Problems: The caches can lead to ID confusion
## Istio
### Basics
* L4/7 Service mesh without its own CNI
* Based on Envoy
* mTLS
* Classically via sidecar, nowadays also ambient
### Ambient mode
* Separate L4 and L7 -> Can run on Cilium
* mTLS
* Gateway API
@ -143,14 +143,14 @@ flowchart TD
```
* Central xDS Control Plane
* Per-node data plane that reads updates from the Control Plane
### Data Plane
* L4 runs via zTunnel Daemonset that handles mTLS
* The zTunnel traffic gets handed over to the CNI
* L7 Proxy lives somewhere™ and traffic gets routed through it as an "extra hop" aka waypoint
### mTLS
* The zTunnel creates an HBONE (HTTP overlay network) tunnel with mTLS


@ -8,17 +8,17 @@ Who have I talked to today, are there any follow-ups or learnings?
## Operator Framework
* We talked about the Operator Lifecycle Manager
* They shared the roadmap and the new release 1.0 will bring support for Operator Bundle loading from any OCI source (no more public-registry enforcement)
## Flux
* We talked about automatic helm release updates [lessons learned from flux](/lessons_learned/02_flux)
## Cloud Foundry/Paketo
* We mostly had some small talk
* There will be a Cloud Foundry Day in Karlsruhe in October, they'd be happy to have us there
* The whole KORFI (Cloud Foundry on Kubernetes) project is still going strong, but no release candidate yet (or in the near future)
## Traefik
@ -31,7 +31,7 @@ They will follow up
## Postman
* I asked them about their new cloud-only stuff: They will keep their direction
* They are also planning to work on info materials on why Postman SaaS is not a big security risk
## Mattermost
@ -39,9 +39,9 @@ They will follow up
I should follow up
{{% /notice %}}
* I talked about our problems with the Mattermost operator and was asked to get back to them with the errors
* They're currently migrating the Mattermost cloud offering to ARM - therefore ARM support will be coming in the next months
* The Mattermost guy had exactly the same problems with notifications and read/unread using Element
## Vercel
@ -53,7 +53,7 @@ I should follow up
* The paid Renovate offering now includes build failure estimation
* I was told not to buy it after telling the technical guy that we just use build pipelines as MR verification
### Cert-manager
* The best swag (judged by coolness points)
@ -63,11 +63,11 @@ I should follow up
They will follow up with a quick demo
{{% /notice %}}
* A Kubernetes security/runtime security solution with pretty nice looking urgency filters
* Includes eBPF to see what code actually runs
* I'll witness a demo in early/mid April
### Isovalent
* Dinner (very tasty)
* Cilium still sounds like the way to go in regard to CNIs


@ -5,7 +5,7 @@ weight: 2
---
Day two is also the official day one of KubeCon (Day one was just CloudNativeCon).
This is where all the people joined (over 12,000).
The opening keynotes were a mix of talks and panel discussions.
The main topic was - who could have guessed - AI and ML.