From 7b1203c7a3e4dd1b79bd0b7a169e73c046de0a4f Mon Sep 17 00:00:00 2001 From: Nicolai Ort Date: Tue, 26 Mar 2024 15:00:48 +0100 Subject: [PATCH] Day 2 typos --- .vscode/ltex.dictionary.en-US.txt | 46 ++++++++++++++++++ content/day2/01_opening.md | 12 ++--- content/day2/02_ai_keynote.md | 24 +++++----- content/day2/03_accelerating_ai_workloads.md | 22 ++++----- content/day2/04_sponsored_ai_platform.md | 8 ++-- content/day2/05_performance_sustainability.md | 12 ++--- content/day2/06_newsshow_ai_edition.md | 24 +++++----- content/day2/07_is_your_image_distroless.md | 8 ++-- content/day2/08_multicloud_saas.md | 48 +++++++++---------- content/day2/09_safety_usability_auth.md | 28 +++++------ content/day2/10_dev_ux.md | 22 ++++----- content/day2/11_sidecarless.md | 48 +++++++++---------- content/day2/99_networking.md | 24 +++++----- content/day2/_index.md | 2 +- 14 files changed, 187 insertions(+), 141 deletions(-) diff --git a/.vscode/ltex.dictionary.en-US.txt b/.vscode/ltex.dictionary.en-US.txt index 1640dde..3f86cb2 100644 --- a/.vscode/ltex.dictionary.en-US.txt +++ b/.vscode/ltex.dictionary.en-US.txt @@ -36,3 +36,49 @@ multicluster Statefulset eBPF Parca +KubeCon +FinOps +moondream +OLLAMA +LLVA +LLAVA +bokllava +NVLink +CUDA +Space-seperated +KAITO +Hugginface +LLMA +Alluxio +LLMs +onprem +Kube +Kubeflow +Ohly +distroless +init +Distroless +Buildkit +busybox +ECK +Kibana +Dedup +Crossplane +autoprovision +RBAC +Serviceaccount +CVEs +Podman +LinkerD +sidecarless +Kubeproxy +Daemonset +zTunnel +HBONE +Paketo +KORFI +Traefik +traefik +Vercel +Isovalent +CNIs diff --git a/content/day2/01_opening.md b/content/day2/01_opening.md index f5a27bf..9d2af7f 100644 --- a/content/day2/01_opening.md +++ b/content/day2/01_opening.md @@ -6,7 +6,7 @@ tags: - opening --- -The opening keynote started - as is the tradition with keynotes - with an "motivational" opening video. +The opening keynote started - as is the tradition with keynotes - with a "motivational" opening video. The keynote itself was presented by the CEO of the CNCF. ## The numbers @@ -17,7 +17,7 @@ The keynote itself was presented by the CEO of the CNCF. ## The highlights -* Everyone uses cloudnative +* Everyone uses cloud native * AI uses Kubernetes b/c the UX is way better than classic tools * Especially when transferring from dev to prod * We need standardization @@ -26,10 +26,10 @@ The keynote itself was presented by the CEO of the CNCF. 
 ## Live demo
 
 * KIND cluster on desktop
-* Protptype Stack (develop on client)
+* Prototype Stack (develop on client)
 * Kubernetes with the LLM
-  * Host with LLVA (image describe model), moondream and OLLAMA (the model manager/registry()
+  * Host with LLAVA (image describe model), moondream and OLLAMA (the model manager/registry)
 * Prod Stack (All in kube)
   * Kubernetes with LLM, LLVA, OLLAMA, moondream
-* Available Models: llava, mistral bokllava (llava*mistral)
-* Host takes picture, ai describes what is pictures (in our case the conference audience)
+* Available Models: LLAVA, mistral bokllava (LLAVA*mistral)
+* Host takes picture, AI describes what is pictured (in our case the conference audience)
diff --git a/content/day2/02_ai_keynote.md b/content/day2/02_ai_keynote.md
index 1a15cd1..1b5595b 100644
--- a/content/day2/02_ai_keynote.md
+++ b/content/day2/02_ai_keynote.md
@@ -7,7 +7,7 @@ tags:
   - panel
 ---
 
-A podium discussion (somewhat scripted) lead by Pryanka
+A podium discussion (somewhat scripted) led by Priyanka
 
 ## Guests
 
@@ -17,24 +17,24 @@ A podium discussion (somewhat scripted) lead by Pryanka
 
 ## Discussion
 
-* What do you use as the base of dev for ollama
-  * Jeff: The concepts from docker, git, kubernetes
-* How is the balance between ai engi and ai ops
-  * Jeff: The classic dev vs ops devide, many ML-Engi don't think about
+* What do you use as the base of dev for OLLAMA
+  * Jeff: The concepts from docker, git, Kubernetes
+* How is the balance between AI engineers and AI ops
+  * Jeff: The classic dev vs ops divide, many ML-Engineers don't think about
   * Paige: Yessir
 * How does infra keep up with the fast research
-  * Paige: Well, they don't - but they do their best and Cloudnative is cool
-  * Jeff: Well we're not google, but kubernetes is the saviour
+  * Paige: Well, they don't - but they do their best and cloud native is cool
+  * Jeff: Well we're not Google, but Kubernetes is the savior
 * What are scaling constraints
-  * Jeff: Currently sizing of models is still in it's infancy
+  * Jeff: Currently sizing of models is still in its infancy
   * Jeff: There will be more specific hardware and someone will have to support it
   * Paige: Sizing also depends on latency needs (code autocompletion vs performance optimization)
   * Paige: Optimization of smaller models
 * What technologies need to be open source licensed
   * Jeff: The model b/c access and trust
-  * Tim: The models and base execution environemtn -> Vendor agnosticism
-  * Paige: Yes and remixes are really imporant for development
+  * Tim: The models and base execution environment -> Vendor agnosticism
+  * Paige: Yes and remixes are really important for development
 * Anything else
   * Jeff: How do we bring our awesome tools (monitoring, logging, security) to the new AI world
-  * Paige: Currently many people just use paid apis to abstract the infra, but we need this stuff selfhostable
-  * Tim: I don'T want to know about the hardware, the whole infra side should be done by the cloudnative teams to let ML-Engi to just be ML-Engine
+  * Paige: Currently many people just use paid APIs to abstract the infra, but we need this stuff self-hostable
+  * Tim: I don't want to know about the hardware, the whole infra side should be done by the cloud native teams to let ML-Engineers just be ML-Engineers
diff --git a/content/day2/03_accelerating_ai_workloads.md b/content/day2/03_accelerating_ai_workloads.md
index f8cd38e..c423200 100644
--- a/content/day2/03_accelerating_ai_workloads.md
+++ b/content/day2/03_accelerating_ai_workloads.md
@@ -9,7 +9,7 @@
tags: Kevin and Sanjay from NVIDIA -## Enabeling GPUs in Kubernetes today +## Enabling GPUs in Kubernetes today * Host level components: Toolkit, drivers * Kubernetes components: Device plugin, feature discovery, node selector @@ -18,24 +18,24 @@ Kevin and Sanjay from NVIDIA ## GPU sharing * Time slicing: Switch around by time -* Multi Process Service: Run allways on the GPU but share (space-) +* Multi Process Service: Always run on the GPU but share (space-) * Multi Instance GPU: Space-seperated sharing on the hardware -* Virtual GPU: Virtualices Time slicing or MIG +* Virtual GPU: Virtualizes Time slicing or MIG * CUDA Streams: Run multiple kernels in a single app ## Dynamic resource allocation -* A new alpha feature since Kube 1.26 for dynamic ressource requesting -* You just request a ressource via the API and have fun +* A new alpha feature since Kube 1.26 for dynamic resource requesting +* You just request a resource via the API and have fun * The sharing itself is an implementation detail -## GPU scale out challenges +## GPU scale-out challenges * NVIDIA Picasso is a foundry for model creation powered by Kubernetes * The workload is the training workload split into batches * Challenge: Schedule multiple training jobs by different users that are prioritized -### Topology aware placments +### Topology aware placements * You need thousands of GPUs, a typical Node has 8 GPUs with fast NVLink communication - beyond that switching * Target: optimize related jobs based on GPU node distance and NUMA placement @@ -44,11 +44,11 @@ Kevin and Sanjay from NVIDIA * Stuff can break, resulting in slowdowns or errors * Challenge: Detect faults and handle them -* Observability both in-band and out ouf band that expose node conditions in kubernetes +* Observability both in-band and out of band that expose node conditions in Kubernetes * Needed: Automated fault-tolerant scheduling -### Multi-dimensional optimization +### Multidimensional optimization -* There are different KPIs: starvation, prioprity, occupanccy, fainrness -* Challenge: What to choose (the multi-dimensional decision problemn) +* There are different KPIs: starvation, priority, occupancy, fairness +* Challenge: What to choose (the multidimensional decision problem) * Needed: A scheduler that can balance the dimensions diff --git a/content/day2/04_sponsored_ai_platform.md b/content/day2/04_sponsored_ai_platform.md index eb2ff07..02d16e3 100644 --- a/content/day2/04_sponsored_ai_platform.md +++ b/content/day2/04_sponsored_ai_platform.md @@ -15,11 +15,11 @@ Jorge Palma from Microsoft with a quick introduction. * Containerized models * GPUs in the cluster (install, management) -## Kubernetes AI Toolchain (KAITO) +## Kubernetes AI Tool chain (KAITO) * Kubernetes operator that interacts with * Node provisioner * Deployment -* Simple CRD that decribes a model, infra and have fun -* Creates inferance endpoint -* Models are currently 10 (Hugginface, LLMA, etc) +* Simple CRD that describes a model, infra and have fun +* Creates inference endpoint +* Models are currently 10 (Hugginface, LLMA, etc.) diff --git a/content/day2/05_performance_sustainability.md b/content/day2/05_performance_sustainability.md index b92d13b..3c54e47 100644 --- a/content/day2/05_performance_sustainability.md +++ b/content/day2/05_performance_sustainability.md @@ -6,14 +6,14 @@ tags: - panel --- -A panel discussion with moderation by Google and participants from Google, Alluxio, Apmpere and CERN. 
+A panel discussion with moderation by Google and participants from Google, Alluxio, Ampere and CERN.
 It was pretty scripted with prepared (sponsor specific) slides for each question answered.
 
 ## Takeaways
 
-* Deploying a ML should become the new deploy a web app
-* The hardware should be fully utilized -> Better ressource sharing and scheduling
-* Smaller LLMs on cpu only is preyy cost efficient
-* Better scheduling by splitting into storage + cpu (prepare) and gpu (run) nodes to create a just-in-time flow
+* Deploying an ML model should become the new "deploy a web app"
+* The hardware should be fully utilized -> Better resource sharing and scheduling
+* Smaller LLMs on CPU only are pretty cost-efficient
+* Better scheduling by splitting into storage + CPU (prepare) and GPU (run) nodes to create a just-in-time flow
 * Software acceleration is cool, but we should use more specialized hardware and models to run on CPUs
-* We should be flexible regarding hardware, multi-cluster workloads and hybrig (onprem, burst to cloud) workloads
+* We should be flexible regarding hardware, multi-cluster workloads and hybrid (onprem, burst to cloud) workloads
diff --git a/content/day2/06_newsshow_ai_edition.md b/content/day2/06_newsshow_ai_edition.md
index 3fa9214..b529748 100644
--- a/content/day2/06_newsshow_ai_edition.md
+++ b/content/day2/06_newsshow_ai_edition.md
@@ -5,41 +5,41 @@ tags:
   - keynote
 ---
 
-Nikhita presented projects that merge CloudNative and AI.
-PAtrick Ohly Joined for DRA
+Nikhita presented projects that merge cloud native and AI.
+Patrick Ohly joined for DRA
 
 ### The "news"
 
 * New work group AI
-* More tools are including ai features
-* New updated cncf for children feat AI
+* More tools are including AI features
+* New updated CNCF for children feat AI
 * One decade of Kubernetes
 * DRA is in alpha
 
 ### DRA
 
 * A new API for resources (node-local and node-attached)
-* Sharing of ressources between cods and containers
+* Sharing of resources between pods and containers
 * Vendor specific stuff are abstracted by a vendor driver controller
 * The kube scheduler can interact with the vendor parameters for scheduling and autoscaling
 
-### Cloudnative AI ecosystem
+### Cloud native AI ecosystem
 
 * Kube is the seed for the AI infra plant
 * Kubeflow users wanted AI registries
 * LLM on the edge
-* Opentelemetry bring semandtics
+* OpenTelemetry brings semantics
 * All of these tools form a symbiosis between
 * Topics of discussions
 
 ### The working group AI
 
-* It was formed in october 2023
-* They are working on the whitepaper (cloudnative and ai) wich was opublished on 19.03.2024
-* The landscape "cloudnative and ai" is WIP and will be merged into the main CNCF landscape
+* It was formed in October 2023
+* They are working on the white paper (cloud native and AI) which was published on 19.03.2024
+* The landscape "cloud native and AI" is WIP and will be merged into the main CNCF landscape
 * The future focus will be on security and cost efficiency (with a hint of sustainability)
 
 ### LFAI and CNCF
 
-* The direcor of the AI foundation talks abouzt ai and cloudnative
-* They are looking forward to more colaboraion
+* The director of the AI foundation talks about AI and cloud native
+* They are looking forward to more collaboration
diff --git a/content/day2/07_is_your_image_distroless.md b/content/day2/07_is_your_image_distroless.md
index 4f0c219..95ae87c 100644
--- a/content/day2/07_is_your_image_distroless.md
+++ b/content/day2/07_is_your_image_distroless.md
@@ -14,7 +14,7 @@ The entire talk was very short, but it
was a nice demo of init containers
 
 * Security is hard - distroless sounds like a nice helper
 * Basic Challenge: Usability-Security Dilemma -> But more usability doesn't mean less secure, but more updating
 * Distro: Kernel + Software Packages + Package manager (optional) -> In Containers just without the kernel
-* Distroless: No package manager, no shell, no webcluent (curl/wget) - only minimal sofware bundels
+* Distroless: No package manager, no shell, no web client (curl/wget) - only minimal software bundles
 
 ## Tools for distroless image creation
 
@@ -29,13 +29,13 @@ The entire talk was very short, but it was a nice demo of init containers
 
 ## Demo
 
-* A (rough) distroless postgres with alpine build step and scratch final step
+* A (rough) distroless Postgres with alpine build step and scratch final step
 * A basic pg:alpine container used for init with a shared data volume
-* The init uses the pg admin user to initialize the pg server (you don't need the admin creds after this)
+* The init uses the pg admin user to initialize the pg server (you don't need the admin credentials after this)
 
 ### Kube
 
-* K apply failed b/c no internet, but was fixed by connecting to wifi
+* K apply failed b/c no internet, but was fixed by connecting to Wi-Fi
 * Without the init container the pod just crashes, with the init container the correct config gets created
 
 ### Docker compose
diff --git a/content/day2/08_multicloud_saas.md b/content/day2/08_multicloud_saas.md
index 91c64bb..cf21743 100644
--- a/content/day2/08_multicloud_saas.md
+++ b/content/day2/08_multicloud_saas.md
@@ -13,63 +13,63 @@ A talk by elastic.
 
 ## About elastic
 
-* Elestic cloud as a managed service
+* Elastic cloud as a managed service
 * Deployed across AWS/GCP/Azure in over 50 regions
-* 600.000+ Containers
+* 600000+ Containers
 
 ### Elastic and Kube
 
-* They offer elastic obervability
+* They offer elastic observability
 * They offer the ECK operator for simplified deployments
 
 ## The baseline
 
-* Goal: A large scale (1M+ containers resilient platform on k8s
+* Goal: A large scale (1M+ containers) resilient platform on k8s
 * Architecture
-  * Global Control: The control plane (api) for users with controllers
-  * Regional Apps: The "shitload" of kubernetes clusters where the actual customer services live
+  * Global Control: The control plane (API) for users with controllers
+  * Regional Apps: The "shitload" of Kubernetes clusters where the actual customer services live
 
 ## Scalability
 
 * Challenge: How large can our cluster be, how many clusters do we need
 * Problem: Only basic guidelines exist for that
-* Decision: Horizontaly scale the number of clusters (5ßß-1K nodes each)
+* Decision: Horizontally scale the number of clusters (500-1K nodes each)
 * Decision: Disposable clusters
   * Throw away without data loss
-  * Single source of throuth is not cluster etcd but external -> No etcd backups needed
+  * Single source of truth is not cluster etcd but external -> No etcd backups needed
   * Everything can be recreated any time
 
 ## Controllers
 
 {{% notice style="note" %}}
-I won't copy the explanations of operators/controllers in this notes
+I won't copy the explanations of operators/controllers in these notes
 {{% /notice %}}
 
-* Many different controllers, including (but not limited to)
-  * cluster controler: Register cluster to controller
+* Many controllers, including (but not limited to)
+  * cluster controller: Register cluster to controller
   * Project controller: Schedule user's project to cluster
   * Product controllers (Elasticsearch, Kibana, etc.)
-  * Ingress/Certmanager
+  * Ingress/Cert manager
 * Sometimes controllers depend on controllers -> potential complexity
 * Pro:
-  * Resilient (Selfhealing)
+  * Resilient (Self-healing)
   * Level triggered (desired state vs procedure triggered)
   * Simple reasoning when comparing desired state vs state machine
   * Official controller runtime lib
-* Workque: Automatic Dedup, Retry backoff and so on
+* Workqueue: Automatic Dedup, Retry backoff and so on
 
 ## Global Controllers
 
 * Basic operation
   * Uses project config from Elastic cloud as the desired state
-  * The actual state is a k9s ressource in another cluster
-* Challenge: Where is the source of thruth if the data is not stored in etc
-* Solution: External datastore (postgres)
-* Challenge: How do we sync the db sources to kubernetes
+  * The actual state is a k8s resource in another cluster
+* Challenge: Where is the source of truth if the data is not stored in etcd
+* Solution: External data store (Postgres)
+* Challenge: How do we sync the db sources to Kubernetes
 * Potential solutions: Replace etcd with the external db
 * Chosen solution:
-  * The controllers don't use CRDs for storage, but they expose a webapi
-  * Reconciliation still now interacts with the external db and go channels (que) instead
+  * The controllers don't use CRDs for storage, but they expose a web-API
+  * Reconciliation now interacts with the external db and go channels (queue) instead
   * Then the CRs for the operators get created by the global controller
 
 ### Large scale
@@ -82,10 +82,10 @@ I won't copy the explanations of operators/controllers in this notes
 ### Reconcile
 
 * User-driven events are processed asap
-* reconcole of everything should happen, bus with low prio slowly in the background
-* Solution: Status: LastReconciledRevision (timestamp) get's compare to revision, if larger -> User change
-* Prioritization: Just a custom event handler with the normal queue and a low prio
-* Low Prio Queue: Just a queue that adds items to the normal work-queue with a rate limit
+* Reconcile of everything should happen, but with low priority slowly in the background
+* Solution: Status: LastReconciledRevision (timestamp) gets compared to revision, if larger -> User change
+* Prioritization: Just a custom event handler with the normal queue and a low priority
+* Low priority queue: Just a queue that adds items to the normal work-queue with a rate limit
 
 ```mermaid
 flowchart LR
diff --git a/content/day2/09_safety_usability_auth.md b/content/day2/09_safety_usability_auth.md
index 0752a30..abeb18e 100644
--- a/content/day2/09_safety_usability_auth.md
+++ b/content/day2/09_safety_usability_auth.md
@@ -6,39 +6,39 @@ tags:
   - security
 ---
 
-A talk by Google and Microsoft with the premise of bether auth in k8s.
+A talk by Google and Microsoft with the premise of better auth in k8s.
## Baselines * Most access controllers have read access to all secrets -> They are not really designed for keeping these secrets * Result: CVEs -* Example: Just use ingress, nginx, put in some lua code in the config and voila: Service account token +* Example: Just use ingress, nginx, put in some Lua code in the config and e voilà: Service account token * Fix: No more fun ## Basic solutions -* Seperate Control (the controller) from data (the ingress) +* Separate Control (the controller) from data (the ingress) * Namespace limited ingress ## Current state of cross namespace stuff -* Why: Reference tls cert for gateway api in the cert team'snamespace +* Why: Reference TLS cert for gateway API in the cert team's namespace * Why: Move all ingress configs to one namespace * Classic Solution: Annotations in contour that references a namespace that contains all certs (rewrites secret to certs/secret) * Gateway Solution: * Gateway TLS secret ref includes a namespace - * ReferenceGrant pretty mutch allows referencing from X (Gatway) to Y (Secret) + * ReferenceGrant pretty much allows referencing from X (Gateway) to Y (Secret) * Limits: * Has to be implemented via controllers - * The controllers still have readall - they just check if they are supposed to do this + * The controllers still have read all - they just check if they are supposed to do this ## Goals ### Global -* Grant access to controller to only ressources relevant for them (using references and maybe class segmentation) +* Grant access to controller to only resources relevant for them (using references and maybe class segmentation) * Allow for safe cross namespace references -* Make it easy for api devs to adopt it +* Make it easy for API devs to adopt it ### Personas @@ -50,20 +50,20 @@ A talk by Google and Microsoft with the premise of bether auth in k8s. * Alex: Define relationships via ReferencePatterns * Kai: Specify controller identity (Serviceaccount), define relationship API -* Rohan: Define cross namespace references (aka ressource grants that allow access to their ressources) +* Rohan: Define cross namespace references (aka resource grants that allow access to their resources) ## Result of the paper ### Architecture * ReferencePattern: Where do i find the references -> example: GatewayClass in the gateway API -* ReferenceConsumer: Who (IOdentity) has access under which conditions? +* ReferenceConsumer: Who (Identity) has access under which conditions? * ReferenceGrant: Allow specific references ### POC * Minimum access: You only get access if the grant is there AND the reference actually exists -* Their basic implementation works with the kube api +* Their basic implementation works with the kube API ### Open questions @@ -74,9 +74,9 @@ A talk by Google and Microsoft with the premise of bether auth in k8s. ## Alternative -* Idea: Just extend RBAC Roles with a selector (match labels, etc) +* Idea: Just extend RBAC Roles with a selector (match labels, etc.) * Problems: - * Requires changes to kubernetes core auth + * Requires changes to Kubernetes core auth * Everything bus list and watch is a pain * How do you handle AND vs OR selection * Field selectors: They exist @@ -84,5 +84,5 @@ A talk by Google and Microsoft with the premise of bether auth in k8s. 
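A rough sketch of what the "selectors exist" point can already buy you today (my own illustration, not something shown in the talk): a controller built with controller-runtime can scope its Secret cache to a label selector, so it only ever watches and reads Secrets that were explicitly shared with it. The label key/value and the `ByObject` cache option are assumptions on my side - the option exists in recent controller-runtime versions (roughly v0.15+), and RBAC itself is unchanged by this, so treat it as hygiene rather than enforcement.

```go
package main

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/labels"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/cache"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

func main() {
	// Illustrative sketch: scope the manager's Secret cache with a label
	// selector so the controller never reads unrelated Secrets.
	// The label "example.com/shared-with" is made up for this example.
	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		Cache: cache.Options{
			ByObject: map[client.Object]cache.ByObject{
				&corev1.Secret{}: {
					Label: labels.SelectorFromSet(labels.Set{
						"example.com/shared-with": "gateway-controller",
					}),
				},
			},
		},
	})
	if err != nil {
		panic(err)
	}

	// Reconcilers registered on mgr now only see the labelled Secrets via the
	// cached client; everything else about the controller stays the same.
	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		panic(err)
	}
}
```

Unlike the ReferenceGrant/ReferenceConsumer model described above, this does not shrink what the Serviceaccount is allowed to read - it only narrows what the controller actually watches.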
## Meanwhile -* Prefer tools that support isolatiobn between controller and dataplane +* Prefer tools that support isolation between controller and data-plane * Disable all non-needed features -> Especially scripting diff --git a/content/day2/10_dev_ux.md b/content/day2/10_dev_ux.md index 83f4050..51f3c20 100644 --- a/content/day2/10_dev_ux.md +++ b/content/day2/10_dev_ux.md @@ -6,32 +6,32 @@ tags: - dx --- -A talk by UX and software people at RedHat (Podman team). -The talk mainly followed the academic study process (aka this is the survey I did for my bachelors/masters thesis). +A talk by UX and software people at Red Hat (Podman team). +The talk mainly followed the academic study process (aka this is the survey I did for my bachelor's/master's thesis). ## Research * User research Study including 11 devs and platform engineers over three months -* Focus was on an new podman desktop feature -* Experence range 2-3 years experience average (low no experience, high oldschool kube) +* Focus was on a new Podman desktop feature +* Experience range 2-3 years experience average (low no experience, high old school kube) * 16 questions regarding environment, workflow, debugging and pain points * Analysis: Affinity mapping ## Findings * Where do I start when things are broken? -> There may be solutions, but devs don't know about them -* Network debugging is hard b/c many layers and problems occuring in between cni and infra are really hard -> Network topology issues are rare but hard -* YAML identation -> Tool support is needed for visualisation -* YAML validation -> Just use validation in dev and gitops -* YAML Cleanup -> Normalize YAML (order, anchors, etc) for easy diff -* Inadequate security analysis (too verbose, non-issues are warnings) -> Realtime insights (and during dev) +* Network debugging is hard b/c many layers and problems occurring in between CNI and infra are really hard -> Network topology issues are rare but hard +* YAML indentation -> Tool support is needed for visualization +* YAML validation -> Just use validation in dev and GitOps +* YAML Cleanup -> Normalize YAML (order, anchors, etc.) for easy diff +* Inadequate security analysis (too verbose, non-issues are warnings) -> Real-time insights (and during dev) * Crash Loop -> Identify stuck containers, simple debug containers -* CLI vs GUI -> Enable eperience level oriented gui, Enhance intime troubleshooting +* CLI vs GUI -> Enable experience level oriented GUI, Enhance in-time troubleshooting ## General issues * No direct fs access * Multiple kubeconfigs * SaaS is sometimes only provided on kube, which sounds like complexity -* Where do i begin my troubleshooting +* Where do I begin my troubleshooting * Interoperability/Fragility with updates diff --git a/content/day2/11_sidecarless.md b/content/day2/11_sidecarless.md index bb0afdf..7c04aa2 100644 --- a/content/day2/11_sidecarless.md +++ b/content/day2/11_sidecarless.md @@ -6,11 +6,11 @@ tags: - network --- -Global field CTO at Solo.io with a hint of servicemesh background. +Global field CTO at Solo.io with a hint of service mesh background. ## History -* LinkerD 1.X was the first moder servicemesh and basicly a opt-in serviceproxy +* LinkerD 1.X was the first modern service mesh and basically an opt-in service proxy * Challenges: JVM (size), latencies, ... ### Why not node-proxy? @@ -23,8 +23,8 @@ Global field CTO at Solo.io with a hint of servicemesh background. ### Why sidecar? 
* Transparent (ish) -* PArt of app lifecycle (up/down) -* Single tennant +* Part of app lifecycle (up/down) +* Single tenant * No noisy neighbor ### Sidecar drawbacks @@ -46,7 +46,7 @@ Global field CTO at Solo.io with a hint of servicemesh background. * Full transparency * Optimized networking -* Lower ressource allocation +* Lower resource allocation * No race conditions * No manual pod injection * No credentials in the app @@ -68,12 +68,12 @@ Global field CTO at Solo.io with a hint of servicemesh background. * Kubeproxy replacement * Ingress (via Gateway API) * Mutual Authentication -* Specialiced CiliumNetworkPolicy -* Configure Envoy throgh Cilium +* Specialized CiliumNetworkPolicy +* Configure Envoy through Cilium ### Control Plane -* Cilium-Agent on each node that reacts to scheduled workloads by programming the local dataplane +* Cilium-Agent on each node that reacts to scheduled workloads by programming the local data-plane * API via Gateway API and CiliumNetworkPolicy ```mermaid @@ -98,29 +98,29 @@ flowchart TD ### Data plane * Configured by control plane -* Does all of the eBPF things in L4 -* Does all of the envoy things in L7 -* In-Kernel Wireguard for optional transparent encryption +* Does all the eBPF things in L4 +* Does all the envoy things in L7 +* In-Kernel WireGuard for optional transparent encryption ### mTLS -* Network Policies get applied at the eBPF layer (check if id a can talk to id 2) -* When mTLS is enabled there is a auth check in advance -> It it fails, proceed with agents -* Agents talk to each other for mTLS Auth and save the result to a cache -> Now ebpf can say yes -* Problems: The caches can lead to id confusion +* Network Policies get applied at the eBPF layer (check if ID a can talk to ID 2) +* When mTLS is enabled there is an auth check in advance -> If it fails, proceed with agents +* Talk to each other for mTLS Auth and save the result to a cache -> Now eBPF can say yes +* Problems: The caches can lead to ID confusion ## Istio -### Basiscs +### Basics -* L4/7 Service mesh without it's own CNI +* L4/7 Service mesh without its own CNI * Based on envoy * mTLS -* Classicly via sidecar, nowadays +* Classically via sidecar, nowadays ### Ambient mode -* Seperate L4 and L7 -> Can run on cilium +* Separate L4 and L7 -> Can run on cilium * mTLS * Gateway API @@ -143,14 +143,14 @@ flowchart TD ``` * Central xDS Control Plane -* Per-Node Dataplane that reads updates from Control Plane +* Per-Node Data-plane that reads updates from Control Plane ### Data Plane -* L4 runs via zTunnel Daemonset that handels mTLS -* The zTunnel traffic get's handed over to the CNI -* L7 Proxy lives somewhere™ and traffic get's routed through it as an "extra hop" aka waypoint +* L4 runs via zTunnel Daemonset that handles mTLS +* The zTunnel traffic gets handed over to the CNI +* L7 Proxy lives somewhere™ and traffic gets routed through it as an "extra hop" aka waypoint ### mTLS -* The zTunnel creates a HBONE (http overlay network) tunnel with mTLS +* The zTunnel creates a HBONE (HTTP overlay network) tunnel with mTLS diff --git a/content/day2/99_networking.md b/content/day2/99_networking.md index 671b770..d280ed2 100644 --- a/content/day2/99_networking.md +++ b/content/day2/99_networking.md @@ -8,17 +8,17 @@ Who have I talked to today, are there any follow-ups or learnings? 
## Operator Framework * We talked about the operator lifecycle manager -* They shared the roadmap and the new release 1.0 will bring support for Operator Bundle loading from any oci source (no more public-registry enforcement) +* They shared the roadmap and the new release 1.0 will bring support for Operator Bundle loading from any OCI source (no more public-registry enforcement) ## Flux * We talked about automatic helm release updates [lessons learned from flux](/lessons_learned/02_flux) -## Cloudfoundry/Paketo +## Cloud foundry/Paketo * We mostly had some smalltalk -* There will be a cloudfoundry day in Karlsruhe in October, they'd be happy to have us ther -* The whole KORFI (Cloudfoundry on Kubernetes) Project is still going strong, but no release canidate yet (or in the near future) +* There will be a cloud foundry day in Karlsruhe in October, they'd be happy to have us there +* The whole KORFI (Cloud foundry on Kubernetes) Project is still going strong, but no release candidate yet (or in the near future) ## Traefik @@ -31,7 +31,7 @@ They will follow up ## Postman * I asked them about their new cloud-only stuff: They will keep their direction -* The are also planning to work on info materials on why postman SaaS is not a big security risk +* They are also planning to work on info materials on why postman SaaS is not a big security risk ## Mattermost @@ -39,9 +39,9 @@ They will follow up I should follow up {{% /notice %}} -* I talked about our problems with the mattermost operator and was asked to get back to them with the errors -* They're currently migrating the mattermost cloud offering to arm - therefor arm support will be coming in the next months -* The mattermost guy had exactly the same problems with notifications and read/unread using element +* I talked about our problems with the Mattermost operator and was asked to get back to them with the errors +* They're currently migrating the Mattermost cloud offering to arm - therefor arm support will be coming in the next months +* The Mattermost guy had exactly the same problems with notifications and read/unread using element ## Vercel @@ -53,7 +53,7 @@ I should follow up * The paid renovate offering now includes build failure estimation * I was told not to buy it after telling the technical guy that we just use build pipelines as MR verification -### Certmanager +### Cert manager * The best swag (judged by coolness points) @@ -63,11 +63,11 @@ I should follow up They will follow up with a quick demo {{% /notice %}} -* A kubernetes security/runtime security solution with pretty nice looking urgency filters +* A Kubernetes security/runtime security solution with pretty nice looking urgency filters * Includes eBPF to see what code actually runs -* I'll witness a demo in early/mid april +* I'll witness a demo in early/mid April ### Isovalent * Dinner (very tasty) -* Cilium still sounds like the way to go in regards to CNIs +* Cilium still sounds like the way to go in regard to CNIs diff --git a/content/day2/_index.md b/content/day2/_index.md index d23c1a6..a264399 100644 --- a/content/day2/_index.md +++ b/content/day2/_index.md @@ -5,7 +5,7 @@ weight: 2 --- Day two is also the official day one of KubeCon (Day one was just CloudNativeCon). -This is where all of the people joined (over 12000) +This is where all the people joined (over 12000) The opening keynotes were a mix of talks and panel discussions. The main topic was - who could have guessed - AI and ML.