First batch of day one

content/_index.md (new file, 10 lines)
@@ -0,0 +1,10 @@
---
archetype: home
title: Kubecon 2024
---

All about the things I did and sessions I attended at Kubecon 2024.

## Style Guide

The basic structure is as follows: `day/event-or-session`.
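
For example, the files added in this commit follow that structure:

```
content/
├── _index.md
└── day1/
    ├── 01_opening.md
    └── 02_sometimes_lipstick_is_what_a_pig_needs.md
```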

content/_template/01_opening.md (new file, 3 lines)
@@ -0,0 +1,3 @@
---
title: Opening Keynotes
---

content/_template/_index.md (new file, 4 lines)
@@ -0,0 +1,4 @@
---
archetype: chapter
title: template
---

content/day1/01_opening.md (new file, 7 lines)
@@ -0,0 +1,7 @@
---
title: Opening Keynotes
weight: 1
---

The first "event" of the day was - as always - the opening keynote.
Today it was presented by Red Hat and Syntasso.

content/day1/02_sometimes_lipstick_is_what_a_pig_needs.md (new file, 42 lines)
@@ -0,0 +1,42 @@
---
title: Sometimes lipstick is exactly what a pig needs
weight: 2
---

By VMware (of all people) - kinda funny that they chose this title with the whole Broadcom fun.
The main topic of this talk is: what interface do we choose for which capability?

## Personas

* Experts: Kubernetes and DB engineers
* Users: Employees that just want to get stuff done
* Platform Engineers: Connect Users to the Services built by Experts

## Goal

* Create Interfaces
* Interface: Connect Users to Services
* Problem: Many different types of Interfaces (SaaS, GUI, CLI) with different capabilities

## Dimensions

> These are the dimensions of interface design proposed in the talk

* Autonomy: external dependency (low) <-> self-service (high)
  * low: Ticket system -> But sometimes good for getting an expert
  * high: Portal -> Nice, but sometimes we just need a
* Contextual distance: stay in the same tool (low) <-> switch tools (high)
  * low: IDE plugin -> High potential friction if stuff goes wrong/complex (context switch needed)
  * high: Wiki or ticketing system
* Capability skill: anyone can do it (low) <-> made for experts (high)
  * low: transparent sidecar (e.g. vuln scanner)
  * high: CLI
* Interface skill: anyone can do it (low) <-> needs specialized interface skills (high)
  * low: Documentation on the web, wiki-style
  * high: Code templates (a sample Helm values.yaml or a raw Terraform provider)

## Recap

* You can use multiple interfaces for one capability
* APIs (the proverbial pig) are the most important interface because they provide the baseline for all other interfaces
* The beautification (lipstick) of the API through other interfaces makes users happy

content/day1/03_beyond_platform_thinking.md (new file, 61 lines)
@@ -0,0 +1,61 @@
---
title: Beyond platform thinking at Ritchie Brothers
weight: 3
---

The story of how Thoughtworks built YY at Ritchie Bros (RB).
Presented by the implementers at Thoughtworks (TW).

## Background

* RB is an auctioneer in the field of heavy machinery
* Problem: They are old(ish) and own a bunch of other companies -> duplicate solutions
* Goals
  * Get rid of duplicates
  * Scale without the need for more personnel

### Platform creation principles

* The platform is a product
* Building it is an exercise in software engineering, not operations
* Reduce dev friction

## Platform overview

* The platform provides self-service
* Teams manage everything inside their namespace themselves
* Multiple global locations that can be opted in and out

## Principles and Solutions

### Compliance at source of change

> Developers own their pipelines

* Dev teams are responsible for scanning, etc.
* The platform verifies that the compliance scans have been done (through admission control)
* Examples:
  * OPA + Gatekeeper for admission -> Teams use Snyk for scanning and admission checks the scan results
  * Jira as admission hook for approval -> The PO approves in Jira, admission only accepts if the webhook reports approval
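
A minimal sketch of what such an admission check could look like with Gatekeeper. The template and the annotation key are invented for illustration - the talk did not show the actual policies:

```yaml
# Hypothetical ConstraintTemplate: reject workloads without a recorded scan result.
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: requirescanresult
spec:
  crd:
    spec:
      names:
        kind: RequireScanResult
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package requirescanresult

        # "scan.example.com/passed" is an invented annotation key for this sketch
        violation[{"msg": msg}] {
          not input.review.object.metadata.annotations["scan.example.com/passed"]
          msg := "workload has no recorded compliance scan result"
        }
```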

### Platform Operators

* Implemented: S3 Operator, IAM Operator, DynamoDB Operator
* Reasons:
  * Devs should not need access to AWS/GCP directly
  * Teams have full control while not needing to submit tickets or write Terraform
* Goals
  * Abstract specific details away
  * Make the results cloud-portable (AWS, GCP, Azure)
  * Still retain developer transparency
* Example: DynamoDB database (sketched below)
  1. User: creates a DynamoDB CR and a ServiceRole CR
  1. K8s: creates Pods, Secrets, Configs and a ServiceAccount (related to an IAM Role)
  1. User: creates an S3 Bucket CR and assigns the ServiceRole
  1. K8s: injects secrets and configs where needed
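
As a sketch, the CRs in that flow might look roughly like this - the group, kinds and fields are invented here, since the actual manifests were not shown:

```yaml
# Hypothetical in-house CRs; "platform.example.com" and all field names are illustrative.
apiVersion: platform.example.com/v1alpha1
kind: ServiceRole
metadata:
  name: auction-service
---
apiVersion: platform.example.com/v1alpha1
kind: DynamoDBTable
metadata:
  name: auction-items
spec:
  partitionKey: itemId
  serviceRoleRef:
    name: auction-service   # ties the table to the operator-managed IAM role
```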

### Observability

* Tool: Honeycomb
* Metrics: OpenTelemetry
* Operator reconcile steps are exposed as traces

content/day1/04_user_friendsly_devplatform.md (new file, 21 lines)
@@ -0,0 +1,21 @@
---
title: User friendly Developer Platforms
weight: 4
---

This talk was by a New York Times software developer.
No real value.

## Baseline

* How do we build composable components?
* Workflow of a new service: Create/Onboard -> Develop -> Build/Test/Deploy (CI/CD) -> Run (Runtime/Cloud) -> Route (Ingress)

## What do we need

* User documentation
* Adoption & Partnership
* Platform as a Product
* Customer feedback

content/day1/05_multitennancy.md (new file, 38 lines)
@@ -0,0 +1,38 @@
---
title: Multi Tenancy - Micro Clusters
weight: 5
---

Part of the Multitenancy Con, presented by Adobe.

## Challenges

* Spin up edge infra globally, fast

## Implementation

### First try - single-tenant clusters

* Azure as the base - AWS on the edge
* Single-tenant clusters (simpler governance)
* Responsibility is shared between app and platform (monitoring, ingress, etc.)
* Problem: Huge manual investment and overprovisioning
* Result: Access control to tenant namespaces and capacity planning -> Pretty much a multi-tenant cluster with one tenant per cluster

### Second try - microclusters

* One cluster per service

### Third try - multi-tenancy

* Use a bunch of components deployed by the platform team (ingress, CI/CD, monitoring, ...)
* Harmonized general runtime (cloud agnostic): codenamed Ethos -> over 300 clusters
* Both shared clusters (shared by namespace) and dedicated clusters
* Cluster config is a basic JSON with name, capacity and teams (see the sketch below)
* Capacity management gets monitored using Prometheus
* Cluster changes should be non-disruptive -> K8s-Shredder
* Cost efficiency: Use good PDBs and liveness/readiness probes alongside resource requests and limits (PDB sketched below)
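
A hypothetical shape for that cluster config - the fields follow the description above, the values are made up:

```json
{
  "name": "ethos-eu-west-1",
  "capacity": "medium",
  "teams": ["team-a", "team-b"]
}
```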
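
And for the cost-efficiency point, a minimal example of the kind of PDB meant here (name and labels illustrative):

```yaml
# Lets node drains and cluster changes proceed while keeping at least one replica up.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: tenant-app
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: tenant-app
```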

## Conclusion

* There is a balance between cost, customization, setup effort and security when choosing between single-tenant and multi-tenant

content/day1/06_lightning_talks.md (new file, 45 lines)
@@ -0,0 +1,45 @@
---
title: Lightning talks
weight: 6
---

The lightning talks are 10-minute talks by different CNCF projects.

## Building containers at scale using buildpacks

A project lightning talk by Heroku and CNCF Buildpacks.

### How and why buildpacks?

* What: A simple way to build reproducible container images
* Why: Scale, Reuse, Rebase
  * Rebase: Buildpacks are structured as layers
  * Dependencies, app builds and the runtime are separated -> easy updates
* How: Use the Pack CLI: `pack build <image>`, then `docker run <image>`

## Konveyor

A platform for migrating legacy apps to cloud-native platforms.

* Parts: Hub, Analysis (with language server), Assessment
* Roadmap: Multi-language support, GenAI, asset generation (e.g. Kube Deployments)

## Argo's Community-Driven Development

Pretty much a short introduction to the Argo project.

* Project parts: Workflows (CI), Events, CD, Rollouts
* NPS: Net Promoter Score (how likely are you to recommend this) -> Everyone loves Argo (based on their survey)
* Rollouts: Can be driven by Prometheus metrics

## Flux

* Components: Helm, Kustomize, Terraform, ...
* Flagger now supports Gateway API, Prometheus, Datadog and more
* New releases

## A quick look at the TAG App-Delivery

* Mission: Everything related to cloud-native application delivery
* Bi-weekly meetings
* Subgroup: Platforms

content/day1/07_hitchhikers_guid_to_platform.md (new file, 63 lines)
@@ -0,0 +1,63 @@
---
title: Hitchhikers guide
weight: 7
---

This talk looks at bootstrapping platforms using KServe.
They do this with regard to AI workflows.

## Scenario

* Deploy AI workloads - sometimes consisting of different parts
* Models get stored in a model registry
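
Since InferenceServices come up again below: for context, a standard KServe InferenceService manifest looks roughly like this (the model and storage URI are an illustrative example, not from the talk):

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      # illustrative location; in the talk's setup models come from a model registry
      storageUri: gs://kfserving-examples/models/sklearn/1.0/model
```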

## Baseline

* Consistent APIs throughout the platform
* Not the Kube API directly, because:
  * Data scientists are a bit overpowered by the Kube API
  * It's not only Kubernetes (also monitoring tools, feedback tools, etc.)
  * Better debugging experience for specific workloads

## The debugging API

* Specific API with enhanced statuses and consistent UX across code and UI
* Example endpoints: Pods, Deployments, InferenceServices
* Provides a status summary -> consistent health info across all related resources
* Example: Deployments have progress/availability, Pods have phases, Containers have readiness -> What do we interpret how?
  * Evaluation: progressing, available count vs. readiness, ReplicaFailure, pod phase, container readiness
* The rules themselves may be pretty complex, but - since the user doesn't have to check them themselves - the status is simple

### Debugging Metrics

* Dashboards (utilization, throughput, latency)
* Events
* Logs

## Deployment API

* Launchpad: Just select your model and version -> The DB (dock) stores all manifests (spaceship)
* Manifests relate to models from a model registry
* Multi-tenancy is implemented using K8s namespaces
* Kine is used to replace/extend etcd with the relational dock DB -> The namespace<->manifest relation is stored here and RBAC can be used
* Launchpad: Select a namespace and check resource (fuel) availability/utilization

### Cluster maintenance

* Deployments can be launched to multiple clusters (even two clusters at once) -> HA through identical clusters
* The exact same manifests get deployed to both clusters
* Cluster desired state is stored externally to enable effortless upgrades, rescaling, etc.

### Versioning API

* Basically the dock DB
* CRDs are the representations of the inference manifests
* Rollbacks, promotion and history are managed via the CRs
* Why not GitOps: Internal diffs, deployment overrides, customized features

### UX

* User-driven API design
* Customized tools
* Everything gets 1:1 replicated for HA
* Large onboarding guide

content/day1/08_scaling_pg.md (new file, 66 lines)
@@ -0,0 +1,66 @@
---
title: Scaling Postgres using CloudNativePG
---

A short talk as part of the DoK (Data on Kubernetes) day - presented by the VP of Cloud Native at EDB (one of the biggest PG contributors).
Stated target: Make the world your single point of failure.

## Proposal

* Get rid of vendor lock-in using the OSS projects PG, K8s and CNPG
* PG was the DB of the year 2023 and a bunch of other times in the past
* CNPG is a level 5 mature operator

## 4 Pillars

* Seamless Kube API integration (operator pattern)
* Advanced observability (Prometheus exporter, JSON logging)
* Declarative config (deploy, scale, maintain)
* Secure by default (robust containers, mTLS, and so on)

## Clusters

* Basic resource that defines name, instances, sync and storage (plus other params that have sane defaults) - see the minimal manifest below
* Implementation: The operator creates:
  * The volumes (PGDATA and the WAL, the write-ahead log)
  * Primary and read-write service
  * Replicas
  * Read-only service (points at the replicas)
* Failover:
  * Failure detected
  * Stop R/W service
  * Promote replica
  * Activate R/W service
  * Kill old primary and demote it to replica
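
A minimal Cluster manifest, assuming defaults for everything not listed (PostgreSQL version, storage class, etc.):

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg-demo
spec:
  instances: 3   # one primary, two replicas
  storage:
    size: 20Gi
```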

## Backup/Recovery

* Continuous backup: write-ahead log backup to an object store
* Physical: Create from primary or standby to an object store or Kube volumes
* Recovery: Copy the full backup and apply the WAL until the target (last transaction or a specific timestamp) is reached
* Replica cluster: Basically recreates a new cluster up to a full recovery but keeps the cluster in read-only replica mode
* Planned: Backup plugin interface

## Multi-Cluster

* Just create a replica cluster via WAL files from S3 on another Kube cluster (lags about 5 minutes behind)
* You can also activate replication streaming

## Recommended architecture

* Dev cluster: 1 instance without PDB and with continuous backup
* Prod: 3 nodes with automatic failover and continuous backups
* Symmetric: Two clusters
  * Primary: 3-node cluster
  * Secondary: WAL-based 3-node cluster with a designated primary (to take over if the primary cluster fails)
* Symmetric streaming: Same as symmetric, but you manually enable the streaming API for live replication
* Cascading replication: Scale symmetric to more clusters
* Single availability zone: Well, do your best to spread across nodes and aspire to stretched Kubernetes across more AZs

## Roadmap

* Replica cluster (symmetric) switchover
* Synchronous symmetric
* 3rd-party plugins
* Manage DBs via the operator
* Storage autoscaling

content/day1/09_serverless.md (new file, 41 lines)
@@ -0,0 +1,41 @@
---
title: The power of serverless with Knative, Crossplane, Dapr, Keda, Shipwright and friends
---

> When I say serverless I don't mean Lambda - I mean serverless
> That is thousands of lines of YAML - but I don't want to depress you
> It will be eventually done
> Imagine this error is not happening
> Just imagine how I did this last night

## Goal

* Take my source code and run it, scale it - just don't ask me

## Baseline

* Use Kubernetes as the platform
* Use Knative for autoscaling
* Use Kaniko/Shipwright for building
* Use Dapr for inter-service communication

## OpenFunction

> The glue between different tools to achieve serverless

* CRD that describes (a sketch follows below):
  * Build this image and push it to the registry
  * Use this builder to build my project
  * This is my repo
  * My app listens on this port
  * Annotations
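
From memory, a Function CR covering those points looks roughly like this - treat the API version and exact field names as approximate and check the OpenFunction docs:

```yaml
# Rough sketch of an OpenFunction Function CR; values are illustrative.
apiVersion: core.openfunction.io/v1beta1
kind: Function
metadata:
  name: hello-world
spec:
  image: registry.example.com/demo/hello-world:latest  # build this image and push it here
  port: 8080                                           # my app listens on this port
  build:
    builder: openfunction/builder-go:latest            # use this builder for my project
    srcRepo:
      url: https://github.com/example/hello-world.git  # this is my repo
  serving:
    runtime: knative                                   # autoscaling via Knative
```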

## Dependencies

* Open questions:
  * Where are the serverless servers? -> Cluster, dependencies, secrets
  * How do I create DBs, etc.?
* Resulting needs:
  * Cluster aaS (using Crossplane - in this case on AWS)
  * DBaaS (using Crossplane - again using PG on AWS)
  * App aaS

content/day1/_index.md (new file, 10 lines)
@@ -0,0 +1,10 @@
---
archetype: chapter
title: Day 1
weight: 1
---

Day one is the day for co-located events aka CloudNativeCon.
I spent most of the day attending the Platform Engineering Day - as one might have guessed, it's all about platform engineering.

Everything started with badge pickup - a very smooth experience (but that may be related to me showing up an hour or so too early).

content/lessons_learned/01_operators.md (new file, 7 lines)
@@ -0,0 +1,7 @@
---
title: Operators
---

## Observability

* Export reconcile loop steps as OpenTelemetry traces

content/lessons_learned/_index.md (new file, 7 lines)
@@ -0,0 +1,7 @@
---
archetype: chapter
title: Lessons Learned
weight: 99
---

Interesting lessons learned + tips/tricks.