First batch of day one

content/_index.md (new file, 10 lines)
@@ -0,0 +1,10 @@
---
archetype: home
title: Kubecon 2024
---

All about the things I did and sessions I attended at Kubecon 2024.

## Style Guide

The basic structure is as follows: `day/event-or-session`.
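
For example, the files added in this commit follow that structure:

```
content/
├── _index.md
└── day1/
    ├── 01_opening.md
    └── 02_sometimes_lipstick_is_what_a_pig_needs.md
```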

content/_template/01_opening.md (new file, 3 lines)
@@ -0,0 +1,3 @@
---
title: Opening Keynotes
---

content/_template/_index.md (new file, 4 lines)
@@ -0,0 +1,4 @@
---
archetype: chapter
title: template
---

content/day1/01_opening.md (new file, 7 lines)
@@ -0,0 +1,7 @@
---
title: Opening Keynotes
weight: 1
---

The first "event" of the day was - as always - the opening keynote.
Today it was presented by Red Hat and Syntasso.

content/day1/02_sometimes_lipstick_is_what_a_pig_needs.md (new file, 42 lines)
@@ -0,0 +1,42 @@
---
title: Sometimes lipstick is exactly what a pig needs
weight: 2
---

By VMware (of all people) - kinda funny that they chose this title with the whole Broadcom fun.
The main topic of this talk is: what interface do we choose for which capability?

## Personas

* Experts: Kubernetes and DB engineers
* Users: Employees that just want to get stuff done
* Platform Engineers: Connect Users to the Services built by Experts

## Goal

* Create Interfaces
* Interface: Connect Users to Services
* Problem: Many different types of Interfaces (SaaS, GUI, CLI) with different capabilities

## Dimensions

> These are the dimensions of interface design proposed in the talk

* Autonomy: external dependency (low) <-> self-service (high)
  * low: Ticket system -> But sometimes good for getting an expert
  * high: Portal -> Nice, but sometimes we just need a
* Contextual distance: stay in the same tool (low) <-> switch tools (high)
  * low: IDE plugin -> High potential friction if stuff goes wrong/complex (context switch needed)
  * high: Wiki or ticketing system
* Capability skill: anyone can do it (low) <-> made for experts (high)
  * low: transparent sidecar (e.g. vuln scanner)
  * high: CLI
* Interface skill: anyone can do it (low) <-> needs specialized interface skills (high)
  * low: Documentation on the web, wiki-style
  * high: Code templates (a sample Helm values.yaml or a raw Terraform provider)

## Recap

* You can use multiple interfaces for one capability
* APIs (the proverbial pig) are the most important interface because they provide the baseline for all other interfaces
* The beautification (lipstick) of the API through other interfaces makes users happy

content/day1/03_beyond_platform_thinking.md (new file, 61 lines)
@@ -0,0 +1,61 @@
---
title: Beyond platform thinking at Ritchie Brothers
weight: 3
---

The story of how Thoughtworks built YY at Ritchie Bros (RB).
Presented by the implementers at Thoughtworks (TW).

## Background

* RB is an auctioneer in the field of heavy machinery
* Problem: They are old(ish) and own a bunch of other companies -> duplicate solutions
* Goals
  * Get rid of duplicates
  * Scale without the need for more personnel

### Platform creation principles

* The platform is a product
* Building it is an exercise in software engineering, not operations
* Reduce dev friction

## Platform overview

* The platform provides self-service
* Teams manage everything inside their namespace themselves
* Multiple global locations that can be opted in and out

## Principles and Solutions

### Compliance at source of change

> Developers own their pipelines

* Dev teams are responsible for scanning, etc.
* The platform verifies that the compliance scans have been done (through admission control)
* Examples:
  * OPA + Gatekeeper for admission -> Teams use Snyk for scanning and admission checks the scan results
  * Jira as admission hook for approval -> The PO approves in Jira, admission only accepts if the webhook reports approval
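
A minimal sketch of what such an admission check could look like with Gatekeeper. The template and the annotation key are invented for illustration - the talk did not show the actual policies:

```yaml
# Hypothetical ConstraintTemplate: reject workloads without a recorded scan result.
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: requirescanresult
spec:
  crd:
    spec:
      names:
        kind: RequireScanResult
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package requirescanresult

        # "scan.example.com/passed" is an invented annotation key for this sketch
        violation[{"msg": msg}] {
          not input.review.object.metadata.annotations["scan.example.com/passed"]
          msg := "workload has no recorded compliance scan result"
        }
```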

### Platform Operators

* Implemented: S3 Operator, IAM Operator, DynamoDB Operator
* Reasons:
  * Devs should not need access to AWS/GCP directly
  * Teams have full control while not needing to submit tickets or write Terraform
* Goals
  * Abstract specific details away
  * Make the results cloud-portable (AWS, GCP, Azure)
  * Still retain developer transparency
* Example: DynamoDB database (sketched below)
  1. User: creates a DynamoDB CR and a ServiceRole CR
  1. K8s: creates Pods, Secrets, Configs and a ServiceAccount (related to an IAM Role)
  1. User: creates an S3 Bucket CR and assigns the ServiceRole
  1. K8s: injects secrets and configs where needed
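
As a sketch, the CRs in that flow might look roughly like this - the group, kinds and fields are invented here, since the actual manifests were not shown:

```yaml
# Hypothetical in-house CRs; "platform.example.com" and all field names are illustrative.
apiVersion: platform.example.com/v1alpha1
kind: ServiceRole
metadata:
  name: auction-service
---
apiVersion: platform.example.com/v1alpha1
kind: DynamoDBTable
metadata:
  name: auction-items
spec:
  partitionKey: itemId
  serviceRoleRef:
    name: auction-service   # ties the table to the operator-managed IAM role
```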

### Observability

* Tool: Honeycomb
* Metrics: OpenTelemetry
* Operator reconcile steps are exposed as traces

content/day1/04_user_friendsly_devplatform.md (new file, 21 lines)
@@ -0,0 +1,21 @@
---
title: User friendly Developer Platforms
weight: 4
---

This talk was by a New York Times software developer.
No real value.

## Baseline

* How do we build composable components?
* Workflow of a new service: Create/Onboard -> Develop -> Build/Test/Deploy (CI/CD) -> Run (Runtime/Cloud) -> Route (Ingress)

## What do we need

* User documentation
* Adoption & Partnership
* Platform as a Product
* Customer feedback

content/day1/05_multitennancy.md (new file, 38 lines)
@@ -0,0 +1,38 @@
---
title: Multi Tenancy - Micro Clusters
weight: 5
---

Part of the Multitenancy Con, presented by Adobe.

## Challenges

* Spin up edge infra globally, fast

## Implementation

### First try - single-tenant clusters

* Azure as the base - AWS on the edge
* Single-tenant clusters (simpler governance)
* Responsibility is shared between app and platform (monitoring, ingress, etc.)
* Problem: Huge manual investment and overprovisioning
* Result: Access control to tenant namespaces and capacity planning -> Pretty much a multi-tenant cluster with one tenant per cluster

### Second try - microclusters

* One cluster per service

### Third try - multi-tenancy

* Use a bunch of components deployed by the platform team (ingress, CI/CD, monitoring, ...)
* Harmonized general runtime (cloud agnostic): codenamed Ethos -> over 300 clusters
* Both shared clusters (shared by namespace) and dedicated clusters
* Cluster config is a basic JSON with name, capacity and teams (see the sketch below)
* Capacity management gets monitored using Prometheus
* Cluster changes should be non-disruptive -> K8s-Shredder
* Cost efficiency: Use good PDBs and liveness/readiness probes alongside resource requests and limits (PDB sketched below)
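
A hypothetical shape for that cluster config - the fields follow the description above, the values are made up:

```json
{
  "name": "ethos-eu-west-1",
  "capacity": "medium",
  "teams": ["team-a", "team-b"]
}
```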
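
And for the cost-efficiency point, a minimal example of the kind of PDB meant here (name and labels illustrative):

```yaml
# Lets node drains and cluster changes proceed while keeping at least one replica up.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: tenant-app
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: tenant-app
```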

## Conclusion

* There is a balance between cost, customization, setup effort and security when choosing between single-tenant and multi-tenant

content/day1/06_lightning_talks.md (new file, 45 lines)
@@ -0,0 +1,45 @@
---
title: Lightning talks
weight: 6
---

The lightning talks are 10-minute talks by different CNCF projects.

## Building containers at scale using buildpacks

A project lightning talk by Heroku and CNCF Buildpacks.

### How and why buildpacks?

* What: A simple way to build reproducible container images
* Why: Scale, Reuse, Rebase
  * Rebase: Buildpacks are structured as layers
  * Dependencies, app builds and the runtime are separated -> easy updates
* How: Use the Pack CLI: `pack build <image>`, then `docker run <image>`

## Konveyor

A platform for migrating legacy apps to cloud-native platforms.

* Parts: Hub, Analysis (with language server), Assessment
* Roadmap: Multi-language support, GenAI, asset generation (e.g. Kube Deployments)

## Argo's Community-Driven Development

Pretty much a short introduction to the Argo project.

* Project parts: Workflows (CI), Events, CD, Rollouts
* NPS: Net Promoter Score (how likely are you to recommend this) -> Everyone loves Argo (based on their survey)
* Rollouts: Can be driven by Prometheus metrics

## Flux

* Components: Helm, Kustomize, Terraform, ...
* Flagger now supports Gateway API, Prometheus, Datadog and more
* New releases

## A quick look at the TAG App-Delivery

* Mission: Everything related to cloud-native application delivery
* Bi-weekly meetings
* Subgroup: Platforms

content/day1/07_hitchhikers_guid_to_platform.md (new file, 63 lines)
@@ -0,0 +1,63 @@
---
title: Hitchhikers guide
weight: 7
---

This talk looks at bootstrapping platforms using KServe.
They do this with regard to AI workflows.

## Scenario

* Deploy AI workloads - sometimes consisting of different parts
* Models get stored in a model registry
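
Since InferenceServices come up again below: for context, a standard KServe InferenceService manifest looks roughly like this (the model and storage URI are an illustrative example, not from the talk):

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      # illustrative location; in the talk's setup models come from a model registry
      storageUri: gs://kfserving-examples/models/sklearn/1.0/model
```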

## Baseline

* Consistent APIs throughout the platform
* Not the Kube API directly, because:
  * Data scientists are a bit overpowered by the Kube API
  * It's not only Kubernetes (also monitoring tools, feedback tools, etc.)
  * Better debugging experience for specific workloads

## The debugging API

* Specific API with enhanced statuses and consistent UX across code and UI
* Example endpoints: Pods, Deployments, InferenceServices
* Provides a status summary -> consistent health info across all related resources
* Example: Deployments have progress/availability, Pods have phases, Containers have readiness -> What do we interpret how?
  * Evaluation: progressing, available count vs. readiness, ReplicaFailure, pod phase, container readiness
* The rules themselves may be pretty complex, but - since the user doesn't have to check them themselves - the status is simple

### Debugging Metrics

* Dashboards (utilization, throughput, latency)
* Events
* Logs

## Deployment API

* Launchpad: Just select your model and version -> The DB (dock) stores all manifests (spaceship)
* Manifests relate to models from a model registry
* Multi-tenancy is implemented using K8s namespaces
* Kine is used to replace/extend etcd with the relational dock DB -> The namespace<->manifest relation is stored here and RBAC can be used
* Launchpad: Select a namespace and check resource (fuel) availability/utilization

### Cluster maintenance

* Deployments can be launched to multiple clusters (even two clusters at once) -> HA through identical clusters
* The exact same manifests get deployed to both clusters
* Cluster desired state is stored externally to enable effortless upgrades, rescaling, etc.

### Versioning API

* Basically the dock DB
* CRDs are the representations of the inference manifests
* Rollbacks, promotion and history are managed via the CRs
* Why not GitOps: Internal diffs, deployment overrides, customized features

### UX

* User-driven API design
* Customized tools
* Everything gets 1:1 replicated for HA
* Large onboarding guide

content/day1/08_scaling_pg.md (new file, 66 lines)
@@ -0,0 +1,66 @@
---
title: Scaling Postgres using CloudNativePG
---

A short talk as part of the DoK (Data on Kubernetes) day - presented by the VP of Cloud Native at EDB (one of the biggest PG contributors).
Stated target: Make the world your single point of failure.

## Proposal

* Get rid of vendor lock-in using the OSS projects PG, K8s and CNPG
* PG was the DB of the year 2023 and a bunch of other times in the past
* CNPG is a level 5 mature operator

## 4 Pillars

* Seamless Kube API integration (operator pattern)
* Advanced observability (Prometheus exporter, JSON logging)
* Declarative config (deploy, scale, maintain)
* Secure by default (robust containers, mTLS, and so on)

## Clusters

* Basic resource that defines name, instances, sync and storage (plus other params that have sane defaults) - see the minimal manifest below
* Implementation: The operator creates:
  * The volumes (PGDATA and the WAL, the write-ahead log)
  * Primary and read-write service
  * Replicas
  * Read-only service (points at the replicas)
* Failover:
  * Failure detected
  * Stop R/W service
  * Promote replica
  * Activate R/W service
  * Kill old primary and demote it to replica
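
A minimal Cluster manifest, assuming defaults for everything not listed (PostgreSQL version, storage class, etc.):

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg-demo
spec:
  instances: 3   # one primary, two replicas
  storage:
    size: 20Gi
```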

## Backup/Recovery

* Continuous backup: write-ahead log backup to an object store
* Physical: Create from primary or standby to an object store or Kube volumes
* Recovery: Copy the full backup and apply the WAL until the target (last transaction or a specific timestamp) is reached
* Replica cluster: Basically recreates a new cluster up to a full recovery but keeps the cluster in read-only replica mode
* Planned: Backup plugin interface

## Multi-Cluster

* Just create a replica cluster via WAL files from S3 on another Kube cluster (lags about 5 minutes behind)
* You can also activate replication streaming

## Recommended architecture

* Dev cluster: 1 instance without PDB and with continuous backup
* Prod: 3 nodes with automatic failover and continuous backups
* Symmetric: Two clusters
  * Primary: 3-node cluster
  * Secondary: WAL-based 3-node cluster with a designated primary (to take over if the primary cluster fails)
* Symmetric streaming: Same as symmetric, but you manually enable the streaming API for live replication
* Cascading replication: Scale symmetric to more clusters
* Single availability zone: Well, do your best to spread across nodes and aspire to stretched Kubernetes across more AZs

## Roadmap

* Replica cluster (symmetric) switchover
* Synchronous symmetric
* 3rd-party plugins
* Manage DBs via the operator
* Storage autoscaling

content/day1/09_serverless.md (new file, 41 lines)
@@ -0,0 +1,41 @@
---
title: The power of serverless with Knative, Crossplane, Dapr, Keda, Shipwright and friends
---

> When I say serverless I don't mean Lambda - I mean serverless
> That is thousands of lines of YAML - but I don't want to depress you
> It will be eventually done
> Imagine this error is not happening
> Just imagine how I did this last night

## Goal

* Take my source code and run it, scale it - just don't ask me

## Baseline

* Use Kubernetes as the platform
* Use Knative for autoscaling
* Use Kaniko/Shipwright for building
* Use Dapr for inter-service communication

## OpenFunction

> The glue between different tools to achieve serverless

* CRD that describes (a sketch follows below):
  * Build this image and push it to the registry
  * Use this builder to build my project
  * This is my repo
  * My app listens on this port
  * Annotations
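
From memory, a Function CR covering those points looks roughly like this - treat the API version and exact field names as approximate and check the OpenFunction docs:

```yaml
# Rough sketch of an OpenFunction Function CR; values are illustrative.
apiVersion: core.openfunction.io/v1beta1
kind: Function
metadata:
  name: hello-world
spec:
  image: registry.example.com/demo/hello-world:latest  # build this image and push it here
  port: 8080                                           # my app listens on this port
  build:
    builder: openfunction/builder-go:latest            # use this builder for my project
    srcRepo:
      url: https://github.com/example/hello-world.git  # this is my repo
  serving:
    runtime: knative                                   # autoscaling via Knative
```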

## Dependencies

* Open questions:
  * Where are the serverless servers? -> Cluster, dependencies, secrets
  * How do I create DBs, etc.?
* Resulting needs:
  * Cluster aaS (using Crossplane - in this case on AWS)
  * DBaaS (using Crossplane - again using PG on AWS)
  * App aaS

content/day1/_index.md (new file, 10 lines)
@@ -0,0 +1,10 @@
---
archetype: chapter
title: Day 1
weight: 1
---

Day one is the day for co-located events aka CloudNativeCon.
I spent most of the day attending the Platform Engineering Day - as one might have guessed, it's all about platform engineering.

Everything started with badge pickup - a very smooth experience (but that may be related to me showing up an hour or so too early).

content/lessons_learned/01_operators.md (new file, 7 lines)
@@ -0,0 +1,7 @@
---
title: Operators
---

## Observability

* Export reconcile loop steps as OpenTelemetry traces

content/lessons_learned/_index.md (new file, 7 lines)
@@ -0,0 +1,7 @@
---
archetype: chapter
title: Lessons Learned
weight: 99
---

Interesting lessons learned + tips/tricks.