Compare commits
5 Commits
b515be2220
...
f8e654d6a5
Author | SHA1 | Date | |
---|---|---|---|
f8e654d6a5 | |||
9ee562e88d | |||
daf83861af | |||
7b1203c7a3 | |||
e2e3b2fdf3 |
119
.vscode/ltex.dictionary.en-US.txt
vendored
Normal file
|
@ -0,0 +1,119 @@
|
||||||
|
CloudNativeCon
|
||||||
|
Syntasso
|
||||||
|
OpenTelemetry
|
||||||
|
Multitannancy
|
||||||
|
Multitenancy
|
||||||
|
PDBs
|
||||||
|
Buildpacks
|
||||||
|
buildpacks
|
||||||
|
Konveyor
|
||||||
|
GenAI
|
||||||
|
Kube
|
||||||
|
Kustomize
|
||||||
|
KServe
|
||||||
|
kube
|
||||||
|
InferenceServices
|
||||||
|
Replicafailure
|
||||||
|
etcd
|
||||||
|
RBAC
|
||||||
|
CRDs
|
||||||
|
CRs
|
||||||
|
GitOps
|
||||||
|
CnPG
|
||||||
|
mTLS
|
||||||
|
WAL
|
||||||
|
AZs
|
||||||
|
DBs
|
||||||
|
kNative
|
||||||
|
Kaniko
|
||||||
|
Dupr
|
||||||
|
crossplane
|
||||||
|
DBaaS
|
||||||
|
APPaaS
|
||||||
|
CLUSTERaaS
|
||||||
|
OpsManager
|
||||||
|
multicluster
|
||||||
|
Statefulset
|
||||||
|
eBPF
|
||||||
|
Parca
|
||||||
|
KubeCon
|
||||||
|
FinOps
|
||||||
|
moondream
|
||||||
|
OLLAMA
|
||||||
|
LLVA
|
||||||
|
LLAVA
|
||||||
|
bokllava
|
||||||
|
NVLink
|
||||||
|
CUDA
|
||||||
|
Space-seperated
|
||||||
|
KAITO
|
||||||
|
Hugginface
|
||||||
|
LLMA
|
||||||
|
Alluxio
|
||||||
|
LLMs
|
||||||
|
onprem
|
||||||
|
Kube
|
||||||
|
Kubeflow
|
||||||
|
Ohly
|
||||||
|
distroless
|
||||||
|
init
|
||||||
|
Distroless
|
||||||
|
Buildkit
|
||||||
|
busybox
|
||||||
|
ECK
|
||||||
|
Kibana
|
||||||
|
Dedup
|
||||||
|
Crossplane
|
||||||
|
autoprovision
|
||||||
|
RBAC
|
||||||
|
Serviceaccount
|
||||||
|
CVEs
|
||||||
|
Podman
|
||||||
|
LinkerD
|
||||||
|
sidecarless
|
||||||
|
Kubeproxy
|
||||||
|
Daemonset
|
||||||
|
zTunnel
|
||||||
|
HBONE
|
||||||
|
Paketo
|
||||||
|
KORFI
|
||||||
|
Traefik
|
||||||
|
traefik
|
||||||
|
Vercel
|
||||||
|
Isovalent
|
||||||
|
CNIs
|
||||||
|
Ivanti
|
||||||
|
envs
|
||||||
|
CoreDNS
|
||||||
|
Istio
|
||||||
|
buildpacks
|
||||||
|
Buildpack
|
||||||
|
SBOM
|
||||||
|
Tekton
|
||||||
|
KPack
|
||||||
|
Multiarch
|
||||||
|
Tanzu
|
||||||
|
Kubebuilder
|
||||||
|
finalizer
|
||||||
|
OLM
|
||||||
|
depply
|
||||||
|
CatalogD
|
||||||
|
Rukoak
|
||||||
|
kapp
|
||||||
|
Depply
|
||||||
|
Jetstack
|
||||||
|
kube-lego
|
||||||
|
PKI-usecase
|
||||||
|
multimanager
|
||||||
|
kubebuider
|
||||||
|
kubebuilder
|
||||||
|
FluentD
|
||||||
|
FluentBit
|
||||||
|
OpenMetrics
|
||||||
|
upsert
|
||||||
|
tektone-based
|
||||||
|
ODIT.Services
|
||||||
|
Planetscale
|
||||||
|
vitess
|
||||||
|
Autupdate
|
||||||
|
KubeCon
|
3
.vscode/ltex.disabledRules.en-US.txt
vendored
Normal file
|
@ -0,0 +1,3 @@
|
||||||
|
ARROWS
|
||||||
|
ARROWS
|
||||||
|
ARROWS
|
2
.vscode/ltex.hiddenFalsePositives.en-US.txt
vendored
Normal file
|
@ -0,0 +1,2 @@
|
||||||
|
{"rule":"MORFOLOGIK_RULE_EN_US","sentence":"^\\QJust create a replica cluster via WAL-files from S3 on another kube cluster (lags 5 mins behind)\nYou can also activate replication streaming\\E$"}
|
||||||
|
{"rule":"MORFOLOGIK_RULE_EN_US","sentence":"^\\QResulting needs\nCluster aaS (using crossplane - in this case using aws)\nDBaaS (using crossplane - again usig pq on aws)\nApp aaS\\E$"}
|
|
@ -9,7 +9,7 @@ This current version is probably full of typos - will fix later. This is what ty
|
||||||
|
|
||||||
## How did I get there?
|
## How did I get there?
|
||||||
|
|
||||||
I attended KubeCon + CloudNAtiveCon Europe 2024 as the one and only [ODIT.Services](https://odit.services) representative.
|
I attended KubeCon + CloudNativeCon Europe 2024 as the one and only [ODIT.Services](https://odit.services) representative.
|
||||||
|
|
||||||
## Style Guide
|
## Style Guide
|
||||||
|
|
||||||
|
|
|
@ -7,4 +7,4 @@ tags:
|
||||||
---
|
---
|
||||||
|
|
||||||
The first "event" of the day was - as always - the opening keynote.
|
The first "event" of the day was - as always - the opening keynote.
|
||||||
Today presented by Redhat and Syntasso.
|
Today presented by Red Hat and Syntasso.
|
||||||
|
|
|
@ -6,20 +6,19 @@ tags:
|
||||||
- dx
|
- dx
|
||||||
---
|
---
|
||||||
|
|
||||||
By VMware (of all people) - kinda funny that they chose this title with the wole Broadcom fun.
|
By VMware (of all people) - kinda funny that they chose this title with the whole Broadcom fun.
|
||||||
The main topic of this talk is: What interface do we choose for what capability.
|
The main topic of this talk is: What interface do we choose for what capability.
|
||||||
|
|
||||||
## Personas
|
## Personas
|
||||||
|
|
||||||
* Experts: Kubernetes, DB Engee
|
* Experts: Kubernetes, DB engineer
|
||||||
* Users: Employees that just want to do stuff
|
* Users: Employees that just want to do stuff
|
||||||
* Platform Engeneers: Connect Users to Services by Experts
|
* Platform engineers: Connect Users to Services by Experts
|
||||||
|
|
||||||
## Goal
|
## Goal
|
||||||
|
|
||||||
* Create Interfaces
|
* Create Interfaces: Connect Users to Services
|
||||||
* Interface: Connect Users to Services
|
* Problem: Many different types of Interfaces (SaaS, GUI, CLI) with different capabilities
|
||||||
* Problem: Many diferent types of Interfaces (SaaS, GUI, CLI) with different capabilities
|
|
||||||
|
|
||||||
## Dimensions
|
## Dimensions
|
||||||
|
|
||||||
|
@ -27,13 +26,13 @@ The main topic of this talk is: What interface do we choose for what capability.
|
||||||
|
|
||||||
* Autonomy: external dependency (low) <-> self-service (high)
|
* Autonomy: external dependency (low) <-> self-service (high)
|
||||||
* low: Ticket system -> But sometimes good for getting an expert
|
* low: Ticket system -> But sometimes good for getting an expert
|
||||||
* high: Portal -> Nice, but somethimes we just need a human contact
|
* high: Portal -> Nice, but sometimes we just need a human contact
|
||||||
* Contextual distance: stay in the same tool (low) <-> switch tools (high)
|
* Contextual distance: stay in the same tool (low) <-> switch tools (high)
|
||||||
* low: IDE plugin -> High potential friction if stuff goes wrong/complex (context switch needed)
|
* low: IDE plugin -> High potential friction if stuff goes wrong/complex (context switch needed)
|
||||||
* high: Wiki or ticketing system
|
* high: Wiki or ticketing system
|
||||||
* Capability skill: anyone can do it (low) <-> Made for experts (high)
|
* Capability skill: anyone can do it (low) <-> Made for experts (high)
|
||||||
* low: transparent sidecar (eg vuln scanner)
|
* low: transparent sidecar (e.g. vulnerability scanner)
|
||||||
* high: cli
|
* high: CLI
|
||||||
* Interface skill: anyone can do it (low) <-> needs specialized interface skills (high)
|
* Interface skill: anyone can do it (low) <-> needs specialized interface skills (high)
|
||||||
* low: Documentation in web aka wiki-style
|
* low: Documentation in web aka wiki-style
|
||||||
* high: Code templates (a sample helm values.yaml or raw terraform provider)
|
* high: Code templates (a sample helm values.yaml or raw terraform provider)
|
||||||
|
@ -42,4 +41,4 @@ The main topic of this talk is: What interface do we choose for what capability.
|
||||||
|
|
||||||
* You can use multiple interfaces for one capability
|
* You can use multiple interfaces for one capability
|
||||||
* APIs (proverbial pig) are the most important interface b/c it can provide the baseline for all other interfaces
|
* APIs (proverbial pig) are the most important interface b/c it can provide the baseline for all other interfaces
|
||||||
* The beautification (lipstick) of the API through other interfaces makes uers happy
|
* The beautification (lipstick) of the API through other interfaces makes users happy
|
||||||
|
|
|
@ -62,10 +62,10 @@ Presented by the implementers at Thoughtworks (TW).
|
||||||
### Observability
|
### Observability
|
||||||
|
|
||||||
* Tool: Honeycomb
|
* Tool: Honeycomb
|
||||||
* Metrics: Opentelemetry
|
* Metrics: OpenTelemetry
|
||||||
* Operator reconcile steps are exposed as traces
|
* Operator reconcile steps are exposed as traces
|
||||||
|
|
||||||
## Q&A
|
## Q&A
|
||||||
|
|
||||||
* Your teams are pretty autonomus -> What to do with more classic teams: Over a multi-year jurney every team settles on the ownership and selfservice approach
|
* Your teams are pretty autonomous -> What to do with more classic teams: Over a multi-year journey every team settles on the ownership and self-service approach
|
||||||
* How to teams get access to stages: They just get temselves a stage namespace, attach to ingress and have fun (admission handles the rest)
|
* How teams get access to stages: They just get themselves a stage namespace, attach to ingress and have fun (admission handles the rest)
|
||||||
|
|
|
@ -17,6 +17,6 @@ No real value
|
||||||
## What do we need
|
## What do we need
|
||||||
|
|
||||||
* User documentation
|
* User documentation
|
||||||
* Adoption & Patnership
|
* Adoption & Partnership
|
||||||
* Platform as a Product
|
* Platform as a Product
|
||||||
* Customer feedback
|
* Customer feedback
|
||||||
|
|
|
@ -10,7 +10,7 @@ tags:
|
||||||
- multicluster
|
- multicluster
|
||||||
---
|
---
|
||||||
|
|
||||||
Part of the Multitannancy Con presented by Adobe
|
Part of the Multi-tenancy Con presented by Adobe
|
||||||
|
|
||||||
## Challenges
|
## Challenges
|
||||||
|
|
||||||
|
@ -22,24 +22,24 @@ Part of the Multitannancy Con presented by Adobe
|
||||||
|
|
||||||
* Azure in Base - AWS on the edge
|
* Azure in Base - AWS on the edge
|
||||||
* Single Tenant Clusters (Simpler Governance)
|
* Single Tenant Clusters (Simpler Governance)
|
||||||
* Responsibility is Shared between App and Platform (Monitoring, Ingress, etc)
|
* Responsibility is Shared between App and Platform (Monitoring, Ingress, etc.)
|
||||||
* Problem: Huge manual investment and over provisioning
|
* Problem: Huge manual investment and over provisioning
|
||||||
* Result: Access Control to tenant Namespaces and Capacity Planning -> Pretty much a multi tenant cluster with one tenant per cluster
|
* Result: Access Control to tenant Namespaces and Capacity Planning -> Pretty much a multi tenant cluster with one tenant per cluster
|
||||||
|
|
||||||
### Second Try - Microcluster
|
### Second Try - Micro Clusters
|
||||||
|
|
||||||
* One Cluster per Service
|
* One Cluster per Service
|
||||||
|
|
||||||
### Third Try - Multitennancy
|
### Third Try - Multi-tenancy
|
||||||
|
|
||||||
* Use a bunch of components deployed by platform Team (Ingress, CD/CD, Monitoring, ...)
|
* Use a bunch of components deployed by platform Team (Ingress, CD/CD, Monitoring, ...)
|
||||||
* Harmonized general Runtime (cloud agnostic): Codenamed Ethos -> OVer 300 Clusters
|
* Harmonized general Runtime (cloud-agnostic): Code-named Ethos -> Over 300 Clusters
|
||||||
* Both shared clusters (shared by namespace) and dedicated clusters
|
* Both shared clusters (shared by namespace) and dedicated clusters
|
||||||
* Cluster config is a basic json with name, capacity, teams
|
* Cluster config is a basic JSON with name, capacity, teams
|
||||||
* Capacity Managment get's Monitored using Prometheus
|
* Capacity Management gets Monitored using Prometheus
|
||||||
* Cluster Changes should be non-desruptive -> K8S-Shredder
|
* Cluster Changes should be nondestructive -> K8S-Shredder
|
||||||
* Cost efficiency: Use good PDBs and livelyness/readyness Probes alongside ressource requests and limits
|
* Cost efficiency: Use good PDBs and liveliness/readiness Probes alongside resource requests and limits
|
||||||
|
|
||||||
## Conclusion
|
## Conclusion
|
||||||
|
|
||||||
* There is a balance between cost, customization, setup and security between single-tenant und multi-tenant
|
* There is a balance between cost, customization, setup and security between single-tenant and multi-tenant
|
||||||
|
|
|
@ -3,42 +3,41 @@ title: Lightning talks
|
||||||
weight: 6
|
weight: 6
|
||||||
---
|
---
|
||||||
|
|
||||||
The lightning talks are 10-minute talks by diferent cncf projects.
|
The lightning talks are 10-minute talks by different CNCF projects.
|
||||||
|
|
||||||
## Building contaienrs at scale using buildpacks
|
## Building containers at scale using buildpacks
|
||||||
|
|
||||||
A Project lightning talk by heroku and the cncf buildpacks.
|
A Project lightning talk by Heroku and the CNCF buildpacks.
|
||||||
|
|
||||||
### How and why buildpacks?
|
### How and why buildpacks?
|
||||||
|
|
||||||
* What: A simple way to build reproducible contaienr images
|
* What: A simple way to build reproducible container images
|
||||||
* Why: Scale, Reuse, Rebase
|
* Why: Scale, Reuse, Rebase: Buildpacks are structured as layers
|
||||||
* Rebase: Buildpacks are structured as layers
|
|
||||||
* Dependencies, app builds and the runtime are seperated -> Easy update
|
* Dependencies, app builds and the runtime are seperated -> Easy update
|
||||||
* How: Use the PAck CLI `pack build <image>` `docker run <image>`
|
* How: Use the Pack CLI `pack build <image>` `docker run <image>`
|
||||||
|
|
||||||
## Konveyor
|
## Konveyor
|
||||||
|
|
||||||
A Platform for migration of legacy apps to cloud native platforms.
|
A Platform for migration of legacy apps to cloud native platforms.
|
||||||
|
|
||||||
* Parts: Hub, Analysis (with langugage server), Assesment
|
* Parts: Hub, Analysis (with language server), assessment
|
||||||
* Roadmap: Multi language support, GenAI, Asset Generation (e.g. Kube Deployments)
|
* Roadmap: Multi language support, GenAI, Asset Generation (e.g. Kube Deployments)
|
||||||
|
|
||||||
## Argo'S Communuty Driven Development
|
## Argo's Community Driven Development
|
||||||
|
|
||||||
Pretty mutch a short intropduction to Argo Project
|
Pretty much a short introduction to Argo Project
|
||||||
|
|
||||||
* Project Parts: Workflows (CI), Events, CD, Rollouts
|
* Project Parts: Workflows (CI), Events, CD, Rollouts
|
||||||
* NPS: Net Promoter Score (How likely are you to recoomend this) -> Everyone loves argo (based on their survey)
|
* NPS: Net Promoter Score (How likely are you to recommend this) -> Everyone loves Argo (based on their survey)
|
||||||
* Rollouts: Can be based with prometheus metrics
|
* Rollouts: Can be based with Prometheus metrics
|
||||||
|
|
||||||
## Flux
|
## Flux
|
||||||
|
|
||||||
* Components: Helm, Kustomize, Terrafrorm, ...
|
* Components: Helm, Kustomize, Terraform, ...
|
||||||
* Flagger Now supports gateway api, prometheus, datadog and more
|
* Flagger Now supports gateway API, Prometheus, Datadog and more
|
||||||
* New Releases
|
* New Releases
|
||||||
|
|
||||||
## A quick logg at the TAG App-Delivery
|
## A quick look at the TAG App-Delivery
|
||||||
|
|
||||||
* Mission: Everything related to cloud-native application delivery
|
* Mission: Everything related to cloud-native application delivery
|
||||||
* Bi-Weekly Meetings
|
* Bi-Weekly Meetings
|
||||||
|
|
|
@ -8,30 +8,30 @@ tags:
|
||||||
- dx
|
- dx
|
||||||
---
|
---
|
||||||
|
|
||||||
This talks looks at bootstrapping Platforms using KSere.
|
This talk looks at bootstrapping Platforms using KServe.
|
||||||
They do this in regards to AI Workflows.
|
They do this in regard to AI Workflows.
|
||||||
|
|
||||||
## Szenario
|
## Scenario
|
||||||
|
|
||||||
* Deploy AI Workloads - Sometime consiting of different parts
|
* Deploy AI Workloads - Sometime consisting of different parts
|
||||||
* Models get stored in a model registry
|
* Models get stored in a model registry
|
||||||
|
|
||||||
## Baseline
|
## Baseline
|
||||||
|
|
||||||
* Consistent APIs throughout the platform
|
* Consistent APIs throughout the platform
|
||||||
* Not the kube api directly b/c:
|
* Not the kube API directly b/c:
|
||||||
* Data scientists are a bit overpowered by the kube api
|
* Data scientists are a bit overpowered by the kube API
|
||||||
* Not only Kubernetes (also monitoring tools, feedback tools, etc)
|
* Not only Kubernetes (also monitoring tools, feedback tools, etc.)
|
||||||
* Better debugging experience for specific workloads
|
* Better debugging experience for specific workloads
|
||||||
|
|
||||||
## The debugging api
|
## The debugging API
|
||||||
|
|
||||||
* Specific API with enhanced statuses and consistent UX across Code and UI
|
* Specific API with enhanced statuses and consistent UX across Code and UI
|
||||||
* Exampüle Endpoints: Pods, Deployments, InferenceServices
|
* Example Endpoints: Pods, Deployments, InferenceServices
|
||||||
* Provides a status summary-> Consistent health info across all related ressources
|
* Provides a status summary-> Consistent health info across all related resources
|
||||||
* Example: Deployments have progress/availability, Pods have phases, Containers have readyness -> What do we interpret how?
|
* Example: Deployments have progress/availability, Pods have phases, Containers have readiness -> What do we interpret how?
|
||||||
* Evaluation: Progressing, Available Count vs Readyness, Replicafailure, Pod Phase, Container Readyness
|
* Evaluation: Progressing, Available Count vs Readiness, Replicafailure, Pod Phase, Container Readiness
|
||||||
* The rules themselfes may be pretty complex, but - since the user doesn't have to check them themselves - the status is simple
|
* The rules themselves may be pretty complex, but - since the user doesn't have to check them themselves - the status is simple
|
||||||
|
|
||||||
### Debugging Metrics
|
### Debugging Metrics
|
||||||
|
|
||||||
|
@ -47,15 +47,15 @@ They do this in regards to AI Workflows.
|
||||||
* Kine is used to replace/extend etcd with the relational dock db -> Relation namespace<->manifests is stored here and RBAC can be used
|
* Kine is used to replace/extend etcd with the relational dock db -> Relation namespace<->manifests is stored here and RBAC can be used
|
||||||
* Launchpad: Select Namespace and check resource (fuel) availability/utilization
|
* Launchpad: Select Namespace and check resource (fuel) availability/utilization
|
||||||
|
|
||||||
### Clsuter maintainance
|
### Cluster maintenance
|
||||||
|
|
||||||
* Deplyoments can be launched to multiple clusters (even two clusters at once) -> HA through identical clusters
|
* Deployments can be launched to multiple clusters (even two clusters at once) -> HA through identical clusters
|
||||||
* The excact same manifests get deployed to two clusters
|
* The exact same manifests get deployed to two clusters
|
||||||
* Cluster desired state is stored externally to enable effortless upogrades, rescale, etc
|
* Cluster desired state is stored externally to enable effortless upgrades, rescale, etc
|
||||||
|
|
||||||
### Versioning API
|
### Versioning API
|
||||||
|
|
||||||
* Basicly the dock DB
|
* Basically the dock DB
|
||||||
* CRDs are the representations of the inference manifests
|
* CRDs are the representations of the inference manifests
|
||||||
* Rollbacks, Promotion and History is managed via the CRs
|
* Rollbacks, Promotion and History is managed via the CRs
|
||||||
* Why not GitOps: Internal Diffs, deployment overrides, customized features
|
* Why not GitOps: Internal Diffs, deployment overrides, customized features
|
||||||
|
|
|
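The status summary idea from the debugging API notes in this file (deployment progress/availability, replica failure, pod phase and container readiness collapsed into one status) could be sketched roughly as follows. This is only an illustration and not the API shown in the talk: `Health`, `PodState`, `DeploymentState`, `summarize` and the order of the rules are all assumptions; only the input signals are taken from the notes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Health(Enum):
    """Hypothetical simplified status shown to the user."""
    HEALTHY = "healthy"
    PROGRESSING = "progressing"
    DEGRADED = "degraded"
    FAILED = "failed"


@dataclass
class PodState:
    phase: str                    # pod phase, e.g. "Running", "Pending", "Failed"
    containers_ready: List[bool]  # one readiness flag per container


@dataclass
class DeploymentState:
    desired: int           # desired replica count
    available: int         # available replica count
    progressing: bool      # a rollout is still in progress
    replica_failure: bool  # replicas could not be created
    pods: List[PodState]


def summarize(dep: DeploymentState) -> Health:
    """Collapse several low-level signals into one status (assumed rule order)."""
    # Hard failures win over everything else.
    if dep.replica_failure or any(p.phase == "Failed" for p in dep.pods):
        return Health.FAILED
    # Healthy only if enough replicas are available and every container is ready.
    all_ready = all(
        p.phase == "Running" and all(p.containers_ready) for p in dep.pods
    )
    if dep.available >= dep.desired and all_ready:
        return Health.HEALTHY
    # Not healthy yet, but a rollout is still moving forward.
    if dep.progressing:
        return Health.PROGRESSING
    return Health.DEGRADED
```

The rules can become as complex as needed; the caller still only sees a single `Health` value, which is the point made in the notes.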
@ -7,25 +7,25 @@ tags:
|
||||||
- db
|
- db
|
||||||
---
|
---
|
||||||
|
|
||||||
A short Talk as Part of the DOK day - presendet by the VP of CloudNative at EDB (one of the biggest PG contributors)
|
A short Talk as Part of the Data on Kubernetes day - presented by the VP of Cloud Native at EDB (one of the biggest PG contributors)
|
||||||
Stated target: Make the world your single point of failure
|
Stated target: Make the world your single point of failure
|
||||||
|
|
||||||
## Proposal
|
## Proposal
|
||||||
|
|
||||||
* Get rid of Vendor-Lockin using the oss projects PG, K8S and CnPG
|
* Get rid of Vendor-Lockin using the OSS projects PG, K8S and CnPG
|
||||||
* PG was the DB of the year 2023 and a bunch of other times in the past
|
* PG was the DB of the year 2023 and a bunch of other times in the past
|
||||||
* CnPG is a Level 5 mature operator
|
* CnPG is a Level 5 mature operator
|
||||||
|
|
||||||
## 4 Pillars
|
## 4 Pillars
|
||||||
|
|
||||||
* Seamless KubeAPI Integration (Operator PAttern)
|
* Seamless Kube API Integration (Operator Pattern)
|
||||||
* Advanced observability (Prometheus Exporter, JSON logging)
|
* Advanced observability (Prometheus Exporter, JSON logging)
|
||||||
* Declarative Config (Deploy, Scale, Maintain)
|
* Declarative Config (Deploy, Scale, Maintain)
|
||||||
* Secure by default (Robust contaienrs, mTLS, and so on)
|
* Secure by default (Robust containers, mTLS, and so on)
|
||||||
|
|
||||||
## Clusters
|
## Clusters
|
||||||
|
|
||||||
* Basic Ressource that defines name, instances, snyc and storage (and other params that have same defaults)
|
* Basic Resource that defines name, instances, sync and storage (and other parameters that have same defaults)
|
||||||
* Implementation: Operator creates:
|
* Implementation: Operator creates:
|
||||||
* The volumes (PG_Data, WAL (Write ahead log)
|
* The volumes (PG_Data, WAL (Write ahead log)
|
||||||
* Primary and Read-Write Service
|
* Primary and Read-Write Service
|
||||||
|
@ -35,15 +35,15 @@ Stated target: Make the world your single point of failure
|
||||||
* Failure detected
|
* Failure detected
|
||||||
* Stop R/W Service
|
* Stop R/W Service
|
||||||
* Promote Replica
|
* Promote Replica
|
||||||
* Activat R/W Service
|
* Activate R/W Service
|
||||||
* Kill old promary and demote to replica
|
* Kill old primary and demote to replica
|
||||||
|
|
||||||
## Backup/Recovery
|
## Backup/Recovery
|
||||||
|
|
||||||
* Continuos Backup: Write Ahead Log Backup to object store
|
* Continuous Backup: Write Ahead Log Backup to object store
|
||||||
* Physical: Create from primary or standby to object store or kube volumes
|
* Physical: Create from primary or standby to object store or kube volumes
|
||||||
* Recovery: Copy full backup and apply WAL until target (last transactio or specific timestamp) is reached
|
* Recovery: Copy full backup and apply WAL until target (last transaction or specific timestamp) is reached
|
||||||
* Replica Cluster: Basicly recreates a new cluster to a full recovery but keeps the cluster in Read-Only Replica Mode
|
* Replica Cluster: Basically recreates a new cluster to a full recovery but keeps the cluster in Read-Only Replica Mode
|
||||||
* Planned: Backup Plugin Interface
|
* Planned: Backup Plugin Interface
|
||||||
|
|
||||||
## Multi-Cluster
|
## Multi-Cluster
|
||||||
|
@ -51,21 +51,21 @@ Stated target: Make the world your single point of failure
|
||||||
* Just create a replica cluster via WAL-files from S3 on another kube cluster (lags 5 mins behind)
|
* Just create a replica cluster via WAL-files from S3 on another kube cluster (lags 5 mins behind)
|
||||||
* You can also activate replication streaming
|
* You can also activate replication streaming
|
||||||
|
|
||||||
## Reccomended architecutre
|
## Recommended architecture
|
||||||
|
|
||||||
* Dev Cluster: 1 Instance without PDB and with Continuos backup
|
* Dev Cluster: 1 Instance without PDB and with Continuous backup
|
||||||
* Prod: 3 Nodes with automatic failover and continuos backups
|
* Prod: 3 Nodes with automatic failover and continuous backups
|
||||||
* Symmetric: Two clusters
|
* Symmetric: Two clusters
|
||||||
* Primary: 3-Node Cluster
|
* Primary: 3-Node Cluster
|
||||||
* Secondary: WAL-Based 3-Node Cluster with a designated primary (to take over if primary cluster fails)
|
* Secondary: WAL based 3-Node Cluster with a designated primary (to take over if primary cluster fails)
|
||||||
* Symmetric Streaming: Same as Secondary, but you manually enable the streaming api for live replication
|
* Symmetric Streaming: Same as Secondary, but you manually enable the streaming API for live replication
|
||||||
* Cascading Replication: Scale Symmetric to more clusters
|
* Cascading Replication: Scale Symmetric to more clusters
|
||||||
* Single availability zone: Well, do your best to spread to nodes and aspire to streched kubernetes to more AZs
|
* Single availability zone: Well, do your best to spread to nodes and aspire to stretched Kubernetes to more AZs
|
||||||
|
|
||||||
## Roadmap
|
## Roadmap
|
||||||
|
|
||||||
* Replica Cluster (Symmetric) Switchover
|
* Replica Cluster (Symmetric) Switchover
|
||||||
* Synchronous Symmetric
|
* Synchronous Symmetric
|
||||||
* 3rd PArty Plugins
|
* 3rd Party Plugins
|
||||||
* Manage DBs via the Operator
|
* Manage DBs via the Operator
|
||||||
* Storage Autoscaling
|
* Storage Autoscaling
|
||||||
|
|
|
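The recovery step from the Backup/Recovery notes in this file (copy the full backup, then apply WAL until the last transaction or a specific timestamp is reached) can be pictured with a toy model. This is a sketch under heavy assumptions and not how PostgreSQL or CnPG implement recovery: the database is modelled as a plain dictionary, and `WalRecord`, `recover` and the integer timestamps are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class WalRecord:
    timestamp: int  # commit time of the change (hypothetical integer clock)
    key: str
    value: str


def recover(
    base_backup: Dict[str, str],
    wal: List[WalRecord],
    target_time: Optional[int] = None,
) -> Dict[str, str]:
    """Copy the base backup, then replay WAL records until the target is reached.

    With target_time=None the whole log is replayed (the last transaction).
    """
    state = dict(base_backup)  # work on a copy, the backup itself stays untouched
    for record in wal:         # the WAL is assumed to be ordered by timestamp
        if target_time is not None and record.timestamp > target_time:
            break              # stop at the requested point in time
        state[record.key] = record.value
    return state
```

In this picture a replica cluster is the same replay loop that never stops: it keeps applying new WAL records as they arrive and stays read-only.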
@ -4,14 +4,14 @@ weight: 9
|
||||||
---
|
---
|
||||||
|
|
||||||
> When I say serverless I don't mean lambda - I mean serverless
|
> When I say serverless I don't mean lambda - I mean serverless
|
||||||
> That is thousands of lines of yaml - but I don't want to depress you
|
> That is thousands of lines of YAML - but I don't want to depress you
|
||||||
> It will be eventually done
|
> It will be eventually done
|
||||||
> Imagine this error is not happening
|
> Imagine this error is not happening
|
||||||
> Just imagine how I did this last night
|
> Just imagine how I did this last night
|
||||||
|
|
||||||
## Goal
|
## Goal
|
||||||
|
|
||||||
* Take my sourcecode and run it, scale it - jsut don't ask me
|
* Take my source code and run it, scale it - just don't ask me
|
||||||
|
|
||||||
## Baseline
|
## Baseline
|
||||||
|
|
||||||
|
@ -22,7 +22,7 @@ weight: 9
|
||||||
|
|
||||||
## Open function
|
## Open function
|
||||||
|
|
||||||
> The glue between different tools to achive serverless
|
> The glue between different tools to achieve serverless
|
||||||
|
|
||||||
* CRD that describes:
|
* CRD that describes:
|
||||||
* Build this image and push it to the registry
|
* Build this image and push it to the registry
|
||||||
|
@ -35,8 +35,8 @@ weight: 9
|
||||||
|
|
||||||
* Open Questions
|
* Open Questions
|
||||||
* Where are the serverless servers -> Cluster, dependencies, secrets
|
* Where are the serverless servers -> Cluster, dependencies, secrets
|
||||||
* How do I create DBs, etc
|
* How do I create DBs, etc.
|
||||||
* Resulting needs
|
* Resulting needs
|
||||||
* Cluster aaS (using crossplane - in this case using aws)
|
* CLUSTERaaS (using crossplane - in this case using AWS)
|
||||||
* DBaaS (using crossplane - again usig pq on aws)
|
* DBaaS (using crossplane - again using pg on AWS)
|
||||||
* App aaS
|
* APPaaS
|
||||||
|
|
|
@ -14,21 +14,21 @@ Another talk as part of the Data On Kubernetes Day.
|
||||||
|
|
||||||
* Managed: Atlas
|
* Managed: Atlas
|
||||||
* Semi: Cloud manager
|
* Semi: Cloud manager
|
||||||
* Selfhosted: Enterprise and community operator
|
* Self-hosted: Enterprise and community operator
|
||||||
|
|
||||||
### Mongo on K8s
|
### MongoDB on K8s
|
||||||
|
|
||||||
* Cluster Architecture
|
* Cluster Architecture
|
||||||
* Control Plane: Operator
|
* Control Plane: Operator
|
||||||
* Data Plane: MongoDB Server + Agen (Sidecar Proxy)
|
* Data Plane: MongoDB Server + Agent (Sidecar Proxy)
|
||||||
* Enterprise Operator
|
* Enterprise Operator
|
||||||
* Opsmanager CR: Deploys 3-node operator DB and OpsManager
|
* OpsManager CR: Deploys 3-node operator DB and OpsManager
|
||||||
* MongoDB CR: The MongoDB cLusters (Compromised of agents)
|
* MongoDB CR: The MongoDB clusters (Compromised of agents)
|
||||||
* Advanced Usecase: Data Platform with mongodb on demand
|
* Advanced use case: Data Platform with MongoDB on demand
|
||||||
* Control Plane on one cluster (or on VMs/Hardmetal), data plane in tennant clusters
|
* Control Plane on one cluster (or on VMs/Bare-metal), data plane in tenant clusters
|
||||||
* Result: MongoDB CR can not relate to OpsManager CR directly
|
* Result: MongoDB CR can not relate to OpsManager CR directly
|
||||||
|
|
||||||
## Pitfalls
|
## Pitfalls
|
||||||
|
|
||||||
* Storage: Agnostic, Topology aware, configureable and resizeable (can't be done with statefulset)
|
* Storage: Agnostic, Topology aware, configurable and resizable (can't be done with Statefulset)
|
||||||
* Networking: Cluster-internal (Pod to Pod/Service), External (Split horizon over multicluster)
|
* Networking: Cluster-internal (Pod to Pod/Service), External (Split horizon over multicluster)
|
||||||
|
|
|
@ -9,8 +9,8 @@ tags:
|
||||||
|
|
||||||
## CNCF Platform maturity model
|
## CNCF Platform maturity model
|
||||||
|
|
||||||
* Was donated to the cncf by syntasso
|
* Was donated to the CNCF by Syntasso
|
||||||
* Constantly evolving since 1.0 in november 2023
|
* Constantly evolving since 1.0 in November 2023
|
||||||
|
|
||||||
### Overview
|
### Overview
|
||||||
|
|
||||||
|
@ -25,7 +25,7 @@ tags:
|
||||||
* Investment: How are funds/staff allocated to platform capabilities
|
* Investment: How are funds/staff allocated to platform capabilities
|
||||||
* Adoption: How and why do users discover this platform
|
* Adoption: How and why do users discover this platform
|
||||||
* Interfaces: How do users interact with and consume platform capabilities
|
* Interfaces: How do users interact with and consume platform capabilities
|
||||||
* Operations: How are platforms and capabilities planned, prioritzed, developed and maintained
|
* Operations: How are platforms and capabilities planned, prioritized, developed and maintained
|
||||||
* Measurement: What is the process for gathering and incorporating feedback/learning?
|
* Measurement: What is the process for gathering and incorporating feedback/learning?
|
||||||
|
|
||||||
## Goals
|
## Goals
|
||||||
|
@ -34,24 +34,24 @@ tags:
|
||||||
* Outcomes & Practices
|
* Outcomes & Practices
|
||||||
* Where are you at
|
* Where are you at
|
||||||
* Limits & Opportunities
|
* Limits & Opportunities
|
||||||
* Behaviours and outcome
|
* Behaviors and outcome
|
||||||
* Balance People and processes
|
* Balance People and processes
|
||||||
|
|
||||||
## Typical Journeys
|
## Typical Journeys
|
||||||
|
|
||||||
### Steps of the jurney
|
### Steps of the journey
|
||||||
|
|
||||||
1. What are your goals and limitations
|
1. What are your goals and limitations
|
||||||
2. What is my current landscape
|
2. What is my current landscape
|
||||||
3. Plan baby steps & iterate
|
3. Plan baby steps & iterate
|
||||||
|
|
||||||
### Szenarios
|
### Scenarios
|
||||||
|
|
||||||
* Bad: I want to improve my k8s platform
|
* Bad: I want to improve my k8s platform
|
||||||
* Good: Scaling an enterprise COE (Center Of Excellence)
|
* Good: Scaling an enterprise COE (Center Of Excellence)
|
||||||
* What: Onboard 20 Teams within 20 Months and enforce 8 security regulations
|
* What: Onboard 20 Teams within 20 Months and enforce 8 security regulations
|
||||||
* Where: We have a dedicated team of centrally funded people
|
* Where: We have a dedicated team of centrally funded people
|
||||||
* Lay the foundation: More funding for more larger teams -> Switch from Project to platform mindset
|
* Lay the foundation: More funding for more, larger teams -> Switch from Project to platform mindset
|
||||||
* Do your technical Due diligence in parallel
|
* Do your technical Due diligence in parallel
|
||||||
|
|
||||||
## Key Lessons
|
## Key Lessons
|
||||||
|
@ -60,8 +60,8 @@ tags:
|
||||||
* Know your landscape
|
* Know your landscape
|
||||||
* Plan in baby steps and iterate
|
* Plan in baby steps and iterate
|
||||||
* Lay the foundation for building the right thing and not just anything
|
* Lay the foundation for building the right thing and not just anything
|
||||||
* Dont forget to do your technical dd in parallel
|
* Don't forget to do your technical dd in parallel
|
||||||
|
|
||||||
## Conclusion
|
## Conclusion
|
||||||
|
|
||||||
* Majurity model is a helpful part but not the entire plan
|
* Maturity model is a helpful part but not the entire plan
|
||||||
|
|
|
@ -6,14 +6,14 @@ tags:
|
||||||
- network
|
- network
|
||||||
---
|
---
|
||||||
|
|
||||||
Held by Cilium regarding ebpf and hubble
|
Held by Cilium regarding eBPF and Hubble
|
||||||
|
|
||||||
## eBPF
|
## eBPF
|
||||||
|
|
||||||
> Extend the capabilities of the kernel without requiring to change the kernel source code or load modules
|
> Extend the capabilities of the kernel without requiring to change the kernel source code or load modules
|
||||||
|
|
||||||
* Benefits: Reduce performance overhead, gain deep visibility while being widely available
|
* Benefits: Reduce performance overhead, gain deep visibility while being widely available
|
||||||
* Example Tools: Parca (Profiling), Cilium (Networking), Hubble (Opservability), Tetragon (Security)
|
* Example Tools: Parca (Profiling), Cilium (Networking), Hubble (Observability), Tetragon (Security)
|
||||||
|
|
||||||
## Cilium
|
## Cilium
|
||||||
|
|
||||||
|
@ -27,22 +27,22 @@ Held by Cilium regarding ebpf and hubble
|
||||||
|
|
||||||
* CLI: TCP-Dump on steroids + API Client
|
* CLI: TCP-Dump on steroids + API Client
|
||||||
* UI: Graphical dependency and connectivity map
|
* UI: Graphical dependency and connectivity map
|
||||||
* Prometheus + Grafana + Opentelemetry compatible
|
* Prometheus + Grafana + OpenTelemetry compatible
|
||||||
* Metrics up to L7
|
* Metrics up to L7
|
||||||
|
|
||||||
### Where can it be used
|
### Where can it be used
|
||||||
|
|
||||||
* Service dependency with frequency
|
* Service dependency with frequency
|
||||||
* Kinds of http calls
|
* Kinds of HTTP calls
|
||||||
* Network Problems between L4 and L7 (including DNS)
|
* Network Problems between L4 and L7 (including DNS)
|
||||||
* Application Monitoring through status codes and latency
|
* Application Monitoring through status codes and latency
|
||||||
* Security-Related Network Blocks
|
* Security-Related Network Blocks
|
||||||
* Services accessed from outside the cluser
|
* Services accessed from outside the cluster
|
||||||
|
|
||||||
### Architecture
|
### Architecture
|
||||||
|
|
||||||
* Cilium Agent: Runs as the CNI für all Pods
|
* Cilium Agent: Runs as the CNI for all Pods
|
||||||
* Server: Runs on each node and retrieves the ebpf from cilium
|
* Server: Runs on each node and retrieves the eBPF from cilium
|
||||||
* Relay: Provide visibility throughout all nodes
|
* Relay: Provide visibility throughout all nodes
|
||||||
|
|
||||||
## TL;DR
|
## TL;DR
|
||||||
|
|
|
@ -7,10 +7,10 @@ weight: 1
|
||||||
Day one is the Day for co-located events aka CloudNativeCon.
|
Day one is the Day for co-located events aka CloudNativeCon.
|
||||||
I spent most of the day attending the Platform Engineering Day - as one might have guessed it's all about platform engineering.
|
I spent most of the day attending the Platform Engineering Day - as one might have guessed it's all about platform engineering.
|
||||||
|
|
||||||
Everything started with badge pickup - a very smooth experence (but that may be related to me showing up an hour or so too early).
|
Everything started with badge pickup - a very smooth experience (but that may be related to me showing up an hour or so too early).
|
||||||
|
|
||||||
## Talk reccomandations
|
## Talk recommendations
|
||||||
|
|
||||||
* Beyond Platform Thinking...
|
* Beyond Platform Thinking...
|
||||||
* Hitchhikers Guide to ...
|
* Hitchhiker's Guide to ...
|
||||||
* To K8S and beyond...
|
* To K8S and beyond...
|
||||||
|
|
|
@ -6,12 +6,12 @@ tags:
|
||||||
- opening
|
- opening
|
||||||
---
|
---
|
||||||
|
|
||||||
The opening keynote started - as is the tradition with keynotes - with an "motivational" opening video.
|
The opening keynote started - as is the tradition with keynotes - with a "motivational" opening video.
|
||||||
The keynote itself was presented by the CEO of the CNCF.
|
The keynote itself was presented by the CEO of the CNCF.
|
||||||
|
|
||||||
## The numbers
|
## The numbers
|
||||||
|
|
||||||
* Over 2000 attendees
|
* Over 12000 attendees
|
||||||
* 10 Years of Kubernetes
|
* 10 Years of Kubernetes
|
||||||
* 60% of large organizations expect rapid cost increases due to AI/ML (FinOps Survey)
|
* 60% of large organizations expect rapid cost increases due to AI/ML (FinOps Survey)
|
||||||
|
|
||||||
|
@ -26,10 +26,10 @@ The keynote itself was presented by the CEO of the CNCF.
|
||||||
## Live demo
|
## Live demo
|
||||||
|
|
||||||
* KIND cluster on desktop
|
* KIND cluster on desktop
|
||||||
* Protptype Stack (develop on client)
|
* Prototype Stack (develop on client)
|
||||||
* Kubernetes with the LLM
|
* Kubernetes with the LLM
|
||||||
* Host with LLVA (image describe model), moondream and OLLAMA (the model manager/registry()
|
* Host with LLAVA (image describe model), moondream and OLLAMA (the model manager/registry)
|
||||||
* Prod Stack (All in kube)
|
* Prod Stack (All in kube)
|
||||||
* Kubernetes with LLM, LLVA, OLLAMA, moondream
|
* Kubernetes with LLM, LLVA, OLLAMA, moondream
|
||||||
* Available Models: llava, mistral bokllava (llava*mistral)
|
* Available Models: LLAVA, mistral bokllava (LLAVA*mistral)
|
||||||
* Host takes picture, ai describes what is pictures (in our case the conference audience)
|
* Host takes picture, AI describes what is pictured (in our case the conference audience)
|
||||||
|
|
|
@ -7,7 +7,7 @@ tags:
|
||||||
- panel
|
- panel
|
||||||
---
|
---
|
||||||
|
|
||||||
A podium discussion (somewhat scripted) lead by Pryanka
|
A podium discussion (somewhat scripted) led by Priyanka
|
||||||
|
|
||||||
## Guests
|
## Guests
|
||||||
|
|
||||||
|
@ -17,24 +17,24 @@ A podium discussion (somewhat scripted) lead by Pryanka
|
||||||
|
|
||||||
## Discussion
|
## Discussion
|
||||||
|
|
||||||
* What do you use as the base of dev for ollama
|
* What do you use as the base of dev for OLLAMA
|
||||||
* Jeff: The concepts from docker, git, kubernetes
|
* Jeff: The concepts from docker, git, Kubernetes
|
||||||
* How is the balance between ai engi and ai ops
|
* How is the balance between AI engineer and AI ops
|
||||||
* Jeff: The classic dev vs ops devide, many ML-Engi don't think about
|
* Jeff: The classic dev vs ops divide, many ML engineers don't think about it
|
||||||
* Paige: Yessir
|
* Paige: Yessir
|
||||||
* How does infra keep up with the fast research
|
* How does infra keep up with the fast research
|
||||||
* Paige: Well, they don't - but they do their best and Cloud native is cool
|
* Paige: Well, they don't - but they do their best and Cloud native is cool
|
||||||
* Jeff: Well we're not google, but kubernetes is the saviour
|
* Jeff: Well, we're not Google, but Kubernetes is the savior
|
||||||
* What are scaling constraints
|
* What are scaling constraints
|
||||||
* Jeff: Currently sizing of models is still in it's infancy
|
* Jeff: Currently sizing of models is still in its infancy
|
||||||
* Jeff: There will be more specific hardware and someone will have to support it
|
* Jeff: There will be more specific hardware and someone will have to support it
|
||||||
* Paige: Sizing also depends on latency needs (code autocompletion vs performance optimization)
|
* Paige: Sizing also depends on latency needs (code autocompletion vs performance optimization)
|
||||||
* Paige: Optimization of smaller models
|
* Paige: Optimization of smaller models
|
||||||
* What technologies need to be open source licensed
|
* What technologies need to be open source licensed
|
||||||
* Jeff: The model b/c access and trust
|
* Jeff: The model b/c access and trust
|
||||||
* Tim: The models and base execution environemtn -> Vendor agnosticism
|
* Tim: The models and base execution environment -> Vendor agnosticism
|
||||||
* Paige: Yes and remixes are really imporant for development
|
* Paige: Yes and remixes are really important for development
|
||||||
* Anything else
|
* Anything else
|
||||||
* Jeff: How do we bring our awesome tools (monitoring, logging, security) to the new AI world
|
* Jeff: How do we bring our awesome tools (monitoring, logging, security) to the new AI world
|
||||||
* Paige: Currently many people just use paid apis to abstract the infra, but we need this stuff selfhostable
|
* Paige: Currently many people just use paid APIs to abstract the infra, but we need this stuff self-hostable
|
||||||
* Tim: I don'T want to know about the hardware, the whole infra side should be done by the cloudnative teams to let ML-Engi to just be ML-Engine
|
* Tim: I don't want to know about the hardware, the whole infra side should be done by the cloud native teams to let ML engineers just be ML engineers
|
||||||
|
|
|
@ -9,7 +9,7 @@ tags:
|
||||||
|
|
||||||
Kevin and Sanjay from NVIDIA
|
Kevin and Sanjay from NVIDIA
|
||||||
|
|
||||||
## Enabeling GPUs in Kubernetes today
|
## Enabling GPUs in Kubernetes today
|
||||||
|
|
||||||
* Host level components: Toolkit, drivers
|
* Host level components: Toolkit, drivers
|
||||||
* Kubernetes components: Device plugin, feature discovery, node selector
|
* Kubernetes components: Device plugin, feature discovery, node selector
|
||||||
|
@ -18,24 +18,24 @@ Kevin and Sanjay from NVIDIA
|
||||||
## GPU sharing
|
## GPU sharing
|
||||||
|
|
||||||
* Time slicing: Switch around by time
|
* Time slicing: Switch around by time
|
||||||
* Multi Process Service: Run allways on the GPU but share (space-)
|
* Multi Process Service: Always run on the GPU but share (space-)
|
||||||
* Multi Instance GPU: Space-seperated sharing on the hardware
|
* Multi Instance GPU: Space-seperated sharing on the hardware
|
||||||
* Virtual GPU: Virtualices Time slicing or MIG
|
* Virtual GPU: Virtualizes Time slicing or MIG
|
||||||
* CUDA Streams: Run multiple kernels in a single app
|
* CUDA Streams: Run multiple kernels in a single app
|
||||||
|
|
||||||
## Dynamic resource allocation
|
## Dynamic resource allocation
|
||||||
|
|
||||||
* A new alpha feature since Kube 1.26 for dynamic ressource requesting
|
* A new alpha feature since Kube 1.26 for dynamic resource requesting
|
||||||
* You just request a ressource via the API and have fun
|
* You just request a resource via the API and have fun
|
||||||
* The sharing itself is an implementation detail
|
* The sharing itself is an implementation detail
|
||||||
|
|
||||||
## GPU scale out challenges
|
## GPU scale-out challenges
|
||||||
|
|
||||||
* NVIDIA Picasso is a foundry for model creation powered by Kubernetes
|
* NVIDIA Picasso is a foundry for model creation powered by Kubernetes
|
||||||
* The workload is the training workload split into batches
|
* The workload is the training workload split into batches
|
||||||
* Challenge: Schedule multiple training jobs by different users that are prioritized
|
* Challenge: Schedule multiple training jobs by different users that are prioritized
|
||||||
|
|
||||||
### Topology aware placments
|
### Topology aware placements
|
||||||
|
|
||||||
* You need thousands of GPUs, a typical Node has 8 GPUs with fast NVLink communication - beyond that switching
|
* You need thousands of GPUs, a typical Node has 8 GPUs with fast NVLink communication - beyond that switching
|
||||||
* Target: optimize related jobs based on GPU node distance and NUMA placement
|
* Target: optimize related jobs based on GPU node distance and NUMA placement
|
||||||
|
@ -44,11 +44,11 @@ Kevin and Sanjay from NVIDIA
|
||||||
|
|
||||||
* Stuff can break, resulting in slowdowns or errors
|
* Stuff can break, resulting in slowdowns or errors
|
||||||
* Challenge: Detect faults and handle them
|
* Challenge: Detect faults and handle them
|
||||||
* Observability both in-band and out ouf band that expose node conditions in kubernetes
|
* Observability both in-band and out of band that expose node conditions in Kubernetes
|
||||||
* Needed: Automated fault-tolerant scheduling
|
* Needed: Automated fault-tolerant scheduling
|
||||||
|
|
||||||
### Multi-dimensional optimization
|
### Multidimensional optimization
|
||||||
|
|
||||||
* There are different KPIs: starvation, prioprity, occupanccy, fainrness
|
* There are different KPIs: starvation, priority, occupancy, fairness
|
||||||
* Challenge: What to choose (the multi-dimensional decision problemn)
|
* Challenge: What to choose (the multidimensional decision problem)
|
||||||
* Needed: A scheduler that can balance the dimensions
|
* Needed: A scheduler that can balance the dimensions
|
||||||
|
|
|
@ -20,6 +20,6 @@ Jorge Palma from Microsoft with a quick introduction.
|
||||||
* Kubernetes operator that interacts with
|
* Kubernetes operator that interacts with
|
||||||
* Node provisioner
|
* Node provisioner
|
||||||
* Deployment
|
* Deployment
|
||||||
* Simple CRD that decribes a model, infra and have fun
|
* Simple CRD that describes a model, infra and have fun
|
||||||
* Creates inferance endpoint
|
* Creates inference endpoint
|
||||||
* Models are currently 10 (Hugginface, LLMA, etc)
|
* Models are currently 10 (Hugginface, LLMA, etc.)
|
||||||
|
|
|
@ -6,14 +6,14 @@ tags:
|
||||||
- panel
|
- panel
|
||||||
---
|
---
|
||||||
|
|
||||||
A panel discussion with moderation by Google and participants from Google, Alluxio, Apmpere and CERN.
|
A panel discussion with moderation by Google and participants from Google, Alluxio, Ampere and CERN.
|
||||||
It was pretty scripted with prepared (sponsor specific) slides for each question answered.
|
It was pretty scripted with prepared (sponsor specific) slides for each question answered.
|
||||||
|
|
||||||
## Takeaways
|
## Takeaways
|
||||||
|
|
||||||
* Deploying a ML should become the new deploy a web app
|
* Deploying an ML model should become the new deploying a web app
|
||||||
* The hardware should be fully utilized -> Better ressource sharing and scheduling
|
* The hardware should be fully utilized -> Better resource sharing and scheduling
|
||||||
* Smaller LLMs on cpu only is preyy cost efficient
|
* Smaller LLMs on CPU only is pretty cost-efficient
|
||||||
* Better scheduling by splitting into storage + cpu (prepare) and gpu (run) nodes to create a just-in-time flow
|
* Better scheduling by splitting into storage + CPU (prepare) and GPU (run) nodes to create a just-in-time flow
|
||||||
* Software acceleration is cool, but we should use more specialized hardware and models to run on CPUs
|
* Software acceleration is cool, but we should use more specialized hardware and models to run on CPUs
|
||||||
* We should be flexible regarding hardware, multi-cluster workloads and hybrig (onprem, burst to cloud) workloads
|
* We should be flexible regarding hardware, multi-cluster workloads and hybrid (onprem, burst to cloud) workloads
|
||||||
|
|
|
@ -5,21 +5,21 @@ tags:
|
||||||
- keynote
|
- keynote
|
||||||
---
|
---
|
||||||
|
|
||||||
Nikhita presented projects that merge CloudNative and AI.
|
Nikhita presented projects that merge cloud native and AI.
|
||||||
PAtrick Ohly Joined for DRA
|
Patrick Ohly Joined for DRA
|
||||||
|
|
||||||
### The "news"
|
### The "news"
|
||||||
|
|
||||||
* New work group AI
|
* New work group AI
|
||||||
* More tools are including ai features
|
* More tools are including AI features
|
||||||
* New updated cncf for children feat AI
|
* New updated CNCF for children feat AI
|
||||||
* One decade of Kubernetes
|
* One decade of Kubernetes
|
||||||
* DRA is in alpha
|
* DRA is in alpha
|
||||||
|
|
||||||
### DRA
|
### DRA
|
||||||
|
|
||||||
* A new API for resources (node-local and node-attached)
|
* A new API for resources (node-local and node-attached)
|
||||||
* Sharing of ressources between cods and containers
|
* Sharing of resources between pods and containers
|
||||||
* Vendor specific stuff are abstracted by a vendor driver controller
|
* Vendor specific stuff are abstracted by a vendor driver controller
|
||||||
* The kube scheduler can interact with the vendor parameters for scheduling and autoscaling
|
* The kube scheduler can interact with the vendor parameters for scheduling and autoscaling
|
||||||
|
|
||||||
|
@ -28,18 +28,18 @@ PAtrick Ohly Joined for DRA
|
||||||
* Kube is the seed for the AI infra plant
|
* Kube is the seed for the AI infra plant
|
||||||
* Kubeflow users wanted AI registries
|
* Kubeflow users wanted AI registries
|
||||||
* LLM on the edge
|
* LLM on the edge
|
||||||
* Opentelemetry bring semandtics
|
* OpenTelemetry brings semantics
|
||||||
* All of these tools form a symbiosis between
|
* All of these tools form a symbiosis between
|
||||||
* Topics of discussions
|
* Topics of discussions
|
||||||
|
|
||||||
### The working group AI
|
### The working group AI
|
||||||
|
|
||||||
* It was formed in october 2023
|
* It was formed in October 2023
|
||||||
* They are working on the whitepaper (cloudnative and ai) wich was opublished on 19.03.2024
|
* They are working on the white paper (cloud native and AI) which was published on 19.03.2024
|
||||||
* The landscape "cloudnative and ai" is WIP and will be merged into the main CNCF landscape
|
* The landscape "cloud native and AI" is WIP and will be merged into the main CNCF landscape
|
||||||
* The future focus will be on security and cost efficiency (with a hint of sustainability)
|
* The future focus will be on security and cost efficiency (with a hint of sustainability)
|
||||||
|
|
||||||
### LFAI and CNCF
|
### LFAI and CNCF
|
||||||
|
|
||||||
* The direcor of the AI foundation talks abouzt ai and cloudnative
|
* The director of the AI foundation talks about AI and cloud native
|
||||||
* They are looking forward to more colaboraion
|
* They are looking forward to more collaboration
|
||||||
|
|
|
@ -14,7 +14,7 @@ The entire talk was very short, but it was a nice demo of init containers
|
||||||
* Security is hard - distroless sounds like a nice helper
|
* Security is hard - distroless sounds like a nice helper
|
||||||
* Basic Challenge: Usability-Security Dilemma -> But more usability doesn't mean less secure, but more updating
|
* Basic Challenge: Usability-Security Dilemma -> But more usability doesn't mean less secure, but more updating
|
||||||
* Distro: Kernel + Software Packages + Package manager (optional) -> In Containers just without the kernel
|
* Distro: Kernel + Software Packages + Package manager (optional) -> In Containers just without the kernel
|
||||||
* Distroless: No package manager, no shell, no webcluent (curl/wget) - only minimal sofware bundels
|
* Distroless: No package manager, no shell, no web client (curl/wget) - only minimal software bundles
|
||||||
|
|
||||||
## Tools for distroless image creation
|
## Tools for distroless image creation
|
||||||
|
|
||||||
|
@ -29,13 +29,13 @@ The entire talk was very short, but it was a nice demo of init containers
|
||||||
|
|
||||||
## Demo
|
## Demo
|
||||||
|
|
||||||
* A (rough) distroless postgres with alpine build step and scratch final step
|
* A (rough) distroless Postgres with alpine build step and scratch final step
|
||||||
* A basic pg:alpine container used for init with a shared data volume
|
* A basic pg:alpine container used for init with a shared data volume
|
||||||
* The init uses the pg admin user to initialize the pg server (you don't need the admin creds after this)
|
* The init uses the pg admin user to initialize the pg server (you don't need the admin credentials after this)
|
||||||
|
|
||||||
### Kube
|
### Kube
|
||||||
|
|
||||||
* K apply failed b/c no internet, but was fixed by connecting to wifi
|
* K apply failed b/c no internet, but was fixed by connecting to Wi-Fi
|
||||||
* Without the init container the pod just crashes, with the init container the correct config gets created
|
* Without the init container the pod just crashes, with the init container the correct config gets created
|
||||||
|
|
||||||
### Docker compose
|
### Docker compose
|
||||||
|
|
|
@ -13,63 +13,63 @@ A talk by elastic.
|
||||||
|
|
||||||
## About elastic
|
## About elastic
|
||||||
|
|
||||||
* Elestic cloud as a managed service
|
* Elastic cloud as a managed service
|
||||||
* Deployed across AWS/GCP/Azure in over 50 regions
|
* Deployed across AWS/GCP/Azure in over 50 regions
|
||||||
* 600.000+ Containers
|
* 600000+ Containers
|
||||||
|
|
||||||
### Elastic and Kube
|
### Elastic and Kube
|
||||||
|
|
||||||
* They offer elastic obervability
|
* They offer elastic observability
|
||||||
* They offer the ECK operator for simplified deployments
|
* They offer the ECK operator for simplified deployments
|
||||||
|
|
||||||
## The baseline
|
## The baseline
|
||||||
|
|
||||||
* Goal: A large scale (1M+ containers resilient platform on k8s
|
* Goal: A large scale (1M+ containers) resilient platform on k8s
|
||||||
* Architecture
|
* Architecture
|
||||||
* Global Control: The control plane (api) for users with controllers
|
* Global Control: The control plane (API) for users with controllers
|
||||||
* Regional Apps: The "shitload" of kubernetes clusters where the actual customer services live
|
* Regional Apps: The "shitload" of Kubernetes clusters where the actual customer services live
|
||||||
|
|
||||||
## Scalability
|
## Scalability
|
||||||
|
|
||||||
* Challenge: How large can our cluster be, how many clusters do we need
|
* Challenge: How large can our cluster be, how many clusters do we need
|
||||||
* Problem: Only basic guidelines exist for that
|
* Problem: Only basic guidelines exist for that
|
||||||
* Decision: Horizontaly scale the number of clusters (5ßß-1K nodes each)
|
* Decision: Horizontally scale the number of clusters (500-1K nodes each)
|
||||||
* Decision: Disposable clusters
|
* Decision: Disposable clusters
|
||||||
* Throw away without data loss
|
* Throw away without data loss
|
||||||
* Single source of throuth is not cluster etcd but external -> No etcd backups needed
|
* Single source of truth is not cluster etcd but external -> No etcd backups needed
|
||||||
* Everything can be recreated any time
|
* Everything can be recreated any time
|
||||||
|
|
||||||
## Controllers
|
## Controllers
|
||||||
|
|
||||||
{{% notice style="note" %}}
|
{{% notice style="note" %}}
|
||||||
I won't copy the explanations of operators/controllers in this notes
|
I won't copy the explanations of operators/controllers in these notes
|
||||||
{{% /notice %}}
|
{{% /notice %}}
|
||||||
|
|
||||||
* Many different controllers, including (but not limited to)
|
* Many controllers, including (but not limited to)
|
||||||
* cluster controler: Register cluster to controller
|
* cluster controller: Register cluster to controller
|
||||||
* Project controller: Schedule user's project to cluster
|
* Project controller: Schedule user's project to cluster
|
||||||
* Product controllers (Elasticsearch, Kibana, etc.)
|
* Product controllers (Elasticsearch, Kibana, etc.)
|
||||||
* Ingress/Cert manager
|
* Ingress/Cert manager
|
||||||
* Sometimes controllers depend on controllers -> potential complexity
|
* Sometimes controllers depend on controllers -> potential complexity
|
||||||
* Pro:
|
* Pro:
|
||||||
* Resilient (Selfhealing)
|
* Resilient (Self-healing)
|
||||||
* Level triggered (desired state vs procedure triggered)
|
* Level triggered (desired state vs procedure triggered)
|
||||||
* Simple reasoning when comparing desired state vs state machine
|
* Simple reasoning when comparing desired state vs state machine
|
||||||
* Official controller runtime lib
|
* Official controller runtime lib
|
||||||
* Workque: Automatic Dedup, Retry backoff and so on
|
* Workqueue: Automatic Dedup, Retry back off and so on
|
||||||
|
|
||||||
## Global Controllers
|
## Global Controllers
|
||||||
|
|
||||||
* Basic operation
|
* Basic operation
|
||||||
* Uses project config from Elastic cloud as the desired state
|
* Uses project config from Elastic cloud as the desired state
|
||||||
* The actual state is a k9s ressource in another cluster
|
* The actual state is a k8s resource in another cluster
|
||||||
* Challenge: Where is the source of thruth if the data is not stored in etc
|
* Challenge: Where is the source of truth if the data is not stored in etcd
|
||||||
* Solution: External datastore (postgres)
|
* Solution: External data store (Postgres)
|
||||||
* Challenge: How do we sync the db sources to kubernetes
|
* Challenge: How do we sync the db sources to Kubernetes
|
||||||
* Potential solutions: Replace etcd with the external db
|
* Potential solutions: Replace etcd with the external db
|
||||||
* Chosen solution:
|
* Chosen solution:
|
||||||
* The controllers don't use CRDs for storage, but they expose a webapi
|
* The controllers don't use CRDs for storage, but they expose a web-API
|
||||||
* Reconciliation still now interacts with the external db and go channels (que) instead
|
* Reconciliation now interacts with the external db and Go channels (queue) instead
|
||||||
* Then the CRs for the operators get created by the global controller
|
* Then the CRs for the operators get created by the global controller
|
||||||
|
|
||||||
### Large scale
|
### Large scale
|
||||||
|
@ -82,10 +82,10 @@ I won't copy the explanations of operators/controllers in this notes
|
||||||
### Reconcile
|
### Reconcile
|
||||||
|
|
||||||
* User-driven events are processed asap
|
* User-driven events are processed asap
|
||||||
* reconcole of everything should happen, bus with low prio slowly in the background
|
* Reconcile of everything should happen, but with low priority, slowly in the background
|
||||||
* Solution: Status: LastReconciledRevision (timestamp) get's compare to revision, if larger -> User change
|
* Solution: Status: LastReconciledRevision (timestamp) gets compared to revision, if larger -> User change
|
||||||
* Prioritization: Just a custom event handler with the normal queue and a low prio
|
* Prioritization: Just a custom event handler with the normal queue and a low priority queue
|
||||||
* Low Prio Queue: Just a queue that adds items to the normal work-queue with a rate limit
|
* Low priority queue: Just a queue that adds items to the normal work-queue with a rate limit
|
||||||
|
|
||||||
```mermaid
|
```mermaid
|
||||||
flowchart LR
|
flowchart LR
|
||||||
|
|
|
@ -6,28 +6,28 @@ tags:
|
||||||
- security
|
- security
|
||||||
---
|
---
|
||||||
|
|
||||||
A talk by Google and Microsoft with the premise of bether auth in k8s.
|
A talk by Google and Microsoft with the premise of better auth in k8s.
|
||||||
|
|
||||||
## Baselines
|
## Baselines
|
||||||
|
|
||||||
* Most access controllers have read access to all secrets -> They are not really designed for keeping these secrets
|
* Most access controllers have read access to all secrets -> They are not really designed for keeping these secrets
|
||||||
* Result: CVEs
|
* Result: CVEs
|
||||||
* Example: Just use ingress, nginx, put in some lua code in the config and voila: Service account token
|
* Example: Just use ingress, nginx, put in some Lua code in the config and voilà: Service account token
|
||||||
* Fix: No more fun
|
* Fix: No more fun
|
||||||
|
|
||||||
## Basic solutions
|
## Basic solutions
|
||||||
|
|
||||||
* Seperate Control (the controller) from data (the ingress)
|
* Separate Control (the controller) from data (the ingress)
|
||||||
* Namespace limited ingress
|
* Namespace limited ingress
|
||||||
|
|
||||||
## Current state of cross namespace stuff
|
## Current state of cross namespace stuff
|
||||||
|
|
||||||
* Why: Reference tls cert for gateway api in the cert team'snamespace
|
* Why: Reference TLS cert for gateway API in the cert team's namespace
|
||||||
* Why: Move all ingress configs to one namespace
|
* Why: Move all ingress configs to one namespace
|
||||||
* Classic Solution: Annotations in contour that references a namespace that contains all certs (rewrites secret to certs/secret)
|
* Classic Solution: Annotations in contour that references a namespace that contains all certs (rewrites secret to certs/secret)
|
||||||
* Gateway Solution:
|
* Gateway Solution:
|
||||||
* Gateway TLS secret ref includes a namespace
|
* Gateway TLS secret ref includes a namespace
|
||||||
* ReferenceGrant pretty mutch allows referencing from X (Gatway) to Y (Secret)
|
* ReferenceGrant pretty much allows referencing from X (Gateway) to Y (Secret)
|
||||||
* Limits:
|
* Limits:
|
||||||
* Has to be implemented via controllers
|
* Has to be implemented via controllers
|
||||||
* The controllers still have read all - they just check if they are supposed to do this
|
* The controllers still have read all - they just check if they are supposed to do this
|
||||||
|
@ -36,9 +36,9 @@ A talk by Google and Microsoft with the premise of bether auth in k8s.
|
||||||
|
|
||||||
### Global
|
### Global
|
||||||
|
|
||||||
* Grant access to controller to only ressources relevant for them (using references and maybe class segmentation)
|
* Grant access to controller to only resources relevant for them (using references and maybe class segmentation)
|
||||||
* Allow for safe cross namespace references
|
* Allow for safe cross namespace references
|
||||||
* Make it easy for api devs to adopt it
|
* Make it easy for API devs to adopt it
|
||||||
|
|
||||||
### Personas
|
### Personas
|
||||||
|
|
||||||
|
@ -50,20 +50,20 @@ A talk by Google and Microsoft with the premise of bether auth in k8s.
|
||||||
|
|
||||||
* Alex: Define relationships via ReferencePatterns
|
* Alex: Define relationships via ReferencePatterns
|
||||||
* Kai: Specify controller identity (Serviceaccount), define relationship API
|
* Kai: Specify controller identity (Serviceaccount), define relationship API
|
||||||
* Rohan: Define cross namespace references (aka ressource grants that allow access to their ressources)
|
* Rohan: Define cross namespace references (aka resource grants that allow access to their resources)
|
||||||
|
|
||||||
## Result of the paper
|
## Result of the paper
|
||||||
|
|
||||||
### Architecture
|
### Architecture
|
||||||
|
|
||||||
* ReferencePattern: Where do i find the references -> example: GatewayClass in the gateway API
|
* ReferencePattern: Where do i find the references -> example: GatewayClass in the gateway API
|
||||||
* ReferenceConsumer: Who (IOdentity) has access under which conditions?
|
* ReferenceConsumer: Who (Identity) has access under which conditions?
|
||||||
* ReferenceGrant: Allow specific references
|
* ReferenceGrant: Allow specific references
|
||||||
|
|
||||||
### POC
|
### POC
|
||||||
|
|
||||||
* Minimum access: You only get access if the grant is there AND the reference actually exists
|
* Minimum access: You only get access if the grant is there AND the reference actually exists
|
||||||
* Their basic implementation works with the kube api
|
* Their basic implementation works with the kube API
|
||||||
|
|
||||||
### Open questions
|
### Open questions
|
||||||
|
|
||||||
|
@ -74,9 +74,9 @@ A talk by Google and Microsoft with the premise of bether auth in k8s.
|
||||||
|
|
||||||
## Alternative
|
## Alternative
|
||||||
|
|
||||||
* Idea: Just extend RBAC Roles with a selector (match labels, etc)
|
* Idea: Just extend RBAC Roles with a selector (match labels, etc.)
|
||||||
* Problems:
|
* Problems:
|
||||||
* Requires changes to kubernetes core auth
|
* Requires changes to Kubernetes core auth
|
||||||
* Everything but list and watch is a pain
|
* Everything but list and watch is a pain
|
||||||
* How do you handle AND vs OR selection
|
* How do you handle AND vs OR selection
|
||||||
* Field selectors: They exist
|
* Field selectors: They exist
|
||||||
|
@ -84,5 +84,5 @@ A talk by Google and Microsoft with the premise of bether auth in k8s.
|
||||||
|
|
||||||
## Meanwhile
|
## Meanwhile
|
||||||
|
|
||||||
* Prefer tools that support isolatiobn between controller and dataplane
|
* Prefer tools that support isolation between controller and data-plane
|
||||||
* Disable all non-needed features -> Especially scripting
|
* Disable all non-needed features -> Especially scripting
|
||||||
|
|
|
@ -7,31 +7,31 @@ tags:
|
||||||
---
|
---
|
||||||
|
|
||||||
A talk by UX and software people at Red Hat (Podman team).
|
A talk by UX and software people at Red Hat (Podman team).
|
||||||
The talk mainly followed the academic study process (aka this is the survey I did for my bachelors/masters thesis).
|
The talk mainly followed the academic study process (aka this is the survey I did for my bachelor's/master's thesis).
|
||||||
|
|
||||||
## Research
|
## Research
|
||||||
|
|
||||||
* User research Study including 11 devs and platform engineers over three months
|
* User research Study including 11 devs and platform engineers over three months
|
||||||
* Focus was on an new podman desktop feature
|
* Focus was on a new Podman desktop feature
|
||||||
* Experence range 2-3 years experience average (low no experience, high oldschool kube)
|
* Experience range 2-3 years experience average (low no experience, high old school kube)
|
||||||
* 16 questions regarding environment, workflow, debugging and pain points
|
* 16 questions regarding environment, workflow, debugging and pain points
|
||||||
* Analysis: Affinity mapping
|
* Analysis: Affinity mapping
|
||||||
|
|
||||||
## Findings
|
## Findings
|
||||||
|
|
||||||
* Where do I start when things are broken? -> There may be solutions, but devs don't know about them
|
* Where do I start when things are broken? -> There may be solutions, but devs don't know about them
|
||||||
* Network debugging is hard b/c many layers and problems occuring in between cni and infra are really hard -> Network topology issues are rare but hard
|
* Network debugging is hard b/c many layers and problems occurring in between CNI and infra are really hard -> Network topology issues are rare but hard
|
||||||
* YAML identation -> Tool support is needed for visualisation
|
* YAML indentation -> Tool support is needed for visualization
|
||||||
* YAML validation -> Just use validation in dev and gitops
|
* YAML validation -> Just use validation in dev and GitOps
|
||||||
* YAML Cleanup -> Normalize YAML (order, anchors, etc) for easy diff
|
* YAML Cleanup -> Normalize YAML (order, anchors, etc.) for easy diff
|
||||||
* Inadequate security analysis (too verbose, non-issues are warnings) -> Realtime insights (and during dev)
|
* Inadequate security analysis (too verbose, non-issues are warnings) -> Real-time insights (and during dev)
|
||||||
* Crash Loop -> Identify stuck containers, simple debug containers
|
* Crash Loop -> Identify stuck containers, simple debug containers
|
||||||
* CLI vs GUI -> Enable eperience level oriented gui, Enhance intime troubleshooting
|
* CLI vs GUI -> Enable experience level oriented GUI, Enhance in-time troubleshooting
|
||||||
|
|
||||||
## General issues
|
## General issues
|
||||||
|
|
||||||
* No direct fs access
|
* No direct fs access
|
||||||
* Multiple kubeconfigs
|
* Multiple kubeconfigs
|
||||||
* SaaS is sometimes only provided on kube, which sounds like complexity
|
* SaaS is sometimes only provided on kube, which sounds like complexity
|
||||||
* Where do i begin my troubleshooting
|
* Where do I begin my troubleshooting
|
||||||
* Interoperability/Fragility with updates
|
* Interoperability/Fragility with updates
|
||||||
|
|
|
@ -10,7 +10,7 @@ Global field CTO at Solo.io with a hint of servicemesh background.
|
||||||
|
|
||||||
## History
|
## History
|
||||||
|
|
||||||
* LinkerD 1.X was the first moder servicemesh and basicly a opt-in serviceproxy
|
* LinkerD 1.X was the first modern service mesh and basically an opt-in service proxy
|
||||||
* Challenges: JVM (size), latencies, ...
|
* Challenges: JVM (size), latencies, ...
|
||||||
|
|
||||||
### Why not node-proxy?
|
### Why not node-proxy?
|
||||||
|
@ -23,8 +23,8 @@ Global field CTO at Solo.io with a hint of servicemesh background.
|
||||||
### Why sidecar?
|
### Why sidecar?
|
||||||
|
|
||||||
* Transparent (ish)
|
* Transparent (ish)
|
||||||
* PArt of app lifecycle (up/down)
|
* Part of app lifecycle (up/down)
|
||||||
* Single tennant
|
* Single tenant
|
||||||
* No noisy neighbor
|
* No noisy neighbor
|
||||||
|
|
||||||
### Sidecar drawbacks
|
### Sidecar drawbacks
|
||||||
|
@ -46,7 +46,7 @@ Global field CTO at Solo.io with a hint of servicemesh background.
|
||||||
|
|
||||||
* Full transparency
|
* Full transparency
|
||||||
* Optimized networking
|
* Optimized networking
|
||||||
* Lower ressource allocation
|
* Lower resource allocation
|
||||||
* No race conditions
|
* No race conditions
|
||||||
* No manual pod injection
|
* No manual pod injection
|
||||||
* No credentials in the app
|
* No credentials in the app
|
||||||
|
@ -68,12 +68,12 @@ Global field CTO at Solo.io with a hint of servicemesh background.
|
||||||
* Kubeproxy replacement
|
* Kubeproxy replacement
|
||||||
* Ingress (via Gateway API)
|
* Ingress (via Gateway API)
|
||||||
* Mutual Authentication
|
* Mutual Authentication
|
||||||
* Specialiced CiliumNetworkPolicy
|
* Specialized CiliumNetworkPolicy
|
||||||
* Configure Envoy throgh Cilium
|
* Configure Envoy through Cilium
|
||||||
|
|
||||||
### Control Plane
|
### Control Plane
|
||||||
|
|
||||||
* Cilium-Agent on each node that reacts to scheduled workloads by programming the local dataplane
|
* Cilium-Agent on each node that reacts to scheduled workloads by programming the local data-plane
|
||||||
* API via Gateway API and CiliumNetworkPolicy
|
* API via Gateway API and CiliumNetworkPolicy
|
||||||
|
|
||||||
```mermaid
|
```mermaid
|
||||||
|
@ -98,29 +98,29 @@ flowchart TD
|
||||||
### Data plane
|
### Data plane
|
||||||
|
|
||||||
* Configured by control plane
|
* Configured by control plane
|
||||||
* Does all of the eBPF things in L4
|
* Does all the eBPF things in L4
|
||||||
* Does all of the envoy things in L7
|
* Does all the envoy things in L7
|
||||||
* In-Kernel Wireguard for optional transparent encryption
|
* In-Kernel WireGuard for optional transparent encryption
|
||||||
|
|
||||||
### mTLS
|
### mTLS
|
||||||
|
|
||||||
* Network Policies get applied at the eBPF layer (check if id a can talk to id 2)
|
* Network Policies get applied at the eBPF layer (check if ID a can talk to ID 2)
|
||||||
* When mTLS is enabled there is a auth check in advance -> It it fails, proceed with agents
|
* When mTLS is enabled there is an auth check in advance -> If it fails, proceed with agents
|
||||||
* Agents talk to each other for mTLS Auth and save the result to a cache -> Now ebpf can say yes
|
* Talk to each other for mTLS Auth and save the result to a cache -> Now eBPF can say yes
|
||||||
* Problems: The caches can lead to id confusion
|
* Problems: The caches can lead to ID confusion
|
||||||
|
|
||||||
## Istio
|
## Istio
|
||||||
|
|
||||||
### Basiscs
|
### Basics
|
||||||
|
|
||||||
* L4/7 Service mesh without it's own CNI
|
* L4/7 Service mesh without its own CNI
|
||||||
* Based on envoy
|
* Based on envoy
|
||||||
* mTLS
|
* mTLS
|
||||||
* Classicly via sidecar, nowadays
|
* Classically via sidecar, nowadays
|
||||||
|
|
||||||
### Ambient mode
|
### Ambient mode
|
||||||
|
|
||||||
* Seperate L4 and L7 -> Can run on cilium
|
* Separate L4 and L7 -> Can run on cilium
|
||||||
* mTLS
|
* mTLS
|
||||||
* Gateway API
|
* Gateway API
|
||||||
|
|
||||||
|
@ -143,14 +143,14 @@ flowchart TD
|
||||||
```
|
```
|
||||||
|
|
||||||
* Central xDS Control Plane
|
* Central xDS Control Plane
|
||||||
* Per-Node Dataplane that reads updates from Control Plane
|
* Per-Node Data-plane that reads updates from Control Plane
|
||||||
|
|
||||||
### Data Plane
|
### Data Plane
|
||||||
|
|
||||||
* L4 runs via zTunnel Daemonset that handels mTLS
|
* L4 runs via zTunnel Daemonset that handles mTLS
|
||||||
* The zTunnel traffic get's handed over to the CNI
|
* The zTunnel traffic gets handed over to the CNI
|
||||||
* L7 Proxy lives somewhere™ and traffic get's routed through it as an "extra hop" aka waypoint
|
* L7 Proxy lives somewhere™ and traffic gets routed through it as an "extra hop" aka waypoint
|
||||||
|
|
||||||
### mTLS
|
### mTLS
|
||||||
|
|
||||||
* The zTunnel creates a HBONE (http overlay network) tunnel with mTLS
|
* The zTunnel creates a HBONE (HTTP overlay network) tunnel with mTLS
|
||||||
|
|
|
@ -8,7 +8,7 @@ Who have I talked to today, are there any follow-ups or learnings?
|
||||||
## Operator Framework
|
## Operator Framework
|
||||||
|
|
||||||
* We talked about the operator lifecycle manager
|
* We talked about the operator lifecycle manager
|
||||||
* They shared the roadmap and the new release 1.0 will bring support for Operator Bundle loading from any oci source (no more public-registry enforcement)
|
* They shared the roadmap and the new release 1.0 will bring support for Operator Bundle loading from any OCI source (no more public-registry enforcement)
|
||||||
|
|
||||||
## Flux
|
## Flux
|
||||||
|
|
||||||
|
@ -17,8 +17,8 @@ Who have I talked to today, are there any follow-ups or learnings?
|
||||||
## Cloud foundry/Paketo
|
## Cloud foundry/Paketo
|
||||||
|
|
||||||
* We mostly had some smalltalk
|
* We mostly had some smalltalk
|
||||||
* There will be a cloudfoundry day in Karlsruhe in October, they'd be happy to have us ther
|
* There will be a cloud foundry day in Karlsruhe in October, they'd be happy to have us there
|
||||||
* The whole KORFI (Cloudfoundry on Kubernetes) Project is still going strong, but no release canidate yet (or in the near future)
|
* The whole KORFI (Cloud foundry on Kubernetes) Project is still going strong, but no release candidate yet (or in the near future)
|
||||||
|
|
||||||
## Traefik
|
## Traefik
|
||||||
|
|
||||||
|
@ -31,7 +31,7 @@ They will follow up
|
||||||
## Postman
|
## Postman
|
||||||
|
|
||||||
* I asked them about their new cloud-only stuff: They will keep their direction
|
* I asked them about their new cloud-only stuff: They will keep their direction
|
||||||
* The are also planning to work on info materials on why postman SaaS is not a big security risk
|
* They are also planning to work on info materials on why postman SaaS is not a big security risk
|
||||||
|
|
||||||
## Mattermost
|
## Mattermost
|
||||||
|
|
||||||
|
@ -39,9 +39,9 @@ They will follow up
|
||||||
I should follow up
|
I should follow up
|
||||||
{{% /notice %}}
|
{{% /notice %}}
|
||||||
|
|
||||||
* I talked about our problems with the mattermost operator and was asked to get back to them with the errors
|
* I talked about our problems with the Mattermost operator and was asked to get back to them with the errors
|
||||||
* They're currently migrating the mattermost cloud offering to arm - therefor arm support will be coming in the next months
|
* They're currently migrating the Mattermost cloud offering to ARM - therefore ARM support will be coming in the next months
|
||||||
* The mattermost guy had exactly the same problems with notifications and read/unread using element
|
* The Mattermost guy had exactly the same problems with notifications and read/unread using element
|
||||||
|
|
||||||
## Vercel
|
## Vercel
|
||||||
|
|
||||||
|
@ -63,11 +63,11 @@ I should follow up
|
||||||
They will follow up with a quick demo
|
They will follow up with a quick demo
|
||||||
{{% /notice %}}
|
{{% /notice %}}
|
||||||
|
|
||||||
* A kubernetes security/runtime security solution with pretty nice looking urgency filters
|
* A Kubernetes security/runtime security solution with pretty nice looking urgency filters
|
||||||
* Includes eBPF to see what code actually runs
|
* Includes eBPF to see what code actually runs
|
||||||
* I'll witness a demo in early/mid april
|
* I'll witness a demo in early/mid April
|
||||||
|
|
||||||
### Isovalent
|
### Isovalent
|
||||||
|
|
||||||
* Dinner (very tasty)
|
* Dinner (very tasty)
|
||||||
* Cilium still sounds like the way to go in regards to CNIs
|
* Cilium still sounds like the way to go in regard to CNIs
|
||||||
|
|
|
@ -5,7 +5,7 @@ weight: 2
|
||||||
---
|
---
|
||||||
|
|
||||||
Day two is also the official day one of KubeCon (Day one was just CloudNativeCon).
|
Day two is also the official day one of KubeCon (Day one was just CloudNativeCon).
|
||||||
This is where all of the people joined (over 2000)
|
This is where all the people joined (over 12000)
|
||||||
|
|
||||||
The opening keynotes were a mix of talks and panel discussions.
|
The opening keynotes were a mix of talks and panel discussions.
|
||||||
The main topic was - who could have guessed - AI and ML.
|
The main topic was - who could have guessed - AI and ML.
|
||||||
|
|
|
@ -11,8 +11,8 @@ A talk by Google and Ivanti.
|
||||||
|
|
||||||
## Background
|
## Background
|
||||||
|
|
||||||
* RBAC is ther to limit information access and control
|
* RBAC is there to limit information access and control
|
||||||
* RBAC can be used to avoid interfearance in shared envs
|
* RBAC can be used to avoid interference in shared envs
|
||||||
* DNS is not really applicable when it comes to RBAC
|
* DNS is not really applicable when it comes to RBAC
|
||||||
|
|
||||||
### DNS in Kubernetes
|
### DNS in Kubernetes
|
||||||
|
@ -26,11 +26,11 @@ A talk by Google and Ivanti.
|
||||||
|
|
||||||
* Especially for smaller, high growth companies with infinite VC money
|
* Especially for smaller, high growth companies with infinite VC money
|
||||||
* Just give everyone their own cluster -> Problem solved
|
* Just give everyone their own cluster -> Problem solved
|
||||||
* Smaller (<1000) typicly use many small clusters
|
* Smaller (<1000) typically use many small clusters
|
||||||
|
|
||||||
### Shared Clusters
|
### Shared Clusters
|
||||||
|
|
||||||
* Becomes imporetant when cost is a question and engineers don't have any platform knowledge
|
* Becomes important when cost is a question and engineers don't have any platform knowledge
|
||||||
* A dedicated kube team can optimize both hardware and deliver updates fast -> Increased productivity by utilizing specialists
|
* A dedicated kube team can optimize both hardware and deliver updates fast -> Increased productivity by utilizing specialists
|
||||||
* Problem: Noisy neighbors by leaky DNS
|
* Problem: Noisy neighbors by leaky DNS
|
||||||
|
|
||||||
|
@ -45,14 +45,14 @@ A talk by Google and Ivanti.
|
||||||
### Leak mechanics
|
### Leak mechanics
|
||||||
|
|
||||||
* Leaks are based on the `<service>.<namespace>.svc.cluster.local` pattern
|
* Leaks are based on the `<service>.<namespace>.svc.cluster.local` pattern
|
||||||
* You can also just reverse looku the entire service CIDR
|
* You can also just reverse lookup the entire service CIDR
|
||||||
* SRV records get created for each service including the service ports
|
* SRV records get created for each service including the service ports
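
To make the leak mechanics concrete, the following sketch builds the names an observer could query. It performs no DNS lookups; the helper names are hypothetical and the CIDR in the usage note is an assumed example, not a value from the talk.

```python
from __future__ import annotations

import ipaddress


def service_fqdn(service: str, namespace: str,
                 zone: str = "cluster.local") -> str:
    """Build the predictable in-cluster name of a service."""
    return f"{service}.{namespace}.svc.{zone}"


def reverse_names(cidr: str) -> list[str]:
    """List the PTR names for every host address in a service CIDR."""
    network = ipaddress.ip_network(cidr)
    return [ip.reverse_pointer for ip in network.hosts()]
```

For example, `reverse_names("10.96.0.0/30")` yields the PTR names for `10.96.0.1` and `10.96.0.2`; resolving such names across a whole service CIDR is what lets any tenant enumerate services in other namespaces.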
|
||||||
|
|
||||||
## Fix the leak
|
## Fix the leak
|
||||||
|
|
||||||
### CoreDNS Firewall Plugin
|
### CoreDNS Firewall Plugin
|
||||||
|
|
||||||
* External plugin provided by the coredns team
|
* External plugin provided by the CoreDNS team
|
||||||
* Expression engine built-in with support for external policy engines
|
* Expression engine built-in with support for external policy engines
|
||||||
|
|
||||||
```mermaid
|
```mermaid
|
||||||
|
@ -67,19 +67,19 @@ flowchart LR
|
||||||
|
|
||||||
### Demo
|
### Demo
|
||||||
|
|
||||||
* Firwall rule that only allows queries from the same namespace, kube-system or default
|
* Firewall rule that only allows queries from the same namespace, `kube-system` or `default`
|
||||||
* Every other cross-namespace request gets blocked
|
* Every other cross-namespace request gets blocked
|
||||||
* Same SVC requests from before now return NXDOMAIN
|
* Same SVC requests from before now return `NXDOMAIN`
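
The decision made by the demo rule can be approximated as below. This is plain Python, not the firewall plugin's expression syntax, and it assumes one reading of the rule: a query is answered if the client lives in the same namespace as the queried service, or the client is in `kube-system` or `default`. The function and constant names are hypothetical.

```python
# Client namespaces that may resolve everything (assumed reading of the rule).
ALWAYS_ALLOWED = {"kube-system", "default"}


def answer(client_namespace: str, queried_namespace: str) -> str:
    """Return the DNS response code the client would see."""
    if client_namespace == queried_namespace:
        return "NOERROR"
    if client_namespace in ALWAYS_ALLOWED:
        return "NOERROR"
    # Blocked cross-namespace queries look as if the name did not exist.
    return "NXDOMAIN"
```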
|
||||||
|
|
||||||
### Why is this a plugin and not default?
|
### Why is this a plugin and not default?
|
||||||
|
|
||||||
* Requires `pods verified` mode -> Puts the watch on pods and only returns a query result if the pod actually exists
|
* Requires `pods verified` mode -> Puts the watch on pods and only returns a query result if the pod actually exists
|
||||||
* Puts a watch on all pods -> higher API load and coredns mem usage
|
* Puts a watch on all pods -> higher API load and CoreDNS memory usage
|
||||||
* Potential race conditions with initial lookups in larger clusters -> Alternative is to fail open (not really secure)
|
* Potential race conditions with initial lookups in larger clusters -> Alternative is to fail open (not really secure)
|
||||||
|
|
||||||
### Per tenant DNS
|
### Per tenant DNS
|
||||||
|
|
||||||
* Just run a cporedns instance for each tenant
|
* Just run a CoreDNS instance for each tenant
|
||||||
* Use a mutating webhook to inject the right dns into each pod
|
* Use a mutating webhook to inject the right DNS into each pod
|
||||||
* Pro: No more pods verified -> Aka no more constant watch
|
* Pro: No more pods verified -> Aka no more constant watch
|
||||||
* Limitation: Platform services still need a central coredns
|
* Limitation: Platform services still need a central CoreDNS
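
A mutating webhook for the per-tenant setup could answer with a JSON patch that points the pod at the tenant's DNS server. The sketch below only builds the response body; how a tenant maps to a nameserver IP is left out, the function names are hypothetical, and a real setup would likely also need `searches` and `options` in `dnsConfig` so short names keep resolving.

```python
from __future__ import annotations

import base64
import json


def dns_patch(nameserver_ip: str) -> list[dict]:
    """JSON patch that replaces cluster DNS with a tenant-specific server."""
    return [
        {"op": "add", "path": "/spec/dnsPolicy", "value": "None"},
        {"op": "add", "path": "/spec/dnsConfig",
         "value": {"nameservers": [nameserver_ip]}},
    ]


def admission_response(uid: str, nameserver_ip: str) -> dict:
    """Wrap the patch in an AdmissionReview response (patch is base64)."""
    patch = json.dumps(dns_patch(nameserver_ip)).encode()
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": {
            "uid": uid,
            "allowed": True,
            "patchType": "JSONPatch",
            "patch": base64.b64encode(patch).decode(),
        },
    }
```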
|
||||||
|
|
|
@ -6,7 +6,7 @@ tags:
|
||||||
- dx
|
- dx
|
||||||
---
|
---
|
||||||
|
|
||||||
Mitch from aviatrix -a former software engineer who has now switched over to product managment.
|
Mitch from Aviatrix - a former software engineer who has now switched over to product management.
|
||||||
|
|
||||||
## Opening Thesis
|
## Opening Thesis
|
||||||
|
|
||||||
|
@ -14,19 +14,19 @@ Opening with the Atari 2600 E.T. game as very bad fit sample.
|
||||||
Thesis: Missing user empathy
|
Thesis: Missing user empathy
|
||||||
|
|
||||||
* A very hard game aimed at children without the will to trial and error
|
* A very hard game aimed at children without the will to trial and error
|
||||||
* Other aspect: Some of the devalopers were pulled together from throughout the company -> No passion needed
|
* Other aspect: Some developers were pulled together from throughout the company -> No passion needed
|
||||||
|
|
||||||
### Another sample
|
### Another sample
|
||||||
|
|
||||||
* Idea: SCADA system with sensors that can be moved and the current location get's tracked via iPad.
|
* Idea: SCADA system with sensors that can be moved, and the current location gets tracked via iPad.
|
||||||
* Result: Nobody used the iPad app - only the desktop webapp
|
* Result: Nobody used the iPad app - only the desktop Web-app
|
||||||
* Problem: Sensor get's moved, location not updated, the measurements for the wrong location get reported until update
|
* Problem: Sensor gets moved, location not updated, the measurements for the wrong location get reported until update
|
||||||
* Source: Moving a sensor is a pretty involved process including high pressure aka no priority for iPad
|
* Source: Moving a sensor is a pretty involved process including high pressure aka no priority for iPad
|
||||||
* Empathy loss: Different working endvironments result in drastic work experience missmatch
|
* Empathy loss: Different working environments result in drastic work experience mismatch
|
||||||
|
|
||||||
## The source
|
## The source
|
||||||
|
|
||||||
* Idea: A software engineer writes software, that someone else has to use, not themselfes
|
* Idea: A software engineer writes software, that someone else has to use, not themselves
|
||||||
* Problem: Distance between user and dev is high and their perspectives differ heavily
|
* Problem: Distance between user and dev is high and their perspectives differ heavily
|
||||||
|
|
||||||
## User empathy
|
## User empathy
|
||||||
|
@ -37,37 +37,37 @@ Thesis: Missing user empathy
|
||||||
## Stories from Istio
|
## Stories from Istio
|
||||||
|
|
||||||
* Classic implementation: Sidecar Proxy
|
* Classic implementation: Sidecar Proxy
|
||||||
* Question: Can the same value be provided without a sidecar anywhers
|
* Question: Can the same value be provided without a sidecar anywhere
|
||||||
* Answer: Ambient mode -> split into l4 (proxy per node) and l7 (no sharing)
|
* Answer: Ambient mode -> split into l4 (proxy per node) and l7 (no sharing)
|
||||||
* Problem: After alpha release ther was a lack of exitement and feedback
|
* Problem: After alpha release there was a lack of excitement and feedback
|
||||||
* Result: Twitter Space event for feedback
|
* Result: Twitter Space event for feedback
|
||||||
|
|
||||||
### Ideas and feedback
|
### Ideas and feedback
|
||||||
|
|
||||||
* Idea: Sidecar is somewhat magical
|
* Idea: Sidecar is somewhat magical
|
||||||
* Feedback: Sidecars are a pain, but after integrating istio can be automated -> a problem gets solved, that already had a solution
|
* Feedback: Sidecars are a pain, but after integrating Istio can be automated -> a problem gets solved, that already had a solution
|
||||||
* Result: Highly overvalued the pain of sidecars
|
* Result: Highly overvalued the pain of sidecars
|
||||||
* Idea: Building istio into a platform sounds easy
|
* Idea: Building Istio into a platform sounds easy
|
||||||
* Feedback: The platform has to be changed for the new ambient mode -> High time investment while engineers are hard
|
* Feedback: The platform has to be changed for the new ambient mode -> High time investment while engineers are hard
|
||||||
* Result: The cost of platform changes was highly undervalued
|
* Result: The cost of platform changes was highly undervalued
|
||||||
* Idea: Sidecar compute sounds expensive and networking itself pretty cheap
|
* Idea: Sidecar compute sounds expensive and networking itself pretty cheap
|
||||||
* Feedback: Many users have multi-region clusters -> Egress is whery expoenive
|
* Feedback: Many users have multi-region clusters -> Egress is very expensive
|
||||||
* Result: The relation between compute and egress cost was pretty much swapped
|
* Result: The relation between compute and egress cost was pretty much swapped
|
||||||
|
|
||||||
### What now?
|
### What now?
|
||||||
|
|
||||||
* Ambient is the right solution for new users (fresh service mesehes)
|
* Ambient is the right solution for new users (fresh service meshes)
|
||||||
* Existing users probaly won't upgrade
|
* Existing users probably won't upgrade
|
||||||
* Result: They will move forward with ambient mdoe
|
* Result: They will move forward with ambient mode
|
||||||
|
|
||||||
## So what did we lern
|
## So what did we learn
|
||||||
|
|
||||||
### Basic questions
|
### Basic questions
|
||||||
|
|
||||||
* Who are my intended users?
|
* Who are my intended users?
|
||||||
* What exites/worries them?
|
* What excites/worries them?
|
||||||
* What do they find easy/hard?
|
* What do they find easy/hard?
|
||||||
* What is ther biggest expense and what is inexpensive?
|
* What is the biggest expense and what is inexpensive?
|
||||||
|
|
||||||
### How to get better empathy
|
### How to get better empathy
|
||||||
|
|
||||||
|
|
|
@ -6,25 +6,25 @@ tags:
|
||||||
- business
|
- business
|
||||||
---
|
---
|
||||||
|
|
||||||
Bob a Program Manager at Google and Kubernetes steering commitee member with a bunch of contributor and maintainer experience.
|
Bob, a Program Manager at Google and Kubernetes steering committee member with a bunch of contributor and maintainer experience.
|
||||||
The value should be rated even higher than the pure business value.
|
The value should be rated even higher than the pure business value.
|
||||||
|
|
||||||
## Baseline
|
## Baseline
|
||||||
|
|
||||||
* A öarge chunk of CNCF contrinbutors and maintainers (95%) are company affiliated
|
* A large chunk of CNCF contributors and maintainers (95%) are company affiliated
|
||||||
* Most (50%) of the people contributed in professional an personal time )(and 30 only on work time)
|
* Most (50%) of the people contributed in professional and personal time (and 30 only on work time)
|
||||||
* Explaining business value can be very complex
|
* Explaining business value can be very complex
|
||||||
* Base question: What does this contribute to the business
|
* Base question: What does this contribute to the business
|
||||||
|
|
||||||
## Data enablement
|
## Data enablement
|
||||||
|
|
||||||
* Problem: Insufficient data (data collection is often an afterthought)
|
* Problem: Insufficient data (data collection is often an afterthought)
|
||||||
* Example used: Random CNCF slection
|
* Example used: Random CNCF selection
|
||||||
* 50% of issues are labed consistentöy
|
* 50% of issues are labeled consistently
|
||||||
* 17% of projects label PRs
|
* 17% of projects label PRs
|
||||||
* 58% of projects use milestones
|
* 58% of projects use milestones
|
||||||
* Labels provide: Context, Prioritization, Scope, State
|
* Labels provide: Context, Prioritization, Scope, State
|
||||||
* Milestones enable: Filtering outside of daterange
|
* Milestones enable: Filtering outside date range
|
||||||
* Sample queries:
|
* Sample queries:
|
||||||
* How many features have been in milestone XY?
|
* How many features have been in milestone XY?
|
||||||
* How many bugs have been fixed in this version?
|
* How many bugs have been fixed in this version?
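
The sample queries only work when labels and milestones are applied consistently. A minimal sketch over an in-memory issue list, assuming Kubernetes-style label names such as `kind/bug`; the `Issue` type and `count_issues` helper are hypothetical and not tied to any tracker API:

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Issue:
    title: str
    state: str                        # "open" or "closed"
    milestone: Optional[str] = None   # None models a missing milestone
    labels: set[str] = field(default_factory=set)


def count_issues(issues: list[Issue], *, label: str, milestone: str,
                 state: Optional[str] = None) -> int:
    """Count issues carrying a label in a milestone, optionally by state."""
    return sum(
        1 for issue in issues
        if label in issue.labels
        and issue.milestone == milestone
        and (state is None or issue.state == state)
    )
```

Unlabeled issues or issues without a milestone silently drop out of both counts, which is why inconsistent labeling makes the resulting numbers unreliable.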
|
||||||
|
@ -37,36 +37,36 @@ The value should be rated even higher than the pure business value.
|
||||||
* Thought of as overhead
|
* Thought of as overhead
|
||||||
* Project is too small
|
* Project is too small
|
||||||
* Tools:
|
* Tools:
|
||||||
* Actions/Pipelines for autolabel, copy label sync labels
|
* Actions/Pipelines for auto-label, copy label sync labels
|
||||||
* Prow: The label system for kubernetes projects
|
* Prow: The label system for Kubernetes projects
|
||||||
* People with high project but low code knowlege can triage -> Make them feel recognized
|
* People with high project, but low code knowledge can triage -> Make them feel recognized
|
||||||
|
|
||||||
### Conclusions
|
### Conclusions
|
||||||
|
|
||||||
* Consistent labels & milestones are critical for state analysis
|
* Consistent labels & milestones are critical for state analysis
|
||||||
* Data is the evidence needed in messaging for leadershiü
|
* Data is the evidence needed in messaging for leadership
|
||||||
* Recruting triage-specific people and using automations streamlines the process
|
* Recruiting triage-specific people and using automations streamlines the process
|
||||||
|
|
||||||
## Communication
|
## Communication
|
||||||
|
|
||||||
### Personas
|
### Personas
|
||||||
|
|
||||||
* OSS enthusiast: Knows the ecosystem and project with a knack for discussions and deep dives
|
* OSS enthusiast: Knows the ecosystem and project with a knack for discussions and deep dives
|
||||||
* Maintainer;: A enthusiast that is tired, unter pressure and most of the time a one-man show that would prefer doint thechnical stuff
|
* Maintainer: An enthusiast that is tired, under pressure and most of the time a one-man show that would prefer doing technical stuff
|
||||||
* CXO: Focus on ressources, health, ROI
|
* CXO: Focus on resources, health, ROI
|
||||||
* Product manager: Get the best project, user friendly
|
* Product manager: Get the best project, user-friendly
|
||||||
* Leads: Employees should meet KPIs, with slightly better techn understanding
|
* Leads: Employees should meet KPIs, with slightly better tech understanding
|
||||||
* End user: How can tools/features help me
|
* End user: How can tools/features help me
|
||||||
|
|
||||||
### Growth limits
|
### Growth limits
|
||||||
|
|
||||||
* Main questions:
|
* Main questions:
|
||||||
* What is theis project/feature
|
* What is this project/feature
|
||||||
* Where is the roadmap
|
* Where is the roadmap
|
||||||
* What parts of the project are at risk?
|
* What parts of the project are at risk?
|
||||||
* Problem: Wording
|
* Problem: Wording
|
||||||
|
|
||||||
### Ways of surfcing information
|
### Ways of surfacing information
|
||||||
|
|
||||||
* Regular project reports/blog posts
|
* Regular project reports/blog posts
|
||||||
* Roadmap on website
|
* Roadmap on website
|
||||||
|
@ -76,8 +76,8 @@ The value should be rated even higher than the pure business value.
|
||||||
|
|
||||||
* What are we getting out? (How fast are bugs getting fixed)
|
* What are we getting out? (How fast are bugs getting fixed)
|
||||||
* What is the criticality of the project?
|
* What is the criticality of the project?
|
||||||
* How much time is spent on maintainance?
|
* How much time is spent on maintenance?
|
||||||
|
|
||||||
## Conclusion
|
## Conclusion
|
||||||
|
|
||||||
* Ther is significant unrealized valze in open source
|
* There is significant unrealized value in open source
|
||||||
|
|
|
@ -10,7 +10,7 @@ A talk about the backstage documentation audit and what makes a good documentati
|
||||||
|
|
||||||
## Opening
|
## Opening
|
||||||
|
|
||||||
* 2012 the year of the mayan calendar and the mainstream success of memes
|
* 2012 the year of the Maya calendar and the mainstream success of memes
|
||||||
* The classic meme RTFM -> Classic manuals were pretty long
|
* The classic meme RTFM -> Classic manuals were pretty long
|
||||||
* 2024: Manuals have become documentation (hopefully with better contents)
|
* 2024: Manuals have become documentation (hopefully with better contents)
|
||||||
|
|
||||||
|
@ -18,9 +18,9 @@ A talk about the backstage documentation audit and what makes a good documentati
|
||||||
|
|
||||||
### What is documentation
|
### What is documentation
|
||||||
|
|
||||||
* Docs (the raw descriptions, qucikstart and how-to)
|
* Docs (the raw descriptions, quick-start and how-to)
|
||||||
* Website (the first impression - what does this do, why would i need it)
|
* Website (the first impression - what does this do, why would I need it)
|
||||||
* REAMDE (the github way of website + docs)
|
* README (the GitHub way of website + docs)
|
||||||
* CONTRIBUTING (Is this a one-man show)
|
* CONTRIBUTING (Is this a one-man show)
|
||||||
* Issues
|
* Issues
|
||||||
* Meta docs (how do we orchestrate things)
|
* Meta docs (how do we orchestrate things)
|
||||||
|
@ -30,10 +30,10 @@ A talk about the backstage documentation audit and what makes a good documentati
|
||||||
* Who needs this documentation?
|
* Who needs this documentation?
|
||||||
* New users -> Optimize for minimum context
|
* New users -> Optimize for minimum context
|
||||||
* Experienced users
|
* Experienced users
|
||||||
* User roles (Admins, end users, ...) -> Seperate into different pages (Get started based in your role)
|
* User roles (Admins, end users, ...) -> Separate into different pages (Get started based on your role)
|
||||||
* What do we need to enable with this documentation?
|
* What do we need to enable with this documentation?
|
||||||
* Prove value fast -> Why this project?
|
* Prove value fast -> Why this project?
|
||||||
* Educate on fundemental aspects
|
* Educate on fundamental aspects
|
||||||
* Showcase features/use cases
|
* Showcase features/use cases
|
||||||
* Hands-on enablement -> Tutorials, guides, step-by-step
|
* Hands-on enablement -> Tutorials, guides, step-by-step
|
||||||
|
|
||||||
|
@ -43,24 +43,24 @@ A talk about the backstage documentation audit and what makes a good documentati
|
||||||
* Documented scheduled contributor meetings
|
* Documented scheduled contributor meetings
|
||||||
* Getting started guides for new contributors
|
* Getting started guides for new contributors
|
||||||
* Project governance
|
* Project governance
|
||||||
* Who is gonna own it?
|
* Who is going to own it?
|
||||||
* What will happen to my PR?
|
* What will happen to my PR?
|
||||||
* Who maintains features?
|
* Who maintains features?
|
||||||
|
|
||||||
### Website
|
### Website
|
||||||
|
|
||||||
* Single source for all pages (one repo that includes landing, docs, api and so on) -> Easier to contribute
|
* Single source for all pages (one repo that includes landing, docs, API and so on) -> Easier to contribute
|
||||||
* Usability (especially on mobile)
|
* Usability (especially on mobile)
|
||||||
* Social proof and case studies -> Develop trust
|
* Social proof and case studies -> Develop trust
|
||||||
* SEO (actually get found) and analytics (detect how documentation is used and where people leave)
|
* SEO (actually get found) and analytics (detect how documentation is used and where people leave)
|
||||||
* Plan website maintenance
|
* Plan website maintenance
|
||||||
|
|
||||||
### What is great documetnation
|
### What is great documentation
|
||||||
|
|
||||||
* Project docs helps users according to their needs -> Low question to answer latency
|
* Project docs help users according to their needs -> Low question to answer latency
|
||||||
* Contributor docs enables contributions in a predictable manner -> Don't leave "when will this be reviewed/mered" questions open
|
* Contributor docs enables contributions predictably -> Don't leave "when will this be reviewed/merged" questions open
|
||||||
* Website proves why anyone should invest time in this projects?
|
* Website proves why anyone should invest time in these projects?
|
||||||
* All documetnation is connected and up to date
|
* All documentation is connected and up to date
|
||||||
|
|
||||||
## General best practices
|
## General best practices
|
||||||
|
|
||||||
|
@ -72,11 +72,11 @@ A talk about the backstage documentation audit and what makes a good documentati
|
||||||
|
|
||||||
## Examples
|
## Examples
|
||||||
|
|
||||||
* Opentelemetry: Split by role (dev, ops)
|
* OpenTelemetry: Split by role (dev, ops)
|
||||||
* Prometheus:
|
* Prometheus:
|
||||||
* New user conent in intro (concept) and getting started (practice)
|
* New user content in intro (concept) and getting started (practice)
|
||||||
* Hierarchie includes concepts, key features and guides/tutorials
|
* Hierarchies includes concepts, key features and guides/tutorials
|
||||||
|
|
||||||
## Q&A
|
## Q&A
|
||||||
|
|
||||||
* Every last wednesday in the month is a cncf echnical writers meetin (cncf slack -> techdocs)
|
* Every last Wednesday in the month is a CNCF technical writers meeting (CNCF slack -> `#techdocs`)
|
||||||
|
|
|
@ -9,11 +9,11 @@ tags:
|
||||||
A talk by Broadcom and Bloomberg (both related to buildpacks.io).
|
A talk by Broadcom and Bloomberg (both related to buildpacks.io).
|
||||||
And a very full talk at that.
|
And a very full talk at that.
|
||||||
|
|
||||||
## Baselinbe
|
## Baseline
|
||||||
|
|
||||||
* CN Buildpack provides the spec for buildpacks with a couple of different implementations
|
* CN Buildpack provides the spec for buildpacks with a couple of different implementations
|
||||||
* Pack CLI with builder (collection of buildopacks - for example ppaketo or heroku)
|
* Pack CLI with builder (collection of Buildpacks - for example Paketo or Heroku)
|
||||||
* Output images follow oci -> Just run them on docker/podman/kubernetes
|
* Output images follow OCI -> Just run them on docker/Podman/Kubernetes
|
||||||
* Built images are `production application images` (small attack surface, SBOM, non-root, reproducible)
|
* Built images are `production application images` (small attack surface, SBOM, non-root, reproducible)
|
||||||
|
|
||||||
## Scaling
|
## Scaling
|
||||||
|
@ -47,7 +47,7 @@ flowchart LR
|
||||||
|
|
||||||
* Goal: Just a simple docker full that auto-detects the right architecture
|
* Goal: Just a simple docker full that auto-detects the right architecture
|
||||||
* Needed: Pack, Lifecycle, Buildpacks, Build images, builders, registry
|
* Needed: Pack, Lifecycle, Buildpacks, Build images, builders, registry
|
||||||
* Current state: There is an RFC to handle image index creation with changes to buildpack creation
|
* Current state: There is an RFC to handle image index creation with changes to Buildpack creation
|
||||||
* New folder structure for binaries
|
* New folder structure for binaries
|
||||||
* Update config files to include targets
|
* Update config files to include targets
|
||||||
* The user impact is minimal, because the builder abstracts everything away
|
* The user impact is minimal, because the builder abstracts everything away
|
||||||
|
@ -56,5 +56,5 @@ flowchart LR
|
||||||
|
|
||||||
* kpack is slsa.dev v3 compliant (party hard)
|
* kpack is slsa.dev v3 compliant (party hard)
|
||||||
* 5 years of production
|
* 5 years of production
|
||||||
* scaling up to tanzu/heroku/gcp levels
|
* scaling up to Tanzu/Heroku/GCP levels
|
||||||
* Multiarch is being worked on
|
* Multiarch is being worked on
|
||||||
|
|
|
@ -4,4 +4,4 @@ title: Day 3
|
||||||
weight: 3
|
weight: 3
|
||||||
---
|
---
|
||||||
|
|
||||||
Spent most of the early day with headache therefor talk notes only start at noon.
|
Spent most of the early day with headache therefore talk notes only start at noon.
|
||||||
|
|
|
@ -9,11 +9,11 @@ tags:
|
||||||
## Problems
|
## Problems
|
||||||
|
|
||||||
* Dockerfiles are hard and not 100% reproducible
|
* Dockerfiles are hard and not 100% reproducible
|
||||||
* Buildpoacks are reproducible but result in large single-arch images
|
* Buildpacks are reproducible but result in large single-arch images
|
||||||
* Nix has multiple ways of doing things
|
* Nix has multiple ways of doing things
|
||||||
|
|
||||||
## Solutions
|
## Solutions
|
||||||
|
|
||||||
* Degger as a CI solution
|
* Dagger as a CI solution
|
||||||
* Multistage docker images with distroless -> Small image, small attack surcface
|
* Multistage docker images with distroless -> Small image, small attack surface
|
||||||
* Language specific solutions (ki, jib)
|
* Language specific solutions (`ki`, `jib`)
|
||||||
|
|
|
@ -5,7 +5,7 @@ tags:
|
||||||
- ebpf
|
- ebpf
|
||||||
---
|
---
|
||||||
|
|
||||||
A talk by isovalent with a full room (one of the large ones).
|
A talk by Isovalent with a full room (one of the large ones).
|
||||||
|
|
||||||
## Baseline
|
## Baseline
|
||||||
|
|
||||||
|
@ -19,9 +19,9 @@ A talk by isovalent with a full room (one of the large ones).
|
||||||
* Principles
|
* Principles
|
||||||
* Read memory only with correct permissions
|
* Read memory only with correct permissions
|
||||||
* All writes to valid and safe memory
|
* All writes to valid and safe memory
|
||||||
* Valid in-bounds and well formed control flow
|
* Valid in-bounds and well-formed control flow
|
||||||
* Execution on-cpu time is bounded: sleep, scheduled callbacks, interations, program acutally compketes
|
* Execution on CPU time is bounded: sleep, scheduled callbacks, iterations, program actually completes
|
||||||
* Aquire/release and reference count semantics
|
* Acquire/release and reference count semantics
|
||||||
|
|
||||||
## Demo: Game of life
|
## Demo: Game of life
|
||||||
|
|
||||||
|
@ -34,7 +34,7 @@ A talk by isovalent with a full room (one of the large ones).
|
||||||
|
|
||||||
* Instruction limit to let the verifier actually verify the program in reasonable time
|
* Instruction limit to let the verifier actually verify the program in reasonable time
|
||||||
* Limit is based on: Instruction limit and verifier step limit
|
* Limit is based on: Instruction limit and verifier step limit
|
||||||
* nowadays the limit it 4096 unprivileged calls and 1 million privileged istructions
|
* nowadays the limit it 4096 unprivileged calls and 1 million privileged instructions
|
||||||
* Only jump forward -> No loops
|
* Only jump forward -> No loops
|
||||||
* Is a basic limitation to ensure no infinite loops can ruin the day
|
* Is a basic limitation to ensure no infinite loops can ruin the day
|
||||||
* Limitation: Only finite iterations can be performed
|
* Limitation: Only finite iterations can be performed
|
||||||
|
@ -43,8 +43,8 @@ A talk by isovalent with a full room (one of the large ones).
|
||||||
* Solution: subprogram (aka function) and the limit is only for each function -> `x*subprogramms = x*limit`
|
* Solution: subprogram (aka function) and the limit is only for each function -> `x*subprogramms = x*limit`
|
||||||
* Limit: Needs real skill
|
* Limit: Needs real skill
|
||||||
* Programs have to terminate
|
* Programs have to terminate
|
||||||
* Well eBPF really only wants to release the cpu, the program doesn't have to end per se
|
* Well eBPF really only wants to release the CPU, the program doesn't have to end per se
|
||||||
* Iterator: walk abitrary lists of objects
|
* Iterator: walk arbitrary lists of objects
|
||||||
* Sleep on page fault or other memory operations
|
* Sleep on page fault or other memory operations
|
||||||
* Timer callbacks (including the timer 0 for run me asap)
|
* Timer callbacks (including the timer 0 for run me asap)
|
||||||
* Memory allocation
|
* Memory allocation
|
||||||
|
@ -52,5 +52,5 @@ A talk by isovalent with a full room (one of the large ones).
|
||||||
|
|
||||||
## Result
|
## Result
|
||||||
|
|
||||||
* You can execure abitrary tasks via eBPF
|
* You can execute arbitrary tasks via eBPF
|
||||||
* It can be used for HTTP or TLS - it's just not implemented yet™
|
* It can be used for HTTP or TLS - it's just not implemented yet™
|
||||||
|
|
|
@ -7,20 +7,20 @@ tags:
|
||||||
- scaling
|
- scaling
|
||||||
---
|
---
|
||||||
|
|
||||||
By the nice opertor framework guys at IBM and RedHat.
|
By the nice operator framework guys at IBM and Red Hat.
|
||||||
I'll skip the baseline introduction of what an operator is.
|
I'll skip the baseline introduction of what an operator is.
|
||||||
|
|
||||||
## Operator DSK
|
## Operator DSK
|
||||||
|
|
||||||
> Build the operator
|
> Build the operator
|
||||||
|
|
||||||
* Kubebuilder with v4 Plugines -> Supports the latest Kubernetes
|
* Kubebuilder with v4 Plugins -> Supports the latest Kubernetes
|
||||||
* Java Operator SDK is not a part of Operator SDK and they released 5.0.0
|
* Java Operator SDK is not a part of Operator SDK, and they released 5.0.0
|
||||||
* Now with server side apply in the background
|
* Now with server side apply in the background
|
||||||
* Better status updates and finalizer handling
|
* Better status updates and finalizer handling
|
||||||
* Dependent ressource handling (alongside optional dependent ressources)
|
* Dependent resource handling (alongside optional dependent resources)
|
||||||
|
|
||||||
## Operator Liefecycle Manager
|
## Operator Lifecycle Manager
|
||||||
|
|
||||||
> Manage the operator -> A operator for installing operators
|
> Manage the operator -> A operator for installing operators
|
||||||
|
|
||||||
|
@ -28,16 +28,16 @@ I'll skip the baseline introduction of what an operator is.
|
||||||
|
|
||||||
* New API Set -> The old CRDs were overwhelming
|
* New API Set -> The old CRDs were overwhelming
|
||||||
* More GitOps friendly with per-tenant support
|
* More GitOps friendly with per-tenant support
|
||||||
* Prediscribes update paths (maybe upgrade)
|
* Prescribes update paths (maybe upgrade)
|
||||||
* Suport for operator bundels as k8s manifests/helmchart
|
* Support for operator bundles as k8s manifests/helm chart
|
||||||
|
|
||||||
### OLM v1 Components
|
### OLM v1 Components
|
||||||
|
|
||||||
* Cluster Extension (User-Facing API)
|
* Cluster Extension (User-Facing API)
|
||||||
* Defines the app you want to install
|
* Defines the app you want to install
|
||||||
* Resolvs requirements through catalogd/depply
|
* Resolves requirements through CatalogD/depply
|
||||||
* Catalogd (Catalog Server/Operator)
|
* CatalogD (Catalog Server/Operator)
|
||||||
* Depply (Dependency/Contraint solver)
|
* Depply (Dependency/Constraint solver)
|
||||||
* Applier (Rukoak/kapp compatible)
|
* Applier (Rukoak/kapp compatible)
|
||||||
|
|
||||||
```mermaid
|
```mermaid
|
||||||
|
|
|
@ -12,15 +12,15 @@ Humor is present, but the main focus is still thetechnical integration
|
||||||
|
|
||||||
## Baseline
|
## Baseline
|
||||||
|
|
||||||
* Certmanager is the best™ way of getting certificats
|
* Cert manager is the best™ way of getting certificates
|
||||||
* Poster features: Autorenewal, ACME, PKI, HC Vault
|
* Poster features: Auto-renewal, ACME, PKI, HC Vault
|
||||||
* Numbers: 20M downloads 427 contributors 11.3 GitHub stars
|
* Numbers: 20M downloads 427 contributors 11.3 GitHub stars
|
||||||
* Currently on the gratuation path
|
* Currently on the graduation path
|
||||||
|
|
||||||
## History
|
## History
|
||||||
|
|
||||||
* 2016: Jetstack created kube-lego -> A operator that generated LE certificates for ingress based on annotations
|
* 2016: Jetstack created kube-lego -> A operator that generated LE certificates for ingress based on annotations
|
||||||
* 2o17: Certmanager launch -> Cert ressources and issuer ressources
|
* 2o17: Cert manager launch -> Cert resources and issuer resources
|
||||||
* 2020: v1.0.0 and joined CNCF sandbox
|
* 2020: v1.0.0 and joined CNCF sandbox
|
||||||
* 2022: CNCF incubating
|
* 2022: CNCF incubating
|
||||||
* 2024: Passed the CNCF security audit and on the way to graduation
|
* 2024: Passed the CNCF security audit and on the way to graduation
|
||||||
|
@ -30,16 +30,16 @@ Humor is present, but the main focus is still thetechnical integration
|
||||||
### How it came to be
|
### How it came to be
|
||||||
|
|
||||||
* The idea: Mix the digital certificate with the classical seal
|
* The idea: Mix the digital certificate with the classical seal
|
||||||
* Started as the stamping idea to celebrate v1 and send contributors a thank you with candels
|
* Started as the stamping idea to celebrate v1 and send contributors a thank you with candles
|
||||||
* Problems: Candels are not allowed -> Therefor glue gun
|
* Problems: Candles are not allowed -> Therefor glue gun
|
||||||
|
|
||||||
### How it works
|
### How it works
|
||||||
|
|
||||||
* Components
|
* Components
|
||||||
* RASPI with k3s
|
* Raspberry Pi with k3s
|
||||||
* Printer
|
* Printer
|
||||||
* Cert manager
|
* Cert manager
|
||||||
* A go-based webui
|
* A Go-based Web-UI
|
||||||
* QR-Code: Contains link to certificate with private key
|
* QR-Code: Contains link to certificate with private key
|
||||||
|
|
||||||
```mermaid
|
```mermaid
|
||||||
|
@ -53,14 +53,14 @@ flowchart LR
|
||||||
### What is new this year
|
### What is new this year
|
||||||
|
|
||||||
* Idea: Certs should be usable for TLS
|
* Idea: Certs should be usable for TLS
|
||||||
* Solution: The QR-Code links to a zip-download with the cert and provate key
|
* Solution: The QR-Code links to a zip-download with the cert and private key
|
||||||
* New: ECDSA for everything
|
* New: ECDSA for everything
|
||||||
* New: A stable root ca with intermediate for every conference
|
* New: A stable root ca with intermediate for every conference
|
||||||
* New: Guestbook that can only be signed with a booth issued certificate -> Available via script
|
* New: Guestbook that can only be signed with a booth issued certificate -> Available via script
|
||||||
|
|
||||||
## Learnings
|
## Learnings
|
||||||
|
|
||||||
* This demo is just a private CA with certmanager -> Can be applied to any PKI-usecase
|
* This demo is just a private CA with cert manager -> Can be applied to any PKI-usecases
|
||||||
* The certificate can be created via the CR, CSI driver (create secret and mount in container), ingress annotations, ...
|
* The certificate can be created via the CR, CSI driver (create secret and mount in container), ingress annotations, ...
|
||||||
* You can use multiple different Issuers (CA Issuer aka PKI, Let's Encrypt, Vault, AWS, ...)
|
* You can use multiple different Issuers (CA Issuer aka PKI, Let's Encrypt, Vault, AWS, ...)
|
||||||
|
|
||||||
|
@ -74,4 +74,4 @@ flowchart LR
|
||||||
## Conclusion
|
## Conclusion
|
||||||
|
|
||||||
* This is not just a demo -> Just apply it for machines
|
* This is not just a demo -> Just apply it for machines
|
||||||
* They have regular meetings (daily standups and bi-weekly)
|
* They have regular meetings (daily stand-ups and bi-weekly)
|
||||||
|
|
|
@ -7,14 +7,14 @@ tags:
|
||||||
- scaling
|
- scaling
|
||||||
---
|
---
|
||||||
|
|
||||||
A talk by TikTok/ByteDace (duh) focussed on using central controllers instead of on the edge.
|
A talk by TikTok/ByteDance (duh) focussed on using central controllers instead of on the edge.
|
||||||
|
|
||||||
## Background
|
## Background
|
||||||
|
|
||||||
> Global means non-china
|
> Global means non-china
|
||||||
|
|
||||||
* Edge platform team for cdn, livestreaming, uploads, realtime communication, etc.
|
* Edge platform team for CDN, livestreaming, uploads, real-time communication, etc.
|
||||||
* Around 250 cluster with 10-600 nodes each - mostly non-cloud aka baremetal
|
* Around 250 cluster with 10-600 nodes each - mostly non-cloud aka bare-metal
|
||||||
* Architecture: Control plane clusters (platform services) - data plane clusters (workload by other teams)
|
* Architecture: Control plane clusters (platform services) - data plane clusters (workload by other teams)
|
||||||
* Platform includes logs, metrics, configs, secrets, ...
|
* Platform includes logs, metrics, configs, secrets, ...
|
||||||
|
|
||||||
|
@ -24,28 +24,28 @@ A talk by TikTok/ByteDace (duh) focussed on using central controllers instead of
|
||||||
|
|
||||||
* Operators are essential for platform features
|
* Operators are essential for platform features
|
||||||
* As the feature requests increase, more operators are needed
|
* As the feature requests increase, more operators are needed
|
||||||
* The deployment of operators throughout many clusters is complex (namespace, deployments, pollicies, ...)
|
* The deployment of operators throughout many clusters is complex (namespace, deployments, policies, ...)
|
||||||
|
|
||||||
### Edge
|
### Edge
|
||||||
|
|
||||||
* Limited ressources
|
* Limited resources
|
||||||
* Cost implication of platfor features
|
* Cost implication of platform features
|
||||||
* Real time processing demands by platform features
|
* Real time processing demands by platform features
|
||||||
* Balancing act between ressorces used by workload vs platform features (20-25%)
|
* Balancing act between resources used by workload vs platform features (20-25%)
|
||||||
|
|
||||||
### The classic flow
|
### The classic flow
|
||||||
|
|
||||||
1. New feature get's requested
|
1. New feature gets requested
|
||||||
2. Use kube-buiders with the sdk to create the operator
|
2. Use kubebuider with the SDK to create the operator
|
||||||
3. Create namespaces and configs in all clusters
|
3. Create namespaces and configs in all clusters
|
||||||
4. Deploy operator to all clsuters
|
4. Deploy operator to all clusters
|
||||||
|
|
||||||
## Possible Solution
|
## Possible Solution
|
||||||
|
|
||||||
### Centralized Control Plane
|
### Centralized Control Plane
|
||||||
|
|
||||||
* Problem: The controller implementation is limited to a cluster boundry
|
* Problem: The controller implementation is limited to a cluster boundary
|
||||||
* Idea: Why not create a signle operator that can manage multiple edge clusters
|
* Idea: Why not create a single operator that can manage multiple edge clusters
|
||||||
* Implementation: Just modify kubebuilder to accept multiple clients (and caches)
|
* Implementation: Just modify kubebuilder to accept multiple clients (and caches)
|
||||||
* Result: It works -> Simpler deployment and troubleshooting
|
* Result: It works -> Simpler deployment and troubleshooting
|
||||||
* Concerns: High code complexity -> Long familiarization
|
* Concerns: High code complexity -> Long familiarization
|
||||||
|
@ -54,14 +54,14 @@ A talk by TikTok/ByteDace (duh) focussed on using central controllers instead of
|
||||||
### Attempt it a bit more like kubebuilder
|
### Attempt it a bit more like kubebuilder
|
||||||
|
|
||||||
* Each cluster has its own manager
|
* Each cluster has its own manager
|
||||||
* There is a central multimanager that starts all of the cluster specific manager
|
* There is a central multimanager that starts all the cluster specific manager
|
||||||
* Controller registration to the manager now handles cluster names
|
* Controller registration to the manager now handles cluster names
|
||||||
* The reconciler knows which cluster it is working on
|
* The reconciler knows which cluster it is working on
|
||||||
* The multi cluster management basicly just tets all of the cluster secrets and create a manager+controller for each cluster secret
|
* The multi cluster management basically just test all the cluster secrets and create a manager+controller for each cluster secret
|
||||||
* Challenges: Network connectifiy
|
* Challenges: Network connectivity
|
||||||
* Solutions:
|
* Solutions:
|
||||||
* Dynamic add/remove of clusters with go channels to prevent pod restarts
|
* Dynamic add/remove of clusters with go channels to prevent pod restarts
|
||||||
* Connectivity health checks -> For loss the recreate manager get's triggered
|
* Connectivity health checks -> For loss the `recreate manager` gets triggered
|
||||||
|
|
||||||
```mermaid
|
```mermaid
|
||||||
flowchart TD
|
flowchart TD
|
||||||
|
@ -80,7 +80,7 @@ flowchart LR
|
||||||
|
|
||||||
## Conclusion
|
## Conclusion
|
||||||
|
|
||||||
* Acknowlege ressource contrains on edge
|
* Acknowledge resource constraints on edge
|
||||||
* Embrace open source adoption instead of build your own
|
* Embrace open source adoption instead of build your own
|
||||||
* Simplify deployment
|
* Simplify deployment
|
||||||
* Recognize your own optionated approach and it's use cases
|
* Recognize your own opinionated approach and it's use cases
|
||||||
|
|
|
@ -15,22 +15,22 @@ Notes may be a bit unstructured due to tired note taker.
|
||||||
|
|
||||||
## Basics
|
## Basics
|
||||||
|
|
||||||
* Fluentbit is compatible with
|
* FluentBit is compatible with
|
||||||
* prometheus (It can replace the prometheus scraper and node exporter)
|
* Prometheus (It can replace the Prometheus scraper and node exporter)
|
||||||
* openmetrics
|
* OpenMetrics
|
||||||
* opentelemetry (HTTPS input/output)
|
* OpenTelemetry (HTTPS input/output)
|
||||||
* FluentBit can export to Prometheus, Splunk, InfluxDB or others
|
* FluentBit can export to Prometheus, Splunk, InfluxDB or others
|
||||||
* So pretty much it can be used to collect data from a bunch of sources and pipe it out to different backend destinations
|
* So pretty much it can be used to collect data from a bunch of sources and pipe it out to different backend destinations
|
||||||
* Fluent ecosystem: No vendor lock-in to observability
|
* Fluent ecosystem: No vendor lock-in to observability
|
||||||
|
|
||||||
### Arhitectures
|
### Architectures
|
||||||
|
|
||||||
* The fluent agent collects data and can send it to one or multiple locations
|
* The fluent agent collects data and can send it to one or multiple locations
|
||||||
* FluentBit can be used for aggregation from other sources
|
* FluentBit can be used for aggregation from other sources
|
||||||
|
|
||||||
### In the kubernetes logging ecosystem
|
### In the Kubernetes logging ecosystem
|
||||||
|
|
||||||
* Pods logs to console -> Streamed stdout/err gets piped to file
|
* Pod logs to console -> Streamed stdout/err gets piped to file
|
||||||
* The logs in the file get encoded as JSON with metadata (date, channel)
|
* The logs in the file get encoded as JSON with metadata (date, channel)
|
||||||
* Labels and annotations only live in the control plane -> You have to collect it additionally -> Expensive
|
* Labels and annotations only live in the control plane -> You have to collect it additionally -> Expensive
|
||||||
|
|
||||||
|
@ -56,8 +56,8 @@ flowchart LR
|
||||||
|
|
||||||
### Solution
|
### Solution
|
||||||
|
|
||||||
* Solution: Processor - a seperate thread segmented by telemetry type
|
* Solution: Processor - a separate thread segmented by telemetry type
|
||||||
* Plugins can be written in your favourite language /c, rust, go, ...)
|
* Plugins can be written in your favorite language (c, rust, go, ...)
|
||||||
|
|
||||||
```mermaid
|
```mermaid
|
||||||
flowchart LR
|
flowchart LR
|
||||||
|
@ -74,7 +74,7 @@ flowchart LR
|
||||||
### General new features in v3
|
### General new features in v3
|
||||||
|
|
||||||
* Native HTTP/2 support in core
|
* Native HTTP/2 support in core
|
||||||
* Contetn modifier with multiple operations (insert, upsert, delete, rename, hash, extract, convert)
|
* Content modifier with multiple operations (insert, upsert, delete, rename, hash, extract, convert)
|
||||||
* Metrics selector (include or exclude metrics) with matcher (name, prefix, substring, regex)
|
* Metrics selector (include or exclude metrics) with matcher (name, prefix, substring, regex)
|
||||||
* SQL processor -> Use SQL expression for selections (instead of filters)
|
* SQL processor -> Use SQL expression for selections (instead of filters)
|
||||||
* Better OpenTelemetry output
|
* Better OpenTelemetry output
|
||||||
|
|
|
@ -15,15 +15,15 @@ Who have I talked to today, are there any follow-ups or learnings?
|
||||||
They will follow up with a quick demo
|
They will follow up with a quick demo
|
||||||
{{% /notice %}}
|
{{% /notice %}}
|
||||||
|
|
||||||
* A interesting tektone-based CI/CD solutions that also integrates with oter platforms
|
* An interesting tektone-based CI/CD solutions that also integrates with other platforms
|
||||||
* May be interesting for either ODIT or some of our customers
|
* May be interesting for either ODIT.Services or some of our customers
|
||||||
|
|
||||||
## Docker
|
## Docker
|
||||||
|
|
||||||
* Talked to one salesperson just aboput the general conference
|
* Talked to one salesperson just about the general conference
|
||||||
* Talked to one technical guy about docker build time optimization
|
* Talked to one technical guy about docker build time optimization
|
||||||
|
|
||||||
## Rancher/Suse
|
## Rancher/SUSE
|
||||||
|
|
||||||
* I just got some swag, a friend of mine got a demo focussing on runtime security
|
* I just got some swag, a friend of mine got a demo focussing on runtime security
|
||||||
|
|
||||||
|
|
|
@ -4,7 +4,7 @@ title: Operators
|
||||||
|
|
||||||
## Observability
|
## Observability
|
||||||
|
|
||||||
* Export reconcile loop steps as opentelemetry traces
|
* Export reconcile loop steps as OpenTelemetry traces
|
||||||
|
|
||||||
## Work queue
|
## Work queue
|
||||||
|
|
||||||
|
|
|
@ -3,11 +3,11 @@ title: Flux
|
||||||
weight: 2
|
weight: 2
|
||||||
---
|
---
|
||||||
|
|
||||||
Some lessonslearned from flux talsk and from talking to the flux team.
|
Some lessons learned from flux talks and from talking to the flux team.
|
||||||
|
|
||||||
## Helm Autupdate
|
## Helm Auto-update
|
||||||
|
|
||||||
* Currently you can just use the normal image autoupdate machanism
|
* Currently, you can just use the normal image auto-update mechanism
|
||||||
* Requirement: The helm chart is stored as a OCI-Artifact
|
* Requirement: The helm chart is stored as an OCI-Artifact
|
||||||
* How: Just create the usual CRs and annotations
|
* How: Just create the usual CRs and annotations
|
||||||
* They are also working on generalizing the autoupdate Process to fitt all OCI articacts (comming soon)
|
* They are also working on generalizing the auto-update Process to fit all OCI artifacts (coming soon)
|
||||||
|
|
|
@ -2,7 +2,7 @@
|
||||||
title: Check this out
|
title: Check this out
|
||||||
---
|
---
|
||||||
|
|
||||||
Just a loose list of stuff that souded interesting
|
Just a loose list of stuff that sounded interesting
|
||||||
|
|
||||||
* Dapr
|
* Dapr
|
||||||
* etcd backups
|
* etcd backups
|
||||||
|
|
|
@ -4,4 +4,4 @@ title: Lessons Learned
|
||||||
weight: 99
|
weight: 99
|
||||||
---
|
---
|
||||||
|
|
||||||
Interesting lessons learned + tipps/tricks.
|
Interesting lessons learned + tips/tricks.
|
||||||
|
|