final talks of day4
parent
c2326acfea
commit
ef360b2e89
|
@ -0,0 +1,82 @@
|
|||
---
|
||||
title: "TikTok’s Edge Symphony: Scaling Beyond Boundaries with Multi-Cluster Controllers"
|
||||
weight: 6
|
||||
---
|
||||
|
||||
A talk by TikTok/ByteDance (duh) focused on running controllers centrally instead of on the edge.
|
||||
|
||||
## Background
|
||||
|
||||
> Global means non-China
|
||||
|
||||
* Edge platform team for CDN, livestreaming, uploads, real-time communication, etc.
|
||||
* Around 250 clusters with 10-600 nodes each - mostly non-cloud, aka bare metal
|
||||
* Architecture: Control plane clusters (platform services) - data plane clusters (workload by other teams)
|
||||
* Platform includes logs, metrics, configs, secrets, ...
|
||||
|
||||
## Challenges
|
||||
|
||||
### Operators
|
||||
|
||||
* Operators are essential for platform features
|
||||
* As the feature requests increase, more operators are needed
|
||||
* Deploying operators across many clusters is complex (namespaces, deployments, policies, ...)
|
||||
|
||||
### Edge
|
||||
|
||||
* Limited resources
|
||||
* Cost implications of platform features
|
||||
* Real-time processing demands of platform features
|
||||
* Balancing act between resources used by workloads vs. platform features (20-25%)
|
||||
|
||||
### The classic flow
|
||||
|
||||
1. New feature gets requested
|
||||
2. Use kubebuilder with the SDK to create the operator
|
||||
3. Create namespaces and configs in all clusters
|
||||
4. Deploy the operator to all clusters
|
||||
|
||||
## Possible Solution
|
||||
|
||||
### Centralized Control Plane
|
||||
|
||||
* Problem: The controller implementation is limited to a cluster boundary
|
||||
* Idea: Why not create a single operator that can manage multiple edge clusters?
|
||||
* Implementation: Just modify kubebuilder to accept multiple clients (and caches)
|
||||
* Result: It works -> Simpler deployment and troubleshooting
|
||||
* Concerns: High code complexity -> Long familiarization
|
||||
* Balance between "simple central operator" and operator-complexity is hard
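The "multiple clients" idea can be pictured roughly as follows. This is only a minimal sketch using the standard library, not the code shown in the talk: `ClusterClient`, `fakeClient`, `CentralReconciler` and the cluster names are hypothetical stand-ins for the per-cluster clients and caches a modified kubebuilder setup would hold.

```go
package main

import (
	"fmt"
	"sort"
)

// ClusterClient is a hypothetical stand-in for a per-cluster API client
// (in a real operator this would be a client plus cache for that cluster).
type ClusterClient interface {
	Apply(object string) error
}

// fakeClient records applied objects so the sketch runs without a cluster.
type fakeClient struct {
	applied []string
}

func (f *fakeClient) Apply(object string) error {
	f.applied = append(f.applied, object)
	return nil
}

// CentralReconciler holds one client per edge cluster instead of a single
// client bound to the cluster the operator runs in.
type CentralReconciler struct {
	clients map[string]ClusterClient
}

// Reconcile applies the object to the named cluster only.
func (r *CentralReconciler) Reconcile(cluster, object string) error {
	c, ok := r.clients[cluster]
	if !ok {
		return fmt.Errorf("unknown cluster %q", cluster)
	}
	return c.Apply(object)
}

// Clusters returns the managed cluster names in sorted order.
func (r *CentralReconciler) Clusters() []string {
	names := make([]string, 0, len(r.clients))
	for name := range r.clients {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	r := &CentralReconciler{clients: map[string]ClusterClient{
		"edge-a": &fakeClient{},
		"edge-b": &fakeClient{},
	}}
	for _, cluster := range r.Clusters() {
		if err := r.Reconcile(cluster, "configmap/platform-config"); err != nil {
			fmt.Println("reconcile failed:", err)
		}
	}
	fmt.Println("reconciled clusters:", r.Clusters())
}
```

The sketch also hints at the stated concern: every reconcile path has to carry the cluster name around, which is where the extra code complexity comes from.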
|
||||
|
||||
### Attempt it a bit more like kubebuilder
|
||||
|
||||
* Each cluster has its own manager
|
||||
* There is a central multimanager that starts all of the cluster-specific managers
|
||||
* Controller registration to the manager now handles cluster names
|
||||
* The reconciler knows which cluster it is working on
|
||||
* The multi-cluster management basically just gets all of the cluster secrets and creates a manager + controller for each cluster secret
|
||||
* Challenges: Network connectivity
|
||||
* Solutions:
|
||||
  * Dynamic add/remove of clusters with go channels to prevent pod restarts
|
||||
  * Connectivity health checks -> On connection loss, recreating the manager gets triggered
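A rough sketch of the health-check idea, assuming a simple probe per cluster. `Prober`, `recreateUnhealthy` and the cluster names are hypothetical and only illustrate the "probe fails, so recreate the manager" step; how the talk's implementation actually probes connectivity was not covered in these notes.

```go
package main

import "fmt"

// Prober reports whether a cluster's API server is reachable (hypothetical).
type Prober func(cluster string) bool

// recreateUnhealthy probes every cluster and calls recreate for each one
// whose probe fails. It returns the recreated cluster names in input order.
func recreateUnhealthy(clusters []string, probe Prober, recreate func(string)) []string {
	var recreated []string
	for _, c := range clusters {
		if !probe(c) {
			recreate(c)
			recreated = append(recreated, c)
		}
	}
	return recreated
}

func main() {
	// Assumed reachability state, purely for illustration.
	reachable := map[string]bool{"edge-a": true, "edge-b": false}
	probe := func(c string) bool { return reachable[c] }
	recreate := func(c string) { fmt.Println("recreating manager for", c) }

	fmt.Println(recreateUnhealthy([]string{"edge-a", "edge-b"}, probe, recreate))
}
```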
|
||||
|
||||
```mermaid
|
||||
flowchart TD
|
||||
mcm-->m1
|
||||
mcm-->m2
|
||||
mcm-->m3
|
||||
```
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
secrets-->ch(go channels)
|
||||
ch-->|CREATE|create(Create manager + Add controller + Start manager)
|
||||
ch-->|UPDATE|update(Stop manager + Create manager + Add controller + Start manager)
|
||||
ch-->|DELETE|delete(Stop manager)
|
||||
```
|
||||
|
||||
## Conclusion
|
||||
|
||||
* Acknowledge resource constraints on the edge
|
||||
* Embrace open source adoption instead of building your own
|
||||
* Simplify deployment
|
||||
* Recognize your own opinionated approach and its use cases
|
|
@ -0,0 +1,78 @@
|
|||
---
|
||||
title: "Fluent Bit v3: Unified Layer for Logs, Metrics and Traces"
|
||||
weight: 7
|
||||
---
|
||||
|
||||
The last talk of the conference.
|
||||
Notes may be a bit unstructured because the note taker was tired.
|
||||
|
||||
## Background
|
||||
|
||||
* FluentD is already graduated
|
||||
* FluentBit is a sub-project of FluentD (also graduated)
|
||||
|
||||
## Basics
|
||||
|
||||
* Fluentbit is compatible with
|
||||
  * Prometheus (it can replace the Prometheus scraper and node exporter)
|
||||
  * OpenMetrics
|
||||
  * OpenTelemetry (HTTPS input/output)
|
||||
* FluentBit can export to Prometheus, Splunk, InfluxDB or others
|
||||
* So pretty much it can be used to collect data from a bunch of sources and pipe it out to different backend destinations
|
||||
* Fluent ecosystem: No vendor lock-in for observability
|
||||
|
||||
### Architectures
|
||||
|
||||
* The fluent agent collects data and can send it to one or multiple locations
|
||||
* FluentBit can be used for aggregation from other sources
|
||||
|
||||
### In the Kubernetes logging ecosystem
|
||||
|
||||
* Pods log to the console -> The streamed stdout/stderr gets piped to a file
|
||||
* The logs in the file get encoded as JSON with metadata (date, channel)
|
||||
* Labels and annotations only live in the control plane -> You have to collect them separately -> Expensive
|
||||
|
||||
## New stuff
|
||||
|
||||
### Limitations with classic architectures
|
||||
|
||||
* Problem: Multiple filters slow down the main loop
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
subgraph main[Main Thread/Event loop]
|
||||
buffer
|
||||
schedule
|
||||
retry
|
||||
filter1
|
||||
filter2
|
||||
filter3
|
||||
end
|
||||
in-->|pipe in data|main
|
||||
main-->|filter and pipe out|out
|
||||
```
|
||||
|
||||
### Solution
|
||||
|
||||
* Solution: Processor - a separate thread segmented by telemetry type
|
||||
* Plugins can be written in your favourite language (C, Rust, Go, ...)
|
||||
|
||||
```mermaid
|
||||
flowchart LR
|
||||
subgraph in
|
||||
reader
|
||||
streamner1
|
||||
processor2
|
||||
processor3
|
||||
end
|
||||
in-->|pipe in data|main(Main Thread/Event loop)
|
||||
main-->|filter and pipe out|out
|
||||
```
|
||||
|
||||
### General new features in v3
|
||||
|
||||
* Native HTTP/2 support in core
|
||||
* Content modifier with multiple operations (insert, upsert, delete, rename, hash, extract, convert)
|
||||
* Metrics selector (include or exclude metrics) with matcher (name, prefix, substring, regex)
|
||||
* SQL processor -> Use SQL expression for selections (instead of filters)
|
||||
* Better OpenTelemetry output
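As an illustration of the metrics selector bullet, the include/exclude logic with the four matcher kinds can be sketched like this. It is an interpretation of the notes in plain Go and does not mirror Fluent Bit's actual configuration keys or implementation; `Selector` and its fields are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Selector keeps or drops metrics by name (hypothetical structure).
type Selector struct {
	Exclude bool   // false: keep matching metrics, true: drop them
	Matcher string // "name", "prefix", "substring" or "regex"
	Pattern string
}

// matches reports whether a metric name matches the selector's pattern.
func (s Selector) matches(name string) (bool, error) {
	switch s.Matcher {
	case "name":
		return name == s.Pattern, nil
	case "prefix":
		return strings.HasPrefix(name, s.Pattern), nil
	case "substring":
		return strings.Contains(name, s.Pattern), nil
	case "regex":
		return regexp.MatchString(s.Pattern, name)
	default:
		return false, fmt.Errorf("unknown matcher %q", s.Matcher)
	}
}

// Apply returns the metric names that survive the selector.
func (s Selector) Apply(names []string) ([]string, error) {
	var kept []string
	for _, n := range names {
		ok, err := s.matches(n)
		if err != nil {
			return nil, err
		}
		// Include mode keeps matches, exclude mode keeps non-matches.
		if ok != s.Exclude {
			kept = append(kept, n)
		}
	}
	return kept, nil
}

func main() {
	metrics := []string{"node_cpu_seconds_total", "node_memory_free_bytes", "fluentbit_uptime"}
	sel := Selector{Matcher: "prefix", Pattern: "node_"}
	kept, err := sel.Apply(metrics)
	if err != nil {
		fmt.Println("selector error:", err)
		return
	}
	fmt.Println(kept)
}
```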
|
|
@ -4,4 +4,8 @@ title: Operators
|
|||
|
||||
## Observability
|
||||
|
||||
* Export reconcile loop steps as opentelemetry traces
|
||||
* Export reconcile loop steps as opentelemetry traces
|
||||
|
||||
## Work queue
|
||||
|
||||
* Go channels as queues
|
||||
|
|