docs(day1): Added observability platform talk notes

2025-07-21 16:43:35 +02:00 · 2025-07-21 16:43:35 +02:00 · 021ab45ec5
commit 021ab45ec5
parent 0a464e0dfd
1 changed files with 51 additions and 0 deletions
--- a/content/day1/10_observability.md
+++ b/content/day1/10_observability.md
@@ -0,0 +1,51 @@
+---
+title: "Think Big: Monitoring Stack was yesterday - Observability Platform at scale!"
+weight: 10
+tags:
+ - monitoring
+ - observability
+---
+
+<!-- {{% button href="https://youtu.be/rkteV6Mzjfs" style="warning" icon="video" %}}Watch talk on YouTube{{% /button %}} -->
+<!-- {{% button href="https://docs.google.com/presentation/d/1nEK0CVC_yQgIDqwsdh-PRihB6dc9RyT-" style="tip" icon="person-chalkboard" %}}Slides{{% /button %}} -->
+
+## Where do you start with monitoring
+
+- The cloud standard solution: Prometheus
+- But: What if we don't just monitor one app but a cluster or multiple clusters?
+- Problem: Prometheus isn't quite the best when it comes to scaling
+- And: We want Dashboards, Traces, Alerting, Logs, Auditing, ...
+
+## Trying to build the master monitoring by just adding stuff on the side
+
+- Add custom stuff
+- More complex setups
+- Less and less documentation and standardization
+
+## But how do we regain control
+
+- Product Thinking: Let's collect the problems
+- Result: No clear separation of the product, no vision (just firefighting), and a wish for better releases and improved resource usage
+
+### Transition
+
+1. Overview of the current stack -> Just list all components -> We're no longer just a monitoring stack, but we do observability
+2. 
+   1. Long term goals and vision -> Add clear interfaces and contracts (hey platform mindset, we've heard that one before) based on expectations
+   2. Target groups and journeys -> Clear responsibility cut-off between platform<->users
+3. Improve the platform -> Needs full buy-in to be the **central**, **open** and **self-service** platform
+   - In their case: Focus on Mimir instead of Prometheus and Alloy but keep Grafana and Loki
+   - Define everything else as out of scope (for now)
+   - Expand scope by improving the experience instead of just "adding tools"
+
+## Pillars of Observability
+
+- Data management: Ingest, Query
+- Dashboard Management: Create, Update, Export
+- Alert Management: Rules, Routing, Analytics, Silence
+
+## Wrap up
+
+- Do I need monitoring or more (both are fine)?
+- Identify the target audience and their journey (not just the tools they want to use)
+- Improve the experience and say no if a user requests something that would not improve it