docs(day0): OTEL feedback talk
51
content/day0/07_scalingsatisfaction.md
Normal file
@@ -0,0 +1,51 @@
---
title: "Scaling on satisfaction: Automated Rollouts Driven By User Feedback"
weight: 7
tags:
  - platformengineeringday
  - staging
  - rollout
  - feedback
  - otel
---

<!-- {{% button href="https://youtu.be/rkteV6Mzjfs" style="warning" icon="video" %}}Watch talk on YouTube{{% /button %}} -->
<!-- {{% button href="https://docs.google.com/presentation/d/1nEK0CVC_yQgIDqwsdh-PRihB6dc9RyT-" style="tip" icon="person-chalkboard" %}}Slides{{% /button %}} -->
<!-- {{% button href="https://github.com/thomasvitale/kubecon-2026-gitops" style="info" icon="code" %}}Code/Demo{{% /button %}} -->
{{% button href="https://whitneylee.com" style="info" icon="link" %}}Website/Homepage{{% /button %}}
<!-- {{% button href="https://thomasvitale.com" style="info" icon="link" %}}Website/Homepage{{% /button %}} -->

## What they are actually talking about

- A way of creating metrics/traces from an LLM app and analyzing them
- The integration of the user's feedback
- Basically: linking what each variant did on the server to the vote event, so that promotion can be based on feedback
- Combined with an intro to OTel

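
The linking of server-side variant data to vote events can be sketched in a few lines. This is a minimal illustration, not the speakers' implementation: `SessionSpan`, `VoteEvent`, `tally_votes`, and the variant names are hypothetical stand-ins, and it assumes each vote carries the trace ID of the user session that produced it.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionSpan:
    """Hypothetical stand-in for a user-session span exported by the app."""
    trace_id: str
    variant: str  # which variant served the session (assumed attribute)


@dataclass(frozen=True)
class VoteEvent:
    """Hypothetical stand-in for the vote recorded during that session."""
    trace_id: str
    liked: bool


def tally_votes(spans, votes):
    """Join each vote to the variant that served it and count likes/dislikes."""
    variant_by_trace = {s.trace_id: s.variant for s in spans}
    tally = {}
    for vote in votes:
        variant = variant_by_trace.get(vote.trace_id)
        if variant is None:
            continue  # no matching session span, so the vote cannot be attributed
        counts = tally.setdefault(variant, Counter())
        counts["like" if vote.liked else "dislike"] += 1
    return tally


spans = [
    SessionSpan("t1", "stable"),
    SessionSpan("t2", "canary"),
    SessionSpan("t3", "canary"),
]
votes = [
    VoteEvent("t1", False),
    VoteEvent("t2", True),
    VoteEvent("t3", True),
    VoteEvent("t9", True),  # unknown trace ID, ignored by the join
]
result = tally_votes(spans, votes)
```

Keying the join on the trace ID is what lets a single vote be attributed to one variant; any other shared identifier (for example a session ID) would work the same way.
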
## Baseline

- Question: How do we know whether content generated by LLMs and delivered to our users is good or bad?
- Idea: Use OTel and user feedback to drive canary deployments and rollouts
- Needed: A standardized vocabulary (so we can talk to any telemetry system)

## Demo Architecture

The talk opened with an evolving story in five parts. Attendees voted on whether they liked each part, emulating rollouts of a new application version with immediate user feedback. Flagger decided every thirty seconds whether the user feedback allowed promotion of the new version.

- The audience gets split across two variants running as Knative deployments
- The OTel Collector collects telemetry data, and the platform (Flagger) uses it as the basis for its decisions
- The votes were collected by creating a user session span with a span event (aka a log) for the vote -> Span events are deprecated and will be moved to a logs API

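
The periodic promotion check can be pictured as a small function that runs once per interval over the votes collected in that window. This is only a sketch under stated assumptions: it is not Flagger's actual algorithm or configuration, and `VoteWindow`, `decide`, `min_votes`, and `threshold` are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class VoteWindow:
    """Votes for the new variant collected during one check interval."""
    likes: int
    dislikes: int


def decide(window, min_votes=5, threshold=0.6):
    """Return 'promote', 'rollback', or 'hold' for one interval.

    min_votes and threshold are assumed values, not taken from the talk.
    """
    total = window.likes + window.dislikes
    if total < min_votes:
        return "hold"  # too little feedback to judge this interval
    approval = window.likes / total
    return "promote" if approval >= threshold else "rollback"


# One entry per thirty-second interval (made-up numbers).
windows = [VoteWindow(1, 1), VoteWindow(8, 2), VoteWindow(2, 8)]
decisions = [decide(w) for w in windows]
```

The `hold` branch is a deliberate design choice in this sketch: with only a handful of votes in a thirty-second window, a single dislike would otherwise be enough to trigger a rollback.
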
## Now to our platform

- Stack: Kubernetes on Hetzner with components (cert-manager, ingress, Knative, ...) packaged by Carvel
- Knative as the deployment target for apps
- Flagger as the release decision tool
- OTel for instrumentation
- Crossplane
- API: StoryApp CRD as the main interface for controlling what we want to deploy

## Takeaway

- Include user feedback in the decision process for new rollouts
- OTel can be used to automate this