---
title: Day 2000 - Migrating from kubeadm + ansible to clusterapi+talos
weight: 2
tags:
- kubecon
- platform
---

{{% button href="https://youtu.be/uQ_WN1kuDo0" style="warning" icon="video" %}}Watch talk on YouTube{{% /button %}}
{{% button href="https://static.sched.com/hosted_files/kccnceu2025/fd/day2000-migration-ClusterAPI-talos.pdf" style="tip" icon="person-chalkboard" %}}Slides{{% /button %}}

## Background

- They use large, shared clusters
- The oldest cluster is 2099 days (5.8 years) old
- On-prem, hosted on vSphere with vanilla kubeadm
- Fun fact: They run chaosmonkey on all clusters -> this automatically prepares them for updates

### Legacy provisioning

1. Terraform creates a Debian VM
2. Deploy base tools with Puppet
3. Register the nodes in an inventory YAML file
4. Run an Ansible playbook -> renders configs and runs kubeadm
5. Configure ArgoCD

### Target

- Use Cluster API (CAPI) to manage the workload clusters
- Basic CRDs: Cluster, MachineDeployment, Machine
- Talos: Immutable, minimal, ephemeral, with declarative config via a gRPC API

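To make the CRD list above concrete, here is a minimal sketch of a worker `MachineDeployment`. It assumes the Talos bootstrap provider and the vSphere infrastructure provider; the talk notes do not name the exact providers, and all names, the namespace, the replica count and the version are hypothetical placeholders.

```yaml
# Sketch only: names, namespace, replicas and version are placeholders.
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: workload-a-workers
  namespace: workload-a
spec:
  clusterName: workload-a
  replicas: 3
  selector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: workload-a
  template:
    metadata:
      labels:
        cluster.x-k8s.io/cluster-name: workload-a
    spec:
      clusterName: workload-a
      version: v1.31.0
      bootstrap:
        configRef:
          # Assumption: Talos bootstrap provider
          apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
          kind: TalosConfigTemplate
          name: workload-a-workers
      infrastructureRef:
        # Assumption: vSphere infrastructure provider
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: VSphereMachineTemplate
        name: workload-a-workers
```

Each replica of the deployment becomes one `Machine` object, which the infrastructure provider turns into a VM.
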
## Migration

1. Config matching between kubeadm and Talos + CAPI
2. Import PKI/Certs
3. Create ClusterAPI CRDs
4. Add ClusterAPI Nodes
5. Remove kubeadm nodes

### 1. Config matching

1. Service account issuer: Talos has its own default
2. etcd encryption key names are hardcoded in Talos
3. Re-encrypt all secrets (get secrets, replace secrets)

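As a rough illustration of the first point, the issuer can be pinned in the Talos machine config so that the new control plane nodes keep using the issuer the existing cluster already has. This is a sketch, not the speakers' actual config: the value shown is the usual kubeadm default and is an assumption, so check the `--service-account-issuer` flag of the running kube-apiserver before copying it.

```yaml
# Talos machine config patch for control plane nodes (sketch).
# Assumption: the legacy kube-apiserver runs with the kubeadm default issuer.
# Replace the value with the issuer your existing cluster actually uses.
cluster:
  apiServer:
    extraArgs:
      service-account-issuer: https://kubernetes.default.svc.cluster.local
```
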
### 2. PKI

1. Talos includes some logic that can generate a secrets bundle from an existing API
2. Import: The etcd, k8s, serviceaccount and OS (Talos-specific, used for Talos API auth) certificates

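For orientation, this is roughly what a Talos secrets bundle looks like, with the four imported items from the list above marked. The layout is written from memory of the Talos secrets file format and should be treated as an assumption; every value is a placeholder.

```yaml
# Sketch of a Talos secrets bundle; field layout is an assumption,
# all values are placeholders.
cluster:
  id: "<cluster id>"
  secret: "<cluster secret>"
secrets:
  bootstraptoken: "<bootstrap token>"
  secretboxencryptionsecret: "<etcd encryption secret>"
trustdinfo:
  token: "<trustd token>"
certs:
  etcd:                     # imported: etcd CA
    crt: "<base64 PEM>"
    key: "<base64 PEM>"
  k8s:                      # imported: Kubernetes CA
    crt: "<base64 PEM>"
    key: "<base64 PEM>"
  k8saggregator:
    crt: "<base64 PEM>"
    key: "<base64 PEM>"
  k8sserviceaccount:        # imported: service account signing key
    key: "<base64 PEM>"
  os:                       # Talos-specific CA used for Talos API auth
    crt: "<base64 PEM>"
    key: "<base64 PEM>"
```
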
### 3. CRDs

- One namespace per workload cluster
- Cluster-CRD: Ref to CP and Infrastructure
- ControlPlane-CRD: Creates the CP MDs
- Infrastructure: References the template for the worker MDs

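The two references on the Cluster object can be sketched as follows. The provider kinds (`TalosControlPlane`, `VSphereCluster`) are assumptions based on the Talos and vSphere setup described in these notes, and the names and namespace are hypothetical.

```yaml
# Sketch only: one namespace per workload cluster, names are placeholders.
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: workload-a
  namespace: workload-a
spec:
  controlPlaneRef:
    # Assumption: Talos control plane provider
    apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
    kind: TalosControlPlane
    name: workload-a-cp
  infrastructureRef:
    # Assumption: vSphere infrastructure provider
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: VSphereCluster
    name: workload-a
```
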
### 4. Add ClusterAPI Nodes

- Add new CP and worker nodes that are managed by CAPI to the cluster (slowly, stuff will break)
- Remove the old nodes one by one over weeks or months
- Potential problems:
  - Mismatched service account issuer
  - Missing etcd encryption key
  - Wrong etcd encryption key
  - Loss of quorum: `--force-new-cluster` can force recovery on one node of the etcd cluster

## Demo

I recommend watching the demo.
Talos seems pretty cool.

## Bootstrapping

- Kind cluster in a GitHub Action or on a local device
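
A throwaway bootstrap cluster of this kind needs very little configuration. The following kind config is a minimal sketch; the cluster name is hypothetical, and the notes do not say which options the speakers actually use.

```yaml
# Minimal kind config for a single-node bootstrap cluster (sketch).
# Usage: kind create cluster --config <this file>
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: capi-bootstrap        # hypothetical name
nodes:
  - role: control-plane
```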