---
title: "From Zero to Hero: Scaling Postgres in Kubernetes Using the Power of CloudNativePG"
weight: 8
tags:
  - platform
  - operators
  - db
---

A short talk given as part of Data on Kubernetes Day, presented by the VP of Cloud Native at EDB (one of the biggest PG contributors).

Stated target: Make the world your single point of failure

## Proposal

* Get rid of vendor lock-in by using the OSS projects PG, K8s and CnPG
* PG was named DB of the year 2023 (and several times before that)
* CnPG is a Level 5 operator (the highest maturity level)

## 4 Pillars

* Seamless Kube API Integration (Operator Pattern)
* Advanced observability (Prometheus Exporter, JSON logging)
* Declarative Config (Deploy, Scale, Maintain)
* Secure by default (Robust containers, mTLS, and so on)

## Clusters

* Basic resource that defines name, instances, sync and storage (and other parameters that have sane defaults)
* Implementation: The operator creates:
  * The volumes: PGDATA and WAL (write-ahead log)
  * Primary and Read-Write Service
  * Replicas
  * Read-Only Service (points at replicas)
* Failover:
  * Failure detected
  * Stop R/W Service
  * Promote Replica
  * Activate R/W Service
  * Kill old primary and demote to replica
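
The failover sequence above can be sketched as a small state machine. This is only a toy model of the ordering of the steps, not how the operator is implemented: `Instance`, `ToyCluster` and the instance names are hypothetical, and health detection is reduced to a boolean flag.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Instance:
    name: str
    role: str  # "primary" or "replica"
    healthy: bool = True


@dataclass
class ToyCluster:
    instances: List[Instance]
    rw_service_target: Optional[str] = None  # instance the R/W Service routes to
    log: List[str] = field(default_factory=list)

    def primary(self) -> Instance:
        return next(i for i in self.instances if i.role == "primary")

    def failover(self) -> None:
        old = self.primary()
        if old.healthy:
            return  # no failure, nothing to do

        # 1. Failure detected
        self.log.append(f"failure detected on {old.name}")

        # 2. Stop the R/W Service so no client writes to the failed primary
        self.rw_service_target = None
        self.log.append("R/W service stopped")

        # 3. Promote the first healthy replica
        new = next(i for i in self.instances if i.role == "replica" and i.healthy)
        new.role = "primary"
        self.log.append(f"{new.name} promoted")

        # 4. Activate the R/W Service again, now pointing at the new primary
        self.rw_service_target = new.name
        self.log.append(f"R/W service points at {new.name}")

        # 5. Kill the old primary and demote it to a replica
        old.role = "replica"
        self.log.append(f"{old.name} demoted to replica")


cluster = ToyCluster(
    instances=[
        Instance("pg-1", "primary"),
        Instance("pg-2", "replica"),
        Instance("pg-3", "replica"),
    ],
    rw_service_target="pg-1",
)
cluster.instances[0].healthy = False  # simulate a primary failure
cluster.failover()
print(cluster.rw_service_target)
print("\n".join(cluster.log))
```

The point of the ordering is that the R/W Service is taken away before a replica is promoted, so there is never a moment where writes can reach two primaries.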

## Backup/Recovery

* Continuous backup: WAL archiving to an object store
* Physical backup: Taken from the primary or a standby, stored in an object store or kube volumes
* Recovery: Copy the full backup and apply WAL until the target (last transaction or a specific timestamp) is reached
* Replica Cluster: Basically creates a new cluster through a full recovery, but keeps that cluster in read-only replica mode
* Planned: Backup Plugin Interface
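
The recovery bullet (restore a full backup, then replay WAL up to a target) can be illustrated with a toy model. This sketch assumes a key/value dictionary as the "database" and compares plain timestamps; real Postgres replays WAL by log position from the backup's checkpoint. `BaseBackup`, `WalRecord` and `recover` are hypothetical names made up for this example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class BaseBackup:
    taken_at: datetime
    data: Dict[str, int]  # snapshot of the toy database


@dataclass
class WalRecord:
    committed_at: datetime
    key: str
    value: int


def recover(
    backups: List[BaseBackup],
    wal: List[WalRecord],
    target: Optional[datetime] = None,
) -> Dict[str, int]:
    """Restore the newest base backup at or before `target`, then replay WAL.

    With target=None, all WAL is replayed (recover to the last transaction).
    """
    candidates = [b for b in backups if target is None or b.taken_at <= target]
    if not candidates:
        raise ValueError("no base backup at or before the recovery target")
    base = max(candidates, key=lambda b: b.taken_at)

    state = dict(base.data)  # copy the full backup
    for rec in sorted(wal, key=lambda r: r.committed_at):
        if rec.committed_at <= base.taken_at:
            continue  # already contained in the base backup
        if target is not None and rec.committed_at > target:
            break  # recovery target reached
        state[rec.key] = rec.value
    return state


day = lambda h, m: datetime(2024, 3, 19, h, m)
backups = [BaseBackup(day(10, 0), {"a": 1})]
wal = [
    WalRecord(day(10, 5), "a", 2),
    WalRecord(day(10, 10), "b", 3),
    WalRecord(day(10, 15), "a", 4),
]

at_target = recover(backups, wal, target=day(10, 12))
latest = recover(backups, wal)
print(at_target, latest)
```

A replica cluster, as described above, can be pictured as the same loop that never stops: it keeps replaying new WAL as it arrives instead of ending at a target.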

## Multi-Cluster

* Just create a replica cluster on another kube cluster, fed by WAL files from S3 (lags 5 minutes behind)
* You can also activate streaming replication

## Recommended architecture

* Dev Cluster: 1 instance without PDB (PodDisruptionBudget) and with continuous backup
* Prod: 3 nodes with automatic failover and continuous backups
* Symmetric: Two clusters
  * Primary: 3-node cluster
  * Secondary: WAL-based 3-node cluster with a designated primary (to take over if the primary cluster fails)
* Symmetric Streaming: Same as Secondary, but you manually enable the streaming API for live replication
* Cascading Replication: Scale Symmetric to more clusters
* Single availability zone: Well, do your best to spread across nodes and aspire to stretch Kubernetes across more AZs
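
As a rough illustration of the difference between the dev and prod setups above, the following sketch builds minimal `Cluster` manifests as Python dictionaries and prints one as JSON (which `kubectl apply -f -` also accepts). Only `instances` and `storage.size` are set. The cluster names and the `10Gi` size are assumptions for the example, and backup and PDB settings are deliberately left out rather than guessed.

```python
import json
from typing import Any, Dict


def cluster_manifest(name: str, instances: int, storage_size: str = "10Gi") -> Dict[str, Any]:
    """Build a minimal CloudNativePG Cluster manifest (illustrative values only)."""
    return {
        "apiVersion": "postgresql.cnpg.io/v1",
        "kind": "Cluster",
        "metadata": {"name": name},
        "spec": {
            # 1 instance: no replicas, so no automatic failover
            # 3 instances: one primary plus two replicas
            "instances": instances,
            "storage": {"size": storage_size},
        },
    }


dev = cluster_manifest("pg-dev", instances=1)    # hypothetical name
prod = cluster_manifest("pg-prod", instances=3)  # hypothetical name

print(json.dumps(prod, indent=2))
```

Everything not listed falls back to the operator's defaults, which is the "declarative config" pillar in practice: the two environments differ by a single number.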

## Roadmap

* Replica Cluster (Symmetric) Switchover
* Synchronous Symmetric
* 3rd Party Plugins
* Manage DBs via the Operator
* Storage Autoscaling