---
title: Stop leaking Kubernetes service information via DNS
weight: 1
---
A talk by Google and Ivanti.
## Background
* RBAC is there to limit information access and control
* RBAC can be used to avoid interference in shared environments
* DNS is not covered by RBAC at all
### DNS in Kubernetes
* DNS info is always public -> No auth required
* Services are exposed to all clients (see the Go sketch below)
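
A minimal Go sketch of the point above, runnable from inside any pod; the service and namespace names (`backend`, `payments`) are made up for illustration:
```go
// Sketch only: from inside any pod, cluster DNS resolves other tenants'
// services without any authentication. Names below are hypothetical.
package main

import (
	"fmt"
	"net"
)

func main() {
	// A/AAAA lookup for a service the pod has no RBAC access to.
	ips, err := net.LookupHost("backend.payments.svc.cluster.local")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println("ClusterIP(s):", ips)
}
```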
## Isolation and Clusters
### Just don't share
* Especially for smaller, high-growth companies with infinite VC money
* Just give everyone their own cluster -> Problem solved
* Smaller companies (<1000) typically use many small clusters
### Shared Clusters
* Becomes important when cost is a concern and engineers don't have any platform knowledge
* A dedicated kube team can optimize hardware utilization and deliver updates fast -> Increased productivity by utilizing specialists
* Problem: Noisy neighbors and leaky DNS
## Leaks (demo)
### Base scenario
* Cluster with a bunch of deployments and services
* Creating a simple pod binds it to the default RBAC service account -> No access to anything
* Querying DNS info (aka services) still leaks everything (namespaces, services)
### Leak mechanics
* Leaks are based on the `<service>.<namespace>.svc.cluster.local` naming pattern
* You can also just reverse-lookup the entire service CIDR
* SRV records get created for each service, including the service ports (see the sketch below)
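
A hedged Go sketch of that enumeration, assuming a service CIDR of 10.96.0.0/24 and the same made-up `backend.payments` service; real values differ per cluster:
```go
// Sketch only: walk part of the (assumed) service CIDR with reverse
// lookups, then pull SRV records to learn the exposed ports.
package main

import (
	"fmt"
	"net"
)

func main() {
	// Reverse-lookup a /24 slice of the assumed service CIDR 10.96.0.0/24.
	for i := 1; i < 255; i++ {
		ip := fmt.Sprintf("10.96.0.%d", i)
		if names, err := net.LookupAddr(ip); err == nil && len(names) > 0 {
			fmt.Println(ip, "->", names) // e.g. backend.payments.svc.cluster.local.
		}
	}

	// SRV records for a service name include its named ports.
	_, srvs, err := net.LookupSRV("", "", "backend.payments.svc.cluster.local")
	if err != nil {
		fmt.Println("SRV lookup failed:", err)
		return
	}
	for _, s := range srvs {
		fmt.Printf("port %d on %s\n", s.Port, s.Target)
	}
}
```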
## Fix the leak
### CoreDNS Firewall Plugin
* External plugin provided by the CoreDNS team
* Built-in expression engine with support for external policy engines
```mermaid
flowchart LR
req-->metadata
metadata-->firewall
firewall-->kube
kube-->|Adds namespace/client-namespace metadata|firewall
firewall-->|send nxdomain|metadata
metadata-->res
```
### Demo
* Firewall rule that only allows queries from the same namespace, kube-system, or default (Corefile sketch below)
* Every other cross-namespace request gets blocked
* Same SVC requests from before now return NXDOMAIN
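
A sketch of what the Corefile for this demo policy could look like; the firewall plugin must be compiled in as an external plugin, and the metadata label names (`kubernetes/client-namespace`, `kubernetes/namespace`) are assumptions to verify against the plugin docs:
```
# Sketch only: expression syntax and metadata labels should be checked
# against the coredns/policy firewall plugin documentation.
cluster.local {
    metadata
    firewall query {
        # Allow same-namespace queries plus anything targeting kube-system
        # or default, refuse everything else.
        allow [kubernetes/client-namespace] == [kubernetes/namespace]
        allow [kubernetes/namespace] == 'kube-system'
        allow [kubernetes/namespace] == 'default'
        refuse true
    }
    kubernetes {
        # 'pods verified' is needed so the client pod's namespace can be resolved
        pods verified
    }
}
```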
### Why is this a plugin and not default?
* Requires `pods verified` mode -> Puts a watch on pods and only returns a query result if the client pod actually exists
* Puts a watch on all pods -> Higher API load and CoreDNS memory usage
* Potential race conditions with initial lookups in larger clusters -> Alternative is to fail open (not really secure)
### Per tenant DNS
* Just run a CoreDNS instance for each tenant
* Use a mutating webhook to inject the right DNS config into each pod (pod spec sketch below)
* Pro: No more `pods verified` -> No more constant watch
* Limitation: Platform services still need a central CoreDNS
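
A sketch of what the webhook could inject into a tenant pod, with a placeholder ClusterIP for the tenant's own CoreDNS service:
```yaml
# Sketch only: the nameserver IP and search domains are placeholders for
# whatever the tenant's dedicated CoreDNS instance actually exposes.
apiVersion: v1
kind: Pod
metadata:
  name: demo-app
  namespace: tenant-a
spec:
  dnsPolicy: None              # ignore the cluster-wide DNS settings
  dnsConfig:
    nameservers:
      - 10.96.100.10           # ClusterIP of tenant-a's own CoreDNS service (made up)
    searches:
      - tenant-a.svc.cluster.local
      - svc.cluster.local
      - cluster.local
  containers:
    - name: app
      image: nginx:1.27
```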