---
title: Stop leaking Kubernetes service information via DNS
weight: 1
---
A talk by Google and Ivanti.
## Background
* RBAC is there to limit information access and control
* RBAC can be used to avoid interference in shared environments
* DNS is not covered by RBAC at all
### DNS in Kubernetes
* DNS info is always public -> No auth required
* Services are exposed to all clients (see the Go sketch below)
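
A minimal Go sketch of the point above, runnable from inside any pod; the service and namespace names (`backend`, `payments`) are made up for illustration:
```go
// Sketch only: from inside any pod, cluster DNS resolves other tenants'
// services without any authentication. Names below are hypothetical.
package main

import (
	"fmt"
	"net"
)

func main() {
	// A/AAAA lookup for a service the pod has no RBAC access to.
	ips, err := net.LookupHost("backend.payments.svc.cluster.local")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println("ClusterIP(s):", ips)
}
```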
## Isolation and Clusters
### Just don't share
* Especially for smaller, high-growth companies with infinite VC money
* Just give everyone their own cluster -> Problem solved
* Smaller companies (<1000) typically use many small clusters
### Shared Clusters
* Becomes important when cost is a concern and engineers don't have any platform knowledge
* A dedicated kube team can optimize hardware utilization and deliver updates fast -> Increased productivity by utilizing specialists
* Problem: Noisy neighbors and leaky DNS
## Leaks (demo)
### Base scenario
* Cluster with a bunch of deployments and services
* Creating a simple pod binds it to the default RBAC service account -> No access to anything
* Querying DNS info (aka services) still leaks everything (namespaces, services)
### Leak mechanics
* Leaks are based on the `<service>.<namespace>.svc.cluster.local` naming pattern
* You can also just reverse-lookup the entire service CIDR
* SRV records get created for each service, including the service ports (see the sketch below)
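
A hedged Go sketch of that enumeration, assuming a service CIDR of 10.96.0.0/24 and the same made-up `backend.payments` service; real values differ per cluster:
```go
// Sketch only: walk part of the (assumed) service CIDR with reverse
// lookups, then pull SRV records to learn the exposed ports.
package main

import (
	"fmt"
	"net"
)

func main() {
	// Reverse-lookup a /24 slice of the assumed service CIDR 10.96.0.0/24.
	for i := 1; i < 255; i++ {
		ip := fmt.Sprintf("10.96.0.%d", i)
		if names, err := net.LookupAddr(ip); err == nil && len(names) > 0 {
			fmt.Println(ip, "->", names) // e.g. backend.payments.svc.cluster.local.
		}
	}

	// SRV records for a service name include its named ports.
	_, srvs, err := net.LookupSRV("", "", "backend.payments.svc.cluster.local")
	if err != nil {
		fmt.Println("SRV lookup failed:", err)
		return
	}
	for _, s := range srvs {
		fmt.Printf("port %d on %s\n", s.Port, s.Target)
	}
}
```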
## Fix the leak
### CoreDNS Firewall Plugin
* External plugin provided by the CoreDNS team
* Built-in expression engine with support for external policy engines
```mermaid
flowchart LR
req-->metadata
metadata-->firewall
firewall-->kube
kube-->|Adds namespace/client-namespace metadata|firewall
firewall-->|send nxdomain|metadata
metadata-->res
```
### Demo
* Firewall rule that only allows queries from the same namespace, kube-system, or default (Corefile sketch below)
* Every other cross-namespace request gets blocked
* Same SVC requests from before now return NXDOMAIN
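
A sketch of what the Corefile for this demo policy could look like; the firewall plugin must be compiled in as an external plugin, and the metadata label names (`kubernetes/client-namespace`, `kubernetes/namespace`) are assumptions to verify against the plugin docs:
```
# Sketch only: expression syntax and metadata labels should be checked
# against the coredns/policy firewall plugin documentation.
cluster.local {
    metadata
    firewall query {
        # Allow same-namespace queries plus anything targeting kube-system
        # or default, refuse everything else.
        allow [kubernetes/client-namespace] == [kubernetes/namespace]
        allow [kubernetes/namespace] == 'kube-system'
        allow [kubernetes/namespace] == 'default'
        refuse true
    }
    kubernetes {
        # 'pods verified' is needed so the client pod's namespace can be resolved
        pods verified
    }
}
```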
### Why is this a plugin and not default?
* Requires `pods verified` mode -> Puts a watch on pods and only returns a query result if the client pod actually exists
* Puts a watch on all pods -> Higher API load and CoreDNS memory usage
* Potential race conditions with initial lookups in larger clusters -> Alternative is to fail open (not really secure)
### Per tenant DNS
* Just run a CoreDNS instance for each tenant
* Use a mutating webhook to inject the right DNS config into each pod (pod spec sketch below)
* Pro: No more `pods verified` -> No more constant watch
* Limitation: Platform services still need a central CoreDNS
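
A sketch of what the webhook could inject into a tenant pod, with a placeholder ClusterIP for the tenant's own CoreDNS service:
```yaml
# Sketch only: the nameserver IP and search domains are placeholders for
# whatever the tenant's dedicated CoreDNS instance actually exposes.
apiVersion: v1
kind: Pod
metadata:
  name: demo-app
  namespace: tenant-a
spec:
  dnsPolicy: None              # ignore the cluster-wide DNS settings
  dnsConfig:
    nameservers:
      - 10.96.100.10           # ClusterIP of tenant-a's own CoreDNS service (made up)
    searches:
      - tenant-a.svc.cluster.local
      - svc.cluster.local
      - cluster.local
  containers:
    - name: app
      image: nginx:1.27
```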