Fun With NodeLocal DNSCache in Kubernetes Clusters

Martin Beranek
3 min read · Nov 5, 2023

DALL·E: NodeLocal DNSCache in Kubernetes Clusters

This is going to be a short one. Have you ever tried to use DNS for load balancing? I am not talking about the fancy weighted or latency-based setups, just setting multiple records for the same domain name. This can, in some cases, work. The DNS resolver gives you all the records for the domain name, and it does not guarantee that they will come back in the same order on every request. Hence connections can start on any of the IP addresses behind the same domain, more or less at random.

Let’s suppose we have a rabbitmq service with multiple nodes, each reachable via the same domain name rabbit-cluster.example.com. The host command returned:

$ host rabbit-cluster.example.com
rabbit-cluster.example.com has address 10.101.129.91
rabbit-cluster.example.com has address 10.101.130.133
rabbit-cluster.example.com has address 10.101.128.113

If you executed the command multiple times, the records should have come back in a different order each time.

NodeLocal DNSCache Improvement

Our applications are written in PHP. Due to the configuration and behavior of PHP itself, DNS queries are not cached at the application level; every query has to go to the resolver and back. We felt that using NodeLocal DNSCache improved the overall performance and stability of our applications. Since we had set up the rabbitmq cluster a few years earlier, the issue of multiple IP addresses behind the same domain name never crossed our minds.
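
If you are not sure whether your cluster runs the cache at all, a quick check is to look for its DaemonSet. A minimal sketch, assuming the standard add-on name node-local-dns in the kube-system namespace:

# List the NodeLocal DNSCache DaemonSet and its pods (standard add-on naming;
# adjust the namespace and label if your cluster deploys it differently)
kubectl -n kube-system get daemonset node-local-dns
kubectl -n kube-system get pods -l k8s-app=node-local-dns -o wide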

Can you guess when we noticed? Yes, when the third node of the rabbitmq cluster started to fail. Only then did we check the connection counts and discover the imbalance. We used this simple script (thanks, ChatGPT, for the help) to get a histogram of the record order:

#!/bin/bash

declare -A results
declare -A hostnames

# Define hostname translations
hostnames["10.101.129.91"]="rabbit-b.example.com"
hostnames["10.101.130.133"]="rabbit-c.example.com"
hostnames["10.101.128.113"]="rabbit-d.example.com"

# Initialize the results array: one counter per IP address and position
for i in {1..3}; do
    for ip in "10.101.129.91" "10.101.130.133" "10.101.128.113"; do
        results["$ip-$i"]=0
    done
done

# Query the DNS 1000 times and count at which position each IP address appears
for i in {1..1000}; do
    output=($(host rabbit-cluster.example.com | grep "has address" | awk '{print $NF}'))
    for idx in "${!output[@]}"; do
        pos=$((idx + 1))
        key="${output[$idx]}-$pos"
        results["$key"]=$(( ${results["$key"]} + 1 ))
    done
done

# Print the results
for ip in "10.101.129.91" "10.101.130.133" "10.101.128.113"; do
    for i in {1..3}; do
        echo "${hostnames[$ip]} ($ip) was $i. for ${results["$ip-$i"]} times"
    done
done

Running the script from a random pod in the K8s cluster showed the following numbers:

rabbit-b.example.com (10.101.129.91) was 1. for 0 times
rabbit-b.example.com (10.101.129.91) was 2. for 0 times
rabbit-b.example.com (10.101.129.91) was 3. for 1000 times
rabbit-c.example.com (10.101.130.133) was 1. for 0 times
rabbit-c.example.com (10.101.130.133) was 2. for 1000 times
rabbit-c.example.com (10.101.130.133) was 3. for 0 times
rabbit-d.example.com (10.101.128.113) was 1. for 1000 times
rabbit-d.example.com (10.101.128.113) was 2. for 0 times
rabbit-d.example.com (10.101.128.113) was 3. for 0 times

Poor little node D. Every query we made during the test ended up with rabbit-d in first place. The results were quite different when the test was executed from an instance outside of Kubernetes:

rabbit-b.example.com (10.101.129.91) was 1. for 283 times
rabbit-b.example.com (10.101.129.91) was 2. for 367 times
rabbit-b.example.com (10.101.129.91) was 3. for 350 times
rabbit-c.example.com (10.101.130.133) was 1. for 306 times
rabbit-c.example.com (10.101.130.133) was 2. for 361 times
rabbit-c.example.com (10.101.130.133) was 3. for 333 times
rabbit-d.example.com (10.101.128.113) was 1. for 411 times
rabbit-d.example.com (10.101.128.113) was 2. for 272 times
rabbit-d.example.com (10.101.128.113) was 3. for 317 times
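
If you want to double-check that the fixed ordering inside the cluster really comes from the node-local cache, you can query it directly from a pod that has dig installed. A minimal sketch, assuming the commonly used NodeLocal DNSCache listen address 169.254.20.10:

# Ask the node-local cache directly a few times and check whether the order
# of the returned records ever changes (169.254.20.10 is the usual NodeLocal
# DNSCache address; adjust it if your cluster uses a different one)
for i in {1..5}; do
    dig +short rabbit-cluster.example.com @169.254.20.10
    echo "---"
done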

How To Fix This?

First, do not turn NodeLocal DNSCache off! Don’t even change its default configuration. The issue should be fixed by introducing a proper network load balancer. Once we migrated the DNS record to the load balancer, the connection count on node D started to drop.
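
The migration itself is just a DNS change. A minimal sketch using the AWS CLI, assuming a Route 53 hosted zone (the zone ID is a placeholder) and the load balancer from the output below:

# Point rabbit-cluster.example.com at the network load balancer instead of the
# individual node IPs (replace the hosted zone ID with your own)
aws route53 change-resource-record-sets \
    --hosted-zone-id ZXXXXXXXXXXXXX \
    --change-batch '{
      "Changes": [{
        "Action": "UPSERT",
        "ResourceRecordSet": {
          "Name": "rabbit-cluster.example.com",
          "Type": "CNAME",
          "TTL": 60,
          "ResourceRecords": [{"Value": "rabbit-nlb-2154e8ee40bbab24.elb.us-east-1.amazonaws.com"}]
        }
      }]
    }'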

After the migration, this should be the only result when you query DNS for rabbit-cluster.example.com:

$ host rabbit-cluster.example.com
rabbit-cluster.example.com is an alias for rabbit-nlb-2154e8ee40bbab24.elb.us-east-1.amazonaws.com.
rabbit-nlb-2154e8ee40bbab24.elb.us-east-1.amazonaws.com has address 10.101.130.22
rabbit-nlb-2154e8ee40bbab24.elb.us-east-1.amazonaws.com has address 10.101.129.109
rabbit-nlb-2154e8ee40bbab24.elb.us-east-1.amazonaws.com has address 10.101.128.156

And that’s it for today. Hopefully, this helps you debug similar issues of your own.


Martin Beranek

I am an Infra Team Lead at Shipmonk. My main interest is Terraform, mostly in GCP. I am also enthusiastic about backend and related topics: Golang, TypeScript, ...