It's Always DNS

I bought an Ikea shelf (Bekant) to house the stormlight NUCs and my Synology NAS. I shut down the entire cluster so that I could move the machines into the shelf. I built the shelf. I placed all the machines into the shelf and meticulously looped cables behind the shelf and plugged them into a powered-off surge protector. I hit the power switch on the surge protector and booted everything.

Since the NUCs boot from NVMe disks, they came up almost instantly. Fatty, unfortunately, lagged behind.

I checked the kubernetes dashboard for the cluster and noticed that the docker registry and the nfs-client-provisioner services were down. They could not reach fatty (the Synology NAS). Makes sense: fatty is old and has four spinning drives it needs to validate.

Once fatty was available, I restarted the broken pods in stormlight. No luck. The pods were still down.

I looked into the error and realized they were failing to connect to https://fatty.stormlight.home.

Mutha F'er. It's always F'n DNS.

This is what I get for making my router's primary DNS server fatty.

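While fatty was offline, my router fell back to the secondary resolver, which has no record for fatty.stormlight.home. When the NUCs came online and started querying fatty.stormlight.home, they got NXDOMAIN from the secondary resolver and cached it. That cached answer stuck around after fatty finished booting. And that's my problem.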

The fix? Flush the DNS cache and restart.

I flushed the DNS caches with ansible (ansible -i hosts all --become -a 'systemd-resolve --flush-caches') and then verified that the NUCs could resolve fatty's hostname. Afterwards, I checked the kubernetes dashboard and everything had come back online.
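If you want something more repeatable than eyeballing it, here is a rough sketch of a check you could run on each NUC after the flush. It is not what I actually ran. The names resolves and wait_for_dns and the DELAY variable are made up for this example, and it assumes getent is available and that lookups go through the same resolver the pods' host uses.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if the hostname resolves through the
# system resolver (getent goes through NSS, so it sees what the host sees).
resolves() {
    getent hosts "$1" >/dev/null 2>&1
}

# Hypothetical helper: retry the lookup a few times before giving up.
# $1 = hostname, $2 = number of attempts (defaults to 5).
# DELAY (seconds between attempts) defaults to 1.
wait_for_dns() {
    host="$1"
    tries="${2:-5}"
    i=1
    while [ "$i" -le "$tries" ]; do
        if resolves "$host"; then
            echo "resolved $host"
            return 0
        fi
        i=$((i + 1))
        sleep "${DELAY:-1}"
    done
    echo "still failing: $host" >&2
    return 1
}
```

Calling wait_for_dns fatty.stormlight.home 10 right after the flush would tell you whether the stale NXDOMAIN is really gone before you bother restarting any pods.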

Anyways, here's what my corner looks like now: