Stormlight: My Intel NUC Kubernetes Cluster

When I began building a Kubernetes cluster in December 2019, I didn't have a great plan. I wanted to program in Go, I wanted to learn Kubernetes, and I definitely wanted Intel NUCs. As I researched technical decision after technical decision, I eventually arrived at a list of desires for the cluster. First, I had to have a name. I named the cluster Stormlight.

Second, I crafted user stories I wanted for myself.

  1. I want to access my cluster services at subdomains of stormlight.home so that I don't have to remember IP addresses and port numbers.
  2. I want a simple (to me) deployment system that doesn't require touching DNS configuration, storage configuration, and TLS certificate configuration for each service I deploy. If I do have to configure these components for a service, the settings should live within the Kubernetes manifests.
  3. When I open cluster HTTP services in Chrome, I want a lock icon in the URL bar. I don't want to see the "Your connection is not private" warning and then click the "Proceed to ..." link. These annoy me.

With these out of the way, let's dive into the components that make up Stormlight.

Physical Hardware

Obviously, I'm using Intel NUCs. But, there are a few other devices on the network that help fulfill my needs. Here are the machines, their names, and some specs.

  • lightweaver
    • Kubernetes master
    • Intel NUC 8i3BEK M.2 SSD
    • 32GB RAM
    • 250GB SSD
  • skybreaker
    • Kubernetes worker
    • Intel NUC 8i3BEK M.2 SSD
    • 32GB RAM
    • 250GB SSD
  • windrunner
    • Kubernetes worker
    • Intel NUC 8i3BEK M.2 SSD
    • 64GB RAM (I got lucky here! Amazon shipped me 64GB instead of the originally purchased 32GB!)
    • 250GB SSD
  • fatty
    • Synology DS413j
    • 8TB storage
    • Some pathetic amount of CPU and RAM. This thing is old, slow, and still works. I can't really complain.
  • TRENDnet 8-Port Gigabit GREENnet Switch
  • Netgear Nighthawk Wifi Router

My home's network diagram looks like this.

Network Diagram

Every machine gets a static IP address, reserved on the wifi router. For the NUCs, I assign the address while installing the operating system. There's likely a simpler way to do this, but it worked, and I only had to do it three times.
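
For reference, a static address on Ubuntu 18.04 Server lives in a netplan file. Here's a minimal sketch with hypothetical addresses and an assumed interface name (eno1), not my exact values:

# /etc/netplan/01-netcfg.yaml
network:
  version: 2
  ethernets:
    eno1:
      dhcp4: no
      addresses: [192.168.1.21/24]
      gateway4: 192.168.1.1
      nameservers:
        addresses: [192.168.1.1]  # the router, which queries fatty's DNS first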

Software

  • Ubuntu 18.04 Server
  • DNS Server
    • Runs on fatty using Synology's DNS server package
    • Hosts the private zone stormlight.home
    • All machines have hostnames defined to make SSH easier
    • *.stormlight.home record points to lightweaver (more on this later)
  • NFS Server
  • Certificate Authority (self-hosted) for SSL certificate signing
    • Root CA for creating intermediate CAs
    • Intermediate CA for signing server certs
    • Both Root and Intermediate certs are installed on my laptop and all cluster machines
  • HAProxy
    • All traffic to Stormlight is directed here (via the *.stormlight.home DNS record above)
    • Runs on lightweaver
    • Uses a wildcard cert for *.stormlight.home giving me a nice 🔒 icon in Chrome
    • Terminates SSL traffic
    • Proxies traffic to the local kubernetes ingress (see kubernetes configuration below)
  • Kubernetes v1.17
    • Installed with kubeadm
    • Uses a single host as the master (lightweaver)
    • Uses kubernetes self-signed certs (I built the CA after I set up kubernetes, so I didn't use my own CA at the time).

stormlight.home Domain

Kubernetes relies on load balancers, in the cloud or on-premises, to handle ingress traffic. For Stormlight, I could deploy services using NodePorts, and I would be able to access a service at <ip address of any NUC>:<NodePort>. But I find this inelegant. I want a domain name for Stormlight.

Therefore, I looked at using DNS. Initially, I wanted to register a public domain. But that costs money, and I'm cheap. So I decided on the stormlight.home private domain. It's not entirely clear to me that the .home TLD is suitable for private use, but I'm comfortable dealing with that in the future.

stormlight.home is served by the DNS server running on fatty. My home router is configured to query fatty first before falling back to 1.1.1.1. Therefore, stormlight.home resolves whenever I'm connected to my home network.

stormlight.home has a handful of configured DNS records. Every machine in Stormlight has an entry, which keeps me from saving IP addresses in my SSH config to connect to my NUC machines. Aside from machine records, the DNS server has a wildcard record handling all other subdomains. This is how I send traffic to Stormlight.
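
Conceptually, the zone boils down to a handful of records like these (the IP addresses are hypothetical, and Synology's DNS Server package manages the actual zone through its UI):

fatty.stormlight.home.        IN A  192.168.1.20
lightweaver.stormlight.home.  IN A  192.168.1.21
skybreaker.stormlight.home.   IN A  192.168.1.22
windrunner.stormlight.home.   IN A  192.168.1.23
*.stormlight.home.            IN A  192.168.1.21  ; everything else goes to lightweaver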

Ingress Traffic to Stormlight

Running on lightweaver's port 443 is HAProxy, which lets traffic into Stormlight via the *.stormlight.home subdomains. HAProxy terminates SSL (user stories #2 and #3) and forwards traffic locally to the nginx-ingress NodePort service running in Kubernetes.
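
To give a sense of the shape, here's a minimal sketch of that HAProxy configuration; the certificate path and the NodePort number are assumptions, not my exact values:

frontend stormlight-https
    # terminate SSL with the *.stormlight.home wildcard cert
    bind *:443 ssl crt /etc/haproxy/certs/stormlight.home.pem
    default_backend nginx-ingress

backend nginx-ingress
    # forward plain HTTP to the nginx-ingress NodePort on this machine
    server ingress 127.0.0.1:30080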

Tracing an HTTPS Request

Let's move our attention to a simplified HTTPS request for the fictional service mysvc. Below is a diagram of an HTTPS request into Stormlight. I simplified the Kubernetes side to make the diagram easier to comprehend. In reality, I remove the master taint so that the Kubernetes scheduler can place pods on all NUCs, including lightweaver. So, theoretically, traffic could stay entirely on lightweaver if mysvc pods happened to be running there.
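
For reference, removing the master taint is a one-liner; on Kubernetes 1.17, it looks roughly like this:

kubectl taint nodes --all node-role.kubernetes.io/master-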

HTTPS Request Tracing
  1. A user requests https://mysvc.stormlight.home. This resolves to lightweaver's IP address because of the wildcard *.stormlight.home DNS record on my Synology NAS.
  2. The request routes to lightweaver's HAProxy.
  3. HAProxy terminates the SSL request.
  4. HAProxy forwards the HTTP request to the local kubernetes cluster's nginx ingress port.
  5. nginx ingress forwards to the mysvc kubernetes service.
  6. mysvc processes the request by forwarding to whatever deployment/replica/pods are running within the cluster.
  7. mysvc sends the response back to the nginx ingress.
  8. nginx ingress responds to HAProxy.
  9. HAProxy responds to the user.
  10. Hopefully, the user is happy. The user is me. I am happy.

Kubernetes

Let's move our attention to Kubernetes. Aside from nginx-ingress, there are a few other services.

Here's what runs on Stormlight, covered section by section below.

nginx-ingress

Stormlight uses nginx-ingress to route all HTTP traffic into the cluster. This really helps me with user story #2. I can configure the subdomain/path of a service running in Stormlight simply by creating an Ingress resource. I don't have to configure anything in fatty's DNS server or in HAProxy. I really dig this setup.

For example, here's a basic configuration for kuard (a handy debugging application) that uses https://kuard.stormlight.home as its domain.

---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: ingress-kuard
  annotations:
    kubernetes.io/ingress.class: "nginx"
spec:
  rules:
  # I can set this to whatever I want
  - host: kuard.stormlight.home
    http:
      paths:
      - path: /
        backend:
          serviceName: kuard
          servicePort: 80

nfs-client

A part of user story #2 deals with storage. While I could configure services to use local storage, I wanted to use fatty as well. Data on fatty has better durability (four 2TB drives) and is configured for cloud backups (I set that up a long time ago). I want to use local storage for performance, but anything important should be stored on fatty.

I dug into the Kubernetes docs on storage options and ran around in circles. Should I use Volumes, PersistentVolumes, or Container Storage Interface plugins? After several days of reading, I landed on nfs-client from the external-storage GitHub repo. To add to my initial confusion, external-storage states that the repository is deprecated and that I should use sig-storage-lib-external-provisioner instead. But the sig-storage-lib-external-provisioner page links back to external-storage for examples. Sigh. Luckily, nfs-client worked well and was easy to set up.

Here's how I configured nfs-client.

nfs-client creates a dynamic provisioner for fatty's NFS shares. With the provisioner, I expose two storage classes. The first one, called fatty-archives, archives the data when a PersistentVolumeClaim (PVC) is deleted. Therefore, I don't have to worry about losing data when I muck around with the cluster and accidentally delete PVCs.

The second storage class, called fatty, deletes data when a PVC is removed. Honestly, I don't have much use for the fatty storage class yet, but it enables scaling pods across nodes while using the NFS mount for shared data.
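
As a rough sketch, the two storage classes look something like the manifests below, following the external-storage nfs-client example. The provisioner name must match whatever the nfs-client provisioner deployment registers, so treat these values as assumptions rather than my exact setup.

---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fatty-archives
provisioner: fuseim.pri/ifs   # default name from the nfs-client example
parameters:
  archiveOnDelete: "true"     # keep the data when a PVC is deleted
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fatty
provisioner: fuseim.pri/ifs
parameters:
  archiveOnDelete: "false"    # remove the data along with the PVC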

Here's the configuration I use for my docker registry setup.

---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: registry-data
  annotations:
    volume.beta.kubernetes.io/storage-class: "fatty-archives"
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100G
---

apiVersion: apps/v1
kind: Deployment
metadata:
  name: registry
spec:
  selector:
    matchLabels:
      app: registry
  replicas: 1
  template:
    metadata:
      labels:
        app: registry
    spec:
      containers:
        - name: registry
          image: registry
          imagePullPolicy: Always
          ports:
            - containerPort: 5000
          volumeMounts:
            - name: registry-data
              mountPath: /var/lib/registry
      volumes:
        - name: registry-data
          persistentVolumeClaim:
            claimName: registry-data

If you look carefully, you don't see fatty's hostname or mount paths in this configuration. The NFS configuration is managed in a single place: the nfs-client manifests. From a service perspective, there's no dependency on NFS. The service depends on a PVC (i.e., a storage request) and the volume configuration for the Deployment. In the future, if I decide to replace fatty with a new NAS (I'm due for an upgrade), the migration amounts to configuring a new storage class, moving data from the old NAS to the new one, and updating the PVCs for my services. I don't have to find all the places where the NFS information is configured in service manifests. Lovely!

Docker Registry

One of the biggest reasons I wanted durable network storage was to run a Docker registry. Remember, I'm cheap. So paying for a registry was not something I wanted to do. I also didn't want to store the data on a single node. If I wanted to scale up the number of pods running the registry, I'd like to do so without worrying about where the data lives.

So Stormlight runs Docker's Registry at registry.stormlight.home. The images are stored on fatty through the nfs-client provisioner.
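
The PVC and Deployment appeared in the nfs-client section above; to round out the picture, here's roughly what the accompanying Service and Ingress look like. The names and ports are assumptions that mirror the kuard example later in this post.

---
apiVersion: v1
kind: Service
metadata:
  name: registry
spec:
  type: NodePort
  ports:
  - port: 80
    targetPort: 5000
    protocol: TCP
  selector:
    app: registry
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: ingress-registry
  annotations:
    kubernetes.io/ingress.class: "nginx"
spec:
  rules:
  - host: registry.stormlight.home
    http:
      paths:
      - path: /
        backend:
          serviceName: registry
          servicePort: 80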

Master Component Backups

I worry about failures. I work with cloud providers at my day job, so failures are common and expected. At home, my computers fail far less than cloud instances (case in point: fatty). But, they will fail. And that makes me nervous. So, I made some contingency plans.

Kubernetes depends on etcd as its backing database. In a single-master configuration, there's one etcd instance. Failure of etcd renders the master services unusable. Services running on the cluster continue to run, but if a service pod dies, the master is unable to provision a replacement. Therefore, backing up etcd is useful.

I found this nice post on backing up the master. I tweaked the script so that it also backs up Kubernetes' self-signed certs. This backup runs every hour and stores data on fatty. The restoration process is scripted with Ansible and is very similar to what's described in the linked blog post.
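
At its core, the hourly backup is an etcd snapshot. Here's a minimal sketch of that step, assuming the default kubeadm certificate paths and a hypothetical backup directory (the real script also copies the certs and ships everything to fatty):

ETCDCTL_API=3 etcdctl snapshot save /backups/etcd-$(date +%Y%m%d%H%M).db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key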

This plan leaves me a little less nervous, but it doesn't cover a full lightweaver failure. If lightweaver fails completely, the entire cluster is unusable, and all *.stormlight.home subdomains fail because the machine is offline.

Even writing this makes me nervous, but I think I'll deal with the failure in the future. I've built Stormlight using code. There are no manual steps. So, if I lose the cluster, I can rebuild by relying on my code. I'll open source my code in a future post.

That said, I might buy a few Raspberry Pis to build a multi-master cluster later. That could be another fun side project.

The Results: My Deploy Process

With all the above in place, I can begin programming my own services (about damn time). To deploy, all I need is a single manifest file with a Deployment, a Service, and an Ingress. Here's what kuard's manifest file looks like:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kuard
spec:
  selector:
    matchLabels:
      app: kuard
  replicas: 1
  template:
    metadata:
      labels:
        app: kuard
    spec:
      containers:
      - image: gcr.io/kuar-demo/kuard-amd64:1
        imagePullPolicy: Always
        name: kuard
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: kuard
spec:
  type: NodePort
  ports:
  - port: 80
    targetPort: 8080
    protocol: TCP
  selector:
    app: kuard

---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: ingress-kuard
  annotations:
    # use the shared ingress-nginx
    kubernetes.io/ingress.class: "nginx"
spec:
  rules:
  - host: kuard.stormlight.home
    http:
      paths:
      - path: /
        backend:
          serviceName: kuard
          servicePort: 80

With the above file, I run kubectl apply -f kuard.yml and then visit https://kuard.stormlight.home. That's my whole deploy process.

And if I need durable storage? I can use the fatty and fatty-archives storage classes, or hook into a NUC's local disk.

Onwards

That's Stormlight! I'll share my code in future posts. I need to spend some time cleaning up the code to make it a little easier to use first.