Fixing my Homelab Networking Mess

Imagine this: it's Valentine's Day and, instead of having anything special on the schedule, you decide "let's fix this one issue on my homelab which has been annoying everyone since forever!". Well, you're reading my blog, so you probably don't even have to imagine it. This blog post is less about the fix and more a description of the mess I made in my infancy of doing Kubernetes.

My Problem

Two years back (so a long time ago) I had just started with homelabbing and with Kubernetes. I had just hosted my Ghost website on my K3s cluster and was facing the challenge of getting cert-manager working with Traefik. After a couple of days without TLS, I gave up. My friend Bruno was telling me about Caddy at the time, with its easy Let's Encrypt integration and easy configuration. It was tempting, so I pivoted: I deployed a Caddy pod to my cluster, gave it a VIP, and port-forwarded ports 80 and 443 to it. The configuration was simple, looking like this:

alexbissessur.dev {
  tls mymain@gmail.com
  reverse_proxy ghost.ghost.svc.cluster.local:80
}

Little me had not exactly planned for scaling; after all, it was my first time ever doing any kind of hosting, and I didn't have grand plans for my cluster. Of course, this has changed since: the Caddyfile is now easily 100 lines long, and I have to edit it every time I deploy something new.

On top of that, since I have only one public IP, it was basically reserved for Caddy. So when I needed to use Traefik for a project I'm working on, I had to find an insane workaround to make it work.

I basically rented another public IP in the form of a VPS, and did some silly levels of port forwarding to have public IPs for both Traefik and Caddy. Every extra hop like this makes troubleshooting much harder. I had to ask a friend for help with this because Traefik's embedded "cert-manager" (its built-in ACME client) was returning weird errors with ACME haha.

Another complication was that I split IPv4 and IPv6 on the VPS, with v4 going to Traefik and v6 going to Caddy, so that this website would be reachable over IPv6 (there are no IPv6 addresses on residential plans). Phew.

Fixing The Mess

Two years later and I'm still incapable of getting cert-manager to work with Traefik, which is really saying a lot about how I've progressed in two years :P. Because of this, I kept using Traefik's own cert issuer. I started by shifting the DNS entries on Servfail to point to the VPS rather than my home IP. Then I translated each Caddy entry into an IngressRoute, which is a Traefik CRD.
The IngressRoute for this website, for example, looks like:

apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: ghostroute
spec:
  entryPoints:
  - websecure
  - web
  routes:
  - kind: Rule
    match: Host(`alexbissessur.dev`)
    services:
    - name: ghost-svc
      port: 80
  tls:
    certResolver: le
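
For context, the le in certResolver refers to a certificate resolver defined in Traefik's static configuration. A minimal sketch of such a resolver follows, assuming the HTTP-01 challenge on the web entrypoint. The email address and storage path are placeholders, and none of these values come from the actual setup:

```yaml
# Traefik static configuration (sketch; all values are placeholders)
certificatesResolvers:
  le:                            # name referenced by tls.certResolver in the IngressRoute
    acme:
      email: admin@example.com   # placeholder contact address for Let's Encrypt
      storage: /data/acme.json   # assumed path; should live on persistent storage
      httpChallenge:
        entryPoint: web          # assumes "web" is reachable from the internet on port 80
```

The storage file is where Traefik keeps the issued certificates, so if it is lost on every pod restart, Traefik has to request them again.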

After crossing some 20 (sub)domains off my list, I have everything running with both IPv4 and IPv6 and passing through Traefik. The next thing is to get rid of the port forwarding on 1080/1443 and use 80/443 like a normal person since I no longer need to split traffic to Traefik and Caddy.

Which is great timing: I have finally moved to ingresses just as the Kubernetes community is pushing everyone away from them in favour of the Gateway API.
Well, I'll leave that migration for some other day :)
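
For whenever that day comes, here is a rough, untested sketch of how the Ghost route above might look as a Gateway API HTTPRoute. The Gateway name traefik-gateway is hypothetical, and TLS would be configured on the Gateway's listener rather than on the route itself:

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: ghostroute
spec:
  parentRefs:
  - name: traefik-gateway        # hypothetical Gateway that owns the 80/443 listeners
  hostnames:
  - alexbissessur.dev
  rules:
  - backendRefs:
    - name: ghost-svc            # same Service the IngressRoute points at
      port: 80
```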