The Problem: Six Interfaces for One Question # “Is anything broken in my homelab?”
Answering that question used to mean: SSH into Proxmox to check guest status. Curl the Pi-hole API for DNS health. Open Grafana to scan Prometheus alerts. Check Graylog for error spikes. Look at Semaphore for failed automation runs. Glance at Caddy logs for 502s.
The 2 AM Wake-Up Call # I woke up to find my CI/CD platform had been down for 8 hours. Semaphore, the Ansible automation engine that manages my entire homelab, was stuck in a crash loop:
1 2 3 /usr/local/bin/server-wrapper: line 295: syntax error: unexpected "&&" /usr/local/bin/server-wrapper: line 295: syntax error: unexpected "&&" /usr/local/bin/server-wrapper: line 295: syntax error: unexpected "&&" The same error, repeating every few seconds. The container would start, hit the broken entrypoint script, crash, and restart. Endlessly.
The Problem # My Caddy reverse proxy runs as an HA pair – two nodes behind a keepalived VIP. Every service in the homelab gets its traffic through this pair. The setup works great, except for one recurring failure mode: config drift.
The deployment process was manual: edit the Caddy site config in git, SCP it to both nodes, validate, reload. The “both nodes” part is where things break down. It’s easy to deploy to caddy1, test it, see it working, and then forget caddy2 exists. Until keepalived fails over and suddenly half your sites return 502s because the backup node has last week’s config.
The Problem # My PAN-OS firewall (GlobalProtect VPN portal at vpn.mareoxlan.com) needs a valid TLS certificate. I had a dedicated LXC (30122) running acme.sh with a Cloudflare DNS-01 challenge to issue a wildcard cert, then a PAN-OS deploy hook to push it to the firewall via the XML API. It worked, but it was a single-purpose VM doing the same job my Caddy reverse proxy already does – Caddy auto-renews *.mareoxlan.com via the same Cloudflare DNS-01 mechanism.
Overview # If you’re running SSL decryption on a Palo Alto firewall, you’ve probably hit this: a user reports they can’t access a website, and it turns out the site’s CA certificate isn’t in your firewall’s trusted root store. PAN-OS only updates its built-in root store on major software releases, which means between upgrades your firewall’s trust anchors slowly go stale.
Overview # After multiple outages caused by configuration drift between two HA Caddy reverse proxy nodes, I built a GitOps pipeline that automatically deploys configs to both nodes whenever changes are pushed to the main branch. Config drift is now impossible by design.
The problem: Two Caddy nodes in a keepalived HA pair need identical configs. Forgetting to deploy to the second node after editing a site config caused service outages — twice in the same week.
What Changed # Added Ansible playbooks to Semaphore for automated Proxmox cluster power management:
Night Sleep: Gracefully shuts down non-essential VMs/LXCs at night Day On: Wakes up the cluster in the morning Scheduled via Semaphore cron Why # Running all VMs 24/7 wastes power when they’re not needed. Automated scheduling reduces energy costs and wear on hardware.
What Changed # Added three Ansible playbooks to Semaphore for managing Caddy reverse proxy domains:
Add Domain: Creates new reverse proxy entry on both HA nodes Remove Domain: Removes domain from both nodes List Domains: Shows all configured domains across the cluster Why # Manual Caddy config edits were error-prone and required SSH to both nodes. Semaphore templates provide a UI for common operations with built-in validation.