I was paying $25/month across three monitoring services for the same thing: knowing when my homelab services go down. Better Uptime ($5), UptimeRobot ($8), and Grafana Cloud ($12 for metrics retention).
I replaced all three with a single Docker Compose stack running on the same ThinkCentre it’s monitoring. Three months in, it’s caught 14 outages, alerted me on all of them, and costs exactly $0 extra.
The Stack
Four containers, about 512MB of RAM in total at idle:
```yaml
services:
  uptime-kuma:
    image: louislam/uptime-kuma:latest
    ports: ["3001:3001"]
    volumes: ["./uptime-kuma:/app/data"]
    restart: unless-stopped

  prometheus:
    image: prom/prometheus:latest
    ports: ["9090:9090"]
    volumes: ["./prometheus.yml:/etc/prometheus/prometheus.yml"]
    restart: unless-stopped

  grafana:
    image: grafana/grafana:latest
    ports: ["3000:3000"]
    environment:
      - GF_INSTALL_PLUGINS=grafana-piechart-panel
    volumes: ["./grafana:/var/lib/grafana"]
    restart: unless-stopped

  node-exporter:
    image: prom/node-exporter:latest
    # Host networking already exposes port 9100 on the host, so no ports:
    # mapping is needed (port mappings do not apply in host network mode).
    network_mode: host
    restart: unless-stopped
```
What Each Piece Does
**Uptime Kuma** — the replacement for Better Uptime and UptimeRobot. It pings 12 endpoints every 60 seconds: blog, ERP staging, Ollama, Open WebUI, PostgreSQL, Redis, CI runner, and five internal services. If something goes down, it sends notifications via Gotify (self-hosted push notifications) and Telegram.
**Prometheus + Node Exporter** replace Grafana Cloud. Node Exporter exposes system metrics (CPU, RAM, disk, network), and Prometheus scrapes them every 15 seconds. Prometheus keeps 30 days of data in about 8GB of disk space.
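A minimal `prometheus.yml` sketch for this layout could look like the following. The job names and the `host.docker.internal` target are assumptions for illustration, not taken from my actual config. Because Node Exporter runs with host networking, the Prometheus container has to reach it through the host, which on Linux typically means adding `extra_hosts: ["host.docker.internal:host-gateway"]` to the `prometheus` service.

```yaml
# prometheus.yml (sketch; job names and targets are assumptions)
global:
  scrape_interval: 15s        # how often targets are scraped
  evaluation_interval: 15s    # how often rules are evaluated

scrape_configs:
  - job_name: prometheus      # Prometheus scraping its own metrics
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: node            # host metrics exposed by Node Exporter
    static_configs:
      # Assumes the prometheus service defines
      #   extra_hosts: ["host.docker.internal:host-gateway"]
      # so the container can reach the host-networked exporter on port 9100.
      - targets: ["host.docker.internal:9100"]
```

If Node Exporter were attached to the Compose network instead of the host network, the target would simply be `node-exporter:9100`.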
**Grafana** — the dashboard. I rebuilt three dashboards from scratch:
1. **Service Overview** — status of all 12 endpoints, uptime percentages, response times
2. **System Health** — CPU/memory/disk trends, top processes, network I/O
3. **Docker Stats** — per-container resource usage, restart counts, image sizes
What I Gave Up
- **SMS alerts.** Better Uptime’s SMS alerting was great for critical outages. Uptime Kuma doesn’t do SMS without Twilio. I use Telegram + Gotify instead, which are free but need a working internet connection.
- **99.99% uptime SLA.** The monitoring stack runs on the same machine it monitors. If the machine dies, the monitor dies too. I solved this with a $3/month VPS running a single ping-only check: it pings the homelab IP and texts me if it’s unreachable.
- **Beautiful default dashboards.** Grafana Cloud’s dashboards look better out of the box. My self-hosted ones are functional, but they took me 4 hours to set up.
The Numbers
| Service | Old Monthly Cost | My Cost Now | Savings |
|---|---|---|---|
| Better Uptime | $5 | $0 | $5 |
| UptimeRobot | $8 | $0 | $8 |
| Grafana Cloud | $12 | $0 | $12 |
| $3 VPS (failover) | $0 | $3 | -$3 |
| **Total** | **$25** | **$3** | **$22/month** |
That’s $264/year saved. The setup took 3 hours. Breakeven was at about 4 months — and I crossed it two months ago.
What Broke (And How I Fixed It)
**Break #1: Database file corruption.** Uptime Kuma uses SQLite. After a power outage, the DB was corrupted and lost 3 days of monitoring history. Fix: add a `sqlite3 .backup` cron job that snapshots the DB every 6 hours.
**Break #2: Grafana forgot all my dashboards.** I updated Grafana from v10 to v11 and the dashboard JSON format changed. Dashboards stayed in the SQLite DB but Grafana couldn’t parse them. Fix: always pin the Grafana version (`image: grafana/grafana:10.4.0`) instead of using `latest`.
**Break #3: Prometheus disk filled up.** After 60 days, Prometheus had consumed 22GB. The `retention` setting was defaulting to 15 days when I thought it was 30. Fix: explicit config:
```yaml
global:
  scrape_interval: 15s
  evaluation_interval: 15s
storage:
  tsdb:
    retention:
      time: 30d
      size: 10GB
```
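Retention has long been configurable through Prometheus command-line flags, and older releases do not read it from `prometheus.yml`. If your version rejects the block above, a flags-based sketch in the Compose file is an alternative. This is an illustration under that assumption, not a copy of my running config:

```yaml
services:
  prometheus:
    image: prom/prometheus:latest
    ports: ["9090:9090"]
    volumes: ["./prometheus.yml:/etc/prometheus/prometheus.yml"]
    command:
      # Setting command replaces the image's default arguments,
      # so the config file and data path are restated first.
      - --config.file=/etc/prometheus/prometheus.yml
      - --storage.tsdb.path=/prometheus
      # Retention limits: whichever is reached first removes the oldest data.
      - --storage.tsdb.retention.time=30d
      - --storage.tsdb.retention.size=10GB
    restart: unless-stopped
```

Setting both a time and a size limit means a misjudged retention period can no longer fill the disk, since the size cap applies regardless.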
The Verdict
Self-hosting monitoring saves money and teaches you way more about your infrastructure than paying for it ever did. I know exactly how much RAM Ollama uses at idle (2.4GB), how long PostgreSQL recovery takes (37 seconds), and which container restarts most often (WordPress, every 3-4 days due to PHP-FPM memory leaks).
But the biggest win: when something goes down at 2 AM, the alert goes to my phone via Telegram in under 60 seconds. That’s the same SLA I was paying $25/month for — with zero vendor lock-in and a lot more visibility.
*The Prometheus config and full Compose file are in my homelab repo if you want to replicate the stack. What monitoring tools are you paying for that you could self-host?*