01. Why Monitoring Matters for Self-Hosters
Running services without monitoring is like driving without a dashboard — you have no idea how fast you're going, how much fuel you have left, or whether the engine is about to overheat. I learned this when my Nextcloud instance quietly ran out of disk space over a weekend, corrupting several files before I noticed on Monday morning.
Since then, I've run a Prometheus + Grafana monitoring stack alongside every service I deploy. It's caught disk issues, memory leaks, certificate expirations, and performance degradation before they became real problems. The setup takes about an hour and runs on minimal resources.
This guide walks you through setting up a complete monitoring stack with Docker Compose, including pre-built dashboards and meaningful alerts.
02. Understanding the Monitoring Stack
The modern monitoring stack has three layers:
Collection: Prometheus scrapes metrics from your services every 15 seconds. It pulls data from exporters — small programs that expose metrics in a standard format. There are exporters for almost everything: Node Exporter for system metrics, cAdvisor for container metrics, Blackbox Exporter for endpoint probing.
Visualization: Grafana turns raw metrics into beautiful dashboards. It queries Prometheus and renders graphs, gauges, tables, and heatmaps. The community has created thousands of pre-built dashboards you can import with one click.
Alerting: Alertmanager or Grafana Alerting sends notifications when things go wrong. You define rules like "alert me if disk usage exceeds 85%" and get notified via Telegram, Discord, Slack, or email.
All three components run as Docker containers and work together seamlessly.
03. Setting Up the Stack
Here's a minimal monitoring stack that gives you system metrics, container metrics, and a Grafana dashboard:
[docker-compose.yml]
```yaml
services:
  prometheus:
    image: prom/prometheus:v2.53.0
    container_name: prometheus
    restart: unless-stopped
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - prometheus_data:/prometheus
    command:
      - "--config.file=/etc/prometheus/prometheus.yml"
      - "--storage.tsdb.retention.time=30d"
    ports:
      - "9090:9090"

  grafana:
    image: grafana/grafana:11.1.0
    container_name: grafana
    restart: unless-stopped
    volumes:
      - grafana_data:/var/lib/grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
    ports:
      - "3000:3000"

  node-exporter:
    image: prom/node-exporter:v1.8.1
    container_name: node-exporter
    restart: unless-stopped
    pid: host
    command:
      # Point the exporter at the host filesystems mounted below,
      # so it reports host metrics rather than the container's.
      - "--path.procfs=/host/proc"
      - "--path.sysfs=/host/sys"
      - "--path.rootfs=/rootfs"
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro

  cadvisor:
    image: gcr.io/cadvisor/cadvisor:v0.49.1
    container_name: cadvisor
    restart: unless-stopped
    privileged: true
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro

volumes:
  prometheus_data:
  grafana_data:
```

Start with 30 days of data retention. Prometheus is efficient — even with dozens of metrics scraped every 15 seconds, a month of data typically uses less than 1GB.
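The compose file mounts a ./prometheus.yml that isn't shown above. A minimal version might look like the sketch below; it assumes the default Compose network (where containers resolve each other by service name) and each exporter's default port, and the job names are my own choice:

```yaml
global:
  scrape_interval: 15s  # matches the 15-second scrape cadence described earlier

scrape_configs:
  - job_name: "prometheus"          # Prometheus monitoring itself
    static_configs:
      - targets: ["prometheus:9090"]

  - job_name: "node-exporter"       # host CPU, memory, disk, network
    static_configs:
      - targets: ["node-exporter:9100"]

  - job_name: "cadvisor"            # per-container metrics
    static_configs:
      - targets: ["cadvisor:8080"]
```

After editing this file, restart the Prometheus container and check http://localhost:9090/targets to confirm all three jobs show as UP.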
04. Essential Dashboards
After starting the stack, open Grafana at http://localhost:3000, add Prometheus as a data source (URL: http://prometheus:9090), and import these community dashboards by ID:
Node Exporter Full (ID: 1860): Shows CPU, memory, disk, network, and dozens of other system metrics. The most popular dashboard on Grafana.com for good reason.
Docker Container Monitoring (ID: 893): Per-container CPU, memory, network, and disk usage. See at a glance which containers are consuming the most resources.
To import: click + in Grafana, select "Import dashboard," enter the ID, select your Prometheus data source, and click Import.
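If you'd rather not add the data source by hand, Grafana can also provision it from a file at startup. A sketch, assuming you mount a file into /etc/grafana/provisioning/datasources/ (the directory Grafana scans on boot):

```yaml
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy               # Grafana's backend queries Prometheus, not the browser
    url: http://prometheus:9090 # container name on the shared Compose network
    isDefault: true
```

To use it, add a volume like `./datasources.yml:/etc/grafana/provisioning/datasources/datasources.yml:ro` (the local file name is my own) to the grafana service.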
I also recommend creating a custom "Overview" dashboard with the metrics you care about most. Mine shows total CPU usage, available disk space, container count, and uptime for critical services. Browse our monitoring recipes for complete configurations with pre-built dashboards.
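As a starting point for that kind of overview panel, here are PromQL queries in the spirit of the ones I use. Exact label values (mountpoints, metric names on unusual platforms) can vary, so treat these as sketches:

```promql
# Total CPU usage (%): 100 minus the average idle time across all cores
100 - (avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

# Available disk space (%) on the root filesystem
node_filesystem_avail_bytes{mountpoint="/"}
  / node_filesystem_size_bytes{mountpoint="/"} * 100

# Container count (cAdvisor exposes one series per named container)
count(container_last_seen{name=~".+"})

# Host uptime in seconds
time() - node_boot_time_seconds
```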
05. Setting Up Meaningful Alerts
The key to good alerting is being selective. Alert on things that require action. My recommended starter alerts:
- Disk space below 15% on any volume
- Any container restarting more than 3 times in 5 minutes
- Host memory usage above 90% for 10+ minutes
- SSL certificate expiring within 14 days
- Any HTTP endpoint returning non-200 for 2+ minutes
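Two of these starter alerts, written as Prometheus alerting rules. This is a sketch: the file name, group name, and "for" durations are my own, and you could equally define the same conditions in Grafana Alerting's UI instead:

```yaml
# alerts.yml (hypothetical name), loaded from prometheus.yml via:
#   rule_files:
#     - "alerts.yml"
groups:
  - name: starter-alerts
    rules:
      - alert: DiskSpaceLow
        # Fires when any real filesystem stays below 15% free for 5 minutes
        expr: node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"} / node_filesystem_size_bytes * 100 < 15
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Disk below 15% free on {{ $labels.mountpoint }}"

      - alert: HighMemoryUsage
        # Fires when host memory usage stays above 90% for 10 minutes
        expr: (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100 > 90
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "Host memory above 90% for 10+ minutes"
```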
For personal projects, Telegram notifications are my favorite — alerts show up on my phone instantly. Configure through Grafana's built-in alerting, which supports dozens of notification channels.
Avoid alert fatigue. If you get more than a few alerts per week, your thresholds are too aggressive. Each alert should represent something you actually need to investigate.