01 The Monitoring Stack
A complete monitoring setup typically includes:
**cAdvisor** - Collects container metrics from Docker
**Prometheus** - Stores and queries time-series data
**Grafana** - Visualizes metrics in dashboards
Optional additions:
• Node Exporter - Host system metrics
• Alertmanager - Sends alerts when things go wrong
This stack is lightweight enough for home labs but scales to production.
Start with basic container metrics, then add host metrics and alerting as needed.
02 Basic Monitoring Stack
Deploy the complete monitoring stack with Docker Compose.
```yaml
services:
  # Collects container metrics
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    container_name: cadvisor
    privileged: true
    devices:
      - /dev/kmsg:/dev/kmsg
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
    restart: unless-stopped

  # Stores metrics
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.retention.time=15d'
    restart: unless-stopped

  # Visualizes metrics
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
      - GF_USERS_ALLOW_SIGN_UP=false
    volumes:
      - grafana_data:/var/lib/grafana
    ports:
      - "3000:3000"
    restart: unless-stopped

volumes:
  prometheus_data:
  grafana_data:
```
03 Prometheus Configuration
Configure Prometheus to scrape metrics from cAdvisor and optionally other exporters.
```yaml
# prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  # Prometheus itself
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  # Container metrics from cAdvisor
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']

  # Optional: Host metrics from Node Exporter
  # (remove this job if node-exporter is not deployed)
  - job_name: 'node'
    static_configs:
      - targets: ['node-exporter:9100']

  # Optional: Docker daemon metrics
  # (requires metrics-addr to be set in the Docker daemon configuration)
  - job_name: 'docker'
    static_configs:
      - targets: ['host.docker.internal:9323']
```
Adjust scrape_interval to your needs: 15s suits near-real-time monitoring, while longer intervals reduce storage use.
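The interval does not have to be the same for every target. As a minimal sketch, assuming the cadvisor and node jobs from the configuration above, a per-job scrape_interval overrides the global value; the 60s figure is only an illustrative choice, not a recommendation from the original setup.

```yaml
# prometheus.yml (fragment): per-job overrides of the global interval
global:
  scrape_interval: 15s        # default for jobs that set nothing themselves

scrape_configs:
  # Container metrics change quickly, so keep the short default
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']

  # Host metrics usually change more slowly; scrape them less often
  - job_name: 'node'
    scrape_interval: 60s      # illustrative value, overrides the global 15s
    static_configs:
      - targets: ['node-exporter:9100']
```

If you lengthen an interval, widen the rate() windows in your queries to match: a window needs at least two samples, so a [1m] window returns nothing for a job scraped every 60s.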
04 Setting Up Grafana Dashboards
Import pre-built dashboards for instant visibility into your containers.
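The Prometheus data source that this section adds by hand can also be provisioned from a file, so a fresh Grafana container starts with it already in place. The following is a sketch under assumptions: the host path grafana/provisioning/datasources/prometheus.yml is hypothetical, and the URL assumes Grafana and Prometheus share the Compose network from the basic stack.

```yaml
# grafana/provisioning/datasources/prometheus.yml (hypothetical host path)
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy                  # Grafana's backend queries Prometheus
    url: http://prometheus:9090    # service name from the Compose file
    isDefault: true
```

For Grafana to pick the file up, the grafana service would need an extra volume that mounts the provisioning directory at /etc/grafana/provisioning, for example ./grafana/provisioning:/etc/grafana/provisioning:ro.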
```text
# 1. Access Grafana at http://your-host:3000
# 2. Log in as admin with the password set in GRAFANA_PASSWORD

# 3. Add Prometheus data source:
#    - Go to Configuration > Data Sources
#      (Connections > Data sources in newer Grafana versions)
#    - Add Prometheus
#    - URL: http://prometheus:9090
#    - Click "Save & Test"

# 4. Import dashboards:
#    - Go to Dashboards > Import
#    - Enter dashboard ID and click Load

# Recommended dashboard IDs:
#    - 193: Docker monitoring (cAdvisor)
#    - 1860: Node Exporter Full (host metrics)
#    - 11074: Docker Dashboard
#    - 14282: cAdvisor + Prometheus

# Or create your own with these queries:
#    Container CPU: rate(container_cpu_usage_seconds_total[1m])
#    Container Memory: container_memory_usage_bytes
#    Container Network: rate(container_network_transmit_bytes_total[1m])
```
05 Adding Host Metrics
Add Node Exporter to monitor the host system (CPU, memory, disk, network).
```yaml
services:
  node-exporter:
    image: prom/node-exporter:latest
    container_name: node-exporter
    command:
      - '--path.rootfs=/host'
      - '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'
    volumes:
      - /:/host:ro,rslave
    pid: host
    restart: unless-stopped

  # Update prometheus.yml to scrape node-exporter:
  # - job_name: 'node'
  #   static_configs:
  #     - targets: ['node-exporter:9100']
```
Node Exporter provides detailed host metrics. Dashboard ID 1860 is excellent for visualizing this data.
06 Setting Up Alerts
Configure alerting to get notified when things go wrong. You can alert from Prometheus (via Alertmanager) or Grafana.
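If you take the Prometheus route, an Alertmanager container has to run alongside the stack; the alerting configuration in this section expects it at alertmanager:9093. A minimal sketch of that service and its configuration file follows. The receiver name and the webhook URL are placeholders rather than anything from the original setup, so substitute your own notification target (email, Slack, and so on).

```yaml
# docker-compose.yml (fragment): add under the existing services
services:
  alertmanager:
    image: prom/alertmanager:latest
    container_name: alertmanager
    volumes:
      - ./alertmanager.yml:/etc/alertmanager/alertmanager.yml:ro
    restart: unless-stopped

---
# alertmanager.yml: send every alert to a single receiver
route:
  receiver: default            # placeholder receiver name
  group_by: ['alertname']      # batch alerts that share a name

receivers:
  - name: default
    webhook_configs:
      # Hypothetical endpoint; replace with a real notification target
      - url: 'http://alert-webhook.example:5001/alerts'
```

No port is published here because Prometheus reaches Alertmanager over the Compose network; publishing 9093 is only needed if you want to open the Alertmanager web UI from outside.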
```yaml
# Grafana alerting (simpler for small setups)
# 1. Edit a panel in Grafana
# 2. Go to Alert tab
# 3. Create alert rule
# 4. Configure notification channel (email, Slack, etc.)

# Prometheus alerting (more powerful)
# Add to prometheus.yml. The relative path is resolved against the directory
# of prometheus.yml, so mount alerts.yml at /etc/prometheus/alerts.yml.
rule_files:
  - "alerts.yml"

alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093']
```

```yaml
# alerts.yml
groups:
  - name: container-alerts
    rules:
      - alert: ContainerDown
        expr: absent(container_last_seen{name="important-container"})
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Container is down"

      - alert: HighCPU
        # rate() is measured in CPU cores, so 0.9 means 90% of one core
        expr: rate(container_cpu_usage_seconds_total[5m]) > 0.9
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Container CPU usage > 90% of one core"

      - alert: HighMemory
        # Containers without a memory limit report a limit of 0; filter them
        # out so the division does not produce +Inf and fire constantly
        expr: container_memory_usage_bytes / (container_spec_memory_limit_bytes > 0) > 0.9
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Container memory usage > 90% of its limit"
```
07 Lightweight Alternative: Uptime Kuma
For simpler setups, Uptime Kuma provides status monitoring without the full Prometheus stack.
```yaml
services:
  uptime-kuma:
    image: louislam/uptime-kuma:1
    container_name: uptime-kuma
    volumes:
      - ./uptime-kuma:/app/data
    ports:
      - "3001:3001"
    restart: unless-stopped

# Features:
# - HTTP/HTTPS monitoring
# - TCP port monitoring
# - Docker container monitoring
# - DNS monitoring
# - Multiple notification channels
# - Status pages
# - No configuration files needed

# For Docker monitoring, add monitors for:
# - Container HTTP endpoints
# - TCP ports your services expose
# - Docker socket (advanced)
```
Uptime Kuma is perfect for smaller setups. Use the full Prometheus stack when you need detailed metrics and historical analysis.
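The Docker socket option listed above needs the socket to be visible inside the container. A minimal sketch, assuming the default socket path /var/run/docker.sock on the host, adds one volume to the same service:

```yaml
services:
  uptime-kuma:
    image: louislam/uptime-kuma:1
    container_name: uptime-kuma
    volumes:
      - ./uptime-kuma:/app/data
      # Assumes the default socket location on the host
      - /var/run/docker.sock:/var/run/docker.sock:ro
    ports:
      - "3001:3001"
    restart: unless-stopped
```

After that, a Docker host pointing at /var/run/docker.sock can be registered in the Uptime Kuma settings and selected when creating a Docker container monitor. Treat the mount with care: access to the Docker socket grants broad control over the host, and the read-only flag on the mount does not restrict what can be done through the Docker API.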