Monitoring with Prometheus and Grafana

01 The Monitoring Stack

A complete monitoring setup typically includes:

• **cAdvisor** - Collects container metrics from Docker
• **Prometheus** - Stores and queries time-series data
• **Grafana** - Visualizes metrics in dashboards

Optional additions:

• Node Exporter - Host system metrics
• Alertmanager - Sends alerts when things go wrong

This stack is lightweight enough for home labs but scales to production.

Start with basic container metrics, then add host metrics and alerting as needed.

02 Basic Monitoring Stack

Deploy the complete monitoring stack with Docker Compose.

[yaml]

services:
  # Collects container metrics
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    container_name: cadvisor
    privileged: true
    devices:
      - /dev/kmsg:/dev/kmsg
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
    restart: unless-stopped

  # Stores metrics
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      # Keep 15 days of metrics on disk
      - '--storage.tsdb.retention.time=15d'
    restart: unless-stopped

  # Visualizes metrics
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    environment:
      # Set GRAFANA_PASSWORD in the shell or in a .env file next to this compose file
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
      - GF_USERS_ALLOW_SIGN_UP=false
    volumes:
      - grafana_data:/var/lib/grafana
    ports:
      - "3000:3000"
    restart: unless-stopped

volumes:
  prometheus_data:
  grafana_data:

03 Prometheus Configuration

Configure Prometheus to scrape metrics from cAdvisor and optionally other exporters.

[yaml]

# prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  # Prometheus itself
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  # Container metrics from cAdvisor
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']

  # Optional: Host metrics from Node Exporter
  - job_name: 'node'
    static_configs:
      - targets: ['node-exporter:9100']

  # Optional: Docker daemon metrics
  # (requires "metrics-addr" to be set in the daemon's daemon.json)
  - job_name: 'docker'
    static_configs:
      - targets: ['host.docker.internal:9323']

Adjust scrape_interval to your needs: 15s gives near-real-time resolution, while longer intervals (for example 30s or 60s) reduce storage use and scrape load.
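The optional docker job above targets host.docker.internal, a name that Docker Desktop resolves automatically but Docker Engine on Linux does not. One possible workaround, sketched here under the assumption of Docker 20.10 or later, is to map the name to the host gateway on the prometheus service:

```yaml
# Compose fragment: merge into the existing prometheus service definition.
services:
  prometheus:
    extra_hosts:
      # "host-gateway" is a special value that Docker replaces with the host's gateway IP
      - "host.docker.internal:host-gateway"
```

This only makes the name resolvable. The daemon's metrics endpoint must also listen on an address that containers can reach; if it is bound to 127.0.0.1 only, the target will most likely still show as down.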

04 Setting Up Grafana Dashboards

Import pre-built dashboards for instant visibility into your containers.

[bash]

# 1. Access Grafana at http://your-host:3000
# 2. Log in as admin with the password set in GRAFANA_PASSWORD

# 3. Add the Prometheus data source:
#    - Go to Configuration > Data Sources
#      (Connections > Data sources in newer Grafana versions)
#    - Add Prometheus
#    - URL: http://prometheus:9090
#    - Click "Save & Test"

# 4. Import dashboards:
#    - Go to Dashboards > Import
#    - Enter the dashboard ID and click Load

# Recommended dashboard IDs:
# - 193: Docker monitoring (cAdvisor)
# - 1860: Node Exporter Full (host metrics)
# - 11074: Docker Dashboard
# - 14282: cAdvisor + Prometheus

# Or create your own with these queries:
# Container CPU: rate(container_cpu_usage_seconds_total[1m])
# Container Memory: container_memory_usage_bytes
# Container Network: rate(container_network_transmit_bytes_total[1m])
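Step 3 can also be automated with Grafana's provisioning mechanism instead of the UI. The sketch below assumes a local directory named grafana/provisioning/datasources (a path chosen here for illustration, not part of the setup above) that is mounted into the container at /etc/grafana/provisioning/datasources:

```yaml
# grafana/provisioning/datasources/prometheus.yml (hypothetical local path)
# Mount the directory into the grafana service, for example:
#   - ./grafana/provisioning/datasources:/etc/grafana/provisioning/datasources:ro
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy               # requests are made by the Grafana backend
    url: http://prometheus:9090 # same URL as in the manual steps
    isDefault: true
```

Grafana reads provisioning files at startup, so the container needs a restart after the file is added or changed.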

05 Adding Host Metrics

Add Node Exporter to monitor the host system (CPU, memory, disk, network).

[yaml]

services:
  node-exporter:
    image: prom/node-exporter:latest
    container_name: node-exporter
    command:
      - '--path.rootfs=/host'
      - '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'
    volumes:
      - /:/host:ro,rslave
    pid: host
    restart: unless-stopped

  # Update prometheus.yml to scrape node-exporter:
  # - job_name: 'node'
  #   static_configs:
  #     - targets: ['node-exporter:9100']

Node Exporter provides detailed host metrics. Dashboard ID 1860 is excellent for visualizing this data.

06 Setting Up Alerts

Configure alerting to get notified when things go wrong. You can alert from Prometheus (via Alertmanager) or Grafana.

[yaml]

# Grafana alerting (simpler for small setups)
# 1. Edit a panel in Grafana
# 2. Go to the Alert tab
# 3. Create an alert rule
# 4. Configure a notification channel (email, Slack, etc.);
#    recent Grafana versions call this a contact point

# Prometheus alerting (more powerful)
# Add to prometheus.yml:
rule_files:
  - "alerts.yml"  # relative paths resolve against the directory of prometheus.yml

alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093']

---
# alerts.yml
groups:
  - name: container-alerts
    rules:
      - alert: ContainerDown
        expr: absent(container_last_seen{name="important-container"})
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Container is down"

      - alert: HighCPU
        # Sum per container; name!="" skips cgroup series that are not named containers
        expr: sum by (name) (rate(container_cpu_usage_seconds_total{name!=""}[5m])) > 0.9
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Container CPU usage > 90% of one core"

      - alert: HighMemory
        # "> 0" drops containers without a memory limit, whose ratio would otherwise be +Inf
        expr: container_memory_usage_bytes{name!=""} / (container_spec_memory_limit_bytes{name!=""} > 0) > 0.9
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Container memory usage > 90%"
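The alerting section above points at alertmanager:9093, but the compose file earlier does not define that service, and the rule file has to be visible inside the Prometheus container. A minimal sketch of both pieces follows; the receiver name and webhook URL are placeholders rather than values from this guide, and should be replaced with a real notification target:

```yaml
# Compose fragment: add alongside the existing services
services:
  alertmanager:
    image: prom/alertmanager:latest
    container_name: alertmanager
    volumes:
      # The image reads its configuration from this path by default
      - ./alertmanager.yml:/etc/alertmanager/alertmanager.yml:ro
    restart: unless-stopped

  prometheus:
    volumes:
      # Add this line to the existing prometheus volumes list so that
      # "alerts.yml" is found next to prometheus.yml
      - ./alerts.yml:/etc/prometheus/alerts.yml:ro

---
# alertmanager.yml
route:
  receiver: default          # placeholder receiver name
  group_by: ['alertname']

receivers:
  - name: default
    webhook_configs:
      # Hypothetical endpoint; swap for email_configs, slack_configs, etc.
      - url: 'http://example.internal:5001/alerts'
```

After restarting Prometheus, the loaded rules and their current state should be visible in the Prometheus web UI.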

07 Lightweight Alternative: Uptime Kuma

For simpler setups, Uptime Kuma provides status monitoring without the full Prometheus stack.

[yaml]

services:
  uptime-kuma:
    image: louislam/uptime-kuma:1
    container_name: uptime-kuma
    volumes:
      - ./uptime-kuma:/app/data
    ports:
      - "3001:3001"
    restart: unless-stopped

# Features:
# - HTTP/HTTPS monitoring
# - TCP port monitoring
# - Docker container monitoring
# - DNS monitoring
# - Multiple notification channels
# - Status pages
# - No configuration files needed

# For Docker monitoring, add monitors for:
# - Container HTTP endpoints
# - TCP ports your services expose
# - Docker socket (advanced)
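For the Docker socket option, Uptime Kuma needs access to the host's Docker socket. The fragment below is a sketch that assumes the default socket path /var/run/docker.sock. Socket access effectively grants control over the Docker daemon, and a read-only mount does not restrict API calls, so use it only on trusted hosts:

```yaml
# Compose fragment: extends the volumes of the uptime-kuma service
services:
  uptime-kuma:
    volumes:
      - ./uptime-kuma:/app/data
      # Lets Uptime Kuma query container state through the Docker API
      - /var/run/docker.sock:/var/run/docker.sock:ro
```

The socket then has to be registered as a Docker host inside Uptime Kuma before container monitors can use it.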

Uptime Kuma suits smaller setups where up/down status checks are enough. Use the full Prometheus stack when you need detailed metrics and historical analysis.