docker.recipes

Grafana OnCall

intermediate

On-call management system integrated with Grafana alerting.

Overview

Grafana OnCall is an open-source incident response and on-call management system developed by Grafana Labs. It provides alerting, escalation policies, and team collaboration features for managing incidents, and it can serve as a lower-cost alternative to commercial products such as PagerDuty and Opsgenie. It integrates natively with Grafana's monitoring ecosystem and supports several notification channels, including Slack, Telegram, SMS, and phone calls.

This deployment runs six interconnected services: a Grafana instance with the OnCall plugin pre-installed; oncall-engine, the main web application served via uWSGI; oncall-celery for background task processing and scheduled operations; PostgreSQL for persistent data storage; Redis for caching and session management; and RabbitMQ as the message broker for task distribution. oncall-engine handles API requests and user interactions, while oncall-celery processes notifications, escalations, and periodic maintenance tasks in the background.

The stack suits DevOps teams, SREs, and organizations that want incident management without vendor lock-in or per-user pricing. It provides scheduling, alert grouping, escalation chains, and incident tracking while keeping operational data and notification workflows under your control.

Key Features

  • Native Grafana integration with pre-configured OnCall plugin for unified monitoring and alerting workflows
  • Intelligent alert grouping and deduplication to reduce notification fatigue during incident storms
  • Flexible escalation policies with multiple notification channels including Slack, Telegram, SMS, and voice calls
  • Advanced scheduling system supporting rotations, overrides, and time-zone aware on-call management
  • Real-time incident collaboration with acknowledgments, resolution tracking, and team communication
  • Background task processing via Celery for reliable notification delivery and scheduled operations
  • PostgreSQL-backed incident history and analytics with full audit trails and reporting capabilities
  • Multi-tenancy support with team-based access controls and customizable notification preferences

Common Use Cases

  • 24/7 production monitoring for SaaS applications requiring immediate incident response
  • DevOps teams managing microservices architectures with complex alert routing and escalation needs
  • Organizations migrating from expensive commercial on-call solutions like PagerDuty or Opsgenie
  • Enterprise environments requiring on-premises incident management with strict data sovereignty requirements
  • Multi-team engineering organizations needing centralized on-call scheduling and rotation management
  • Startups building scalable incident response processes without per-user licensing costs
  • Healthcare or financial services requiring compliant incident tracking and audit capabilities

Prerequisites

  • Docker and Docker Compose installed with minimum 4GB RAM available for all services
  • Port 3000 available for Grafana web interface access
  • Valid SMTP configuration or external notification service credentials for alert delivery
  • Environment variables configured: GRAFANA_PASSWORD, DB_PASSWORD, RABBITMQ_PASSWORD, SECRET_KEY, BASE_URL
  • Understanding of Grafana dashboards and alerting rules for integration with existing monitoring
  • Basic knowledge of on-call management concepts like escalation policies and rotation schedules

For development & testing. Review security settings, change default credentials, and test thoroughly before production use.

docker-compose.yml

services:
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
      - GF_PLUGINS_ALLOW_LOADING_UNSIGNED_PLUGINS=grafana-oncall-app
      - GF_INSTALL_PLUGINS=grafana-oncall-app
    volumes:
      - grafana-data:/var/lib/grafana
    ports:
      - "3000:3000"
    networks:
      - oncall-network
    restart: unless-stopped

  oncall-engine:
    image: grafana/oncall:latest
    container_name: oncall-engine
    command: uwsgi --ini uwsgi.ini
    environment:
      - DATABASE_TYPE=postgresql
      - DATABASE_HOST=db
      - DATABASE_NAME=oncall
      - DATABASE_USER=oncall
      - DATABASE_PASSWORD=${DB_PASSWORD}
      - BROKER_TYPE=rabbitmq
      - RABBITMQ_HOST=rabbitmq
      - RABBITMQ_USERNAME=oncall
      - RABBITMQ_PASSWORD=${RABBITMQ_PASSWORD}
      - REDIS_URI=redis://redis:6379
      - SECRET_KEY=${SECRET_KEY}
      - BASE_URL=${BASE_URL}
      - GRAFANA_API_URL=http://grafana:3000
    depends_on:
      - db
      - redis
      - rabbitmq
    networks:
      - oncall-network
    restart: unless-stopped

  oncall-celery:
    image: grafana/oncall:latest
    container_name: oncall-celery
    command: celery -A engine worker -l info -B --scheduler=django_celery_beat.schedulers:DatabaseScheduler
    environment:
      - DATABASE_TYPE=postgresql
      - DATABASE_HOST=db
      - DATABASE_NAME=oncall
      - DATABASE_USER=oncall
      - DATABASE_PASSWORD=${DB_PASSWORD}
      - BROKER_TYPE=rabbitmq
      - RABBITMQ_HOST=rabbitmq
      - RABBITMQ_USERNAME=oncall
      - RABBITMQ_PASSWORD=${RABBITMQ_PASSWORD}
      - REDIS_URI=redis://redis:6379
      - SECRET_KEY=${SECRET_KEY}
    depends_on:
      - oncall-engine
    networks:
      - oncall-network
    restart: unless-stopped

  db:
    image: postgres:15-alpine
    container_name: oncall-db
    environment:
      - POSTGRES_USER=oncall
      - POSTGRES_PASSWORD=${DB_PASSWORD}
      - POSTGRES_DB=oncall
    volumes:
      - postgres-data:/var/lib/postgresql/data
    networks:
      - oncall-network
    restart: unless-stopped

  redis:
    image: redis:7-alpine
    container_name: oncall-redis
    volumes:
      - redis-data:/data
    networks:
      - oncall-network
    restart: unless-stopped

  rabbitmq:
    image: rabbitmq:3-management-alpine
    container_name: oncall-rabbitmq
    environment:
      - RABBITMQ_DEFAULT_USER=oncall
      - RABBITMQ_DEFAULT_PASS=${RABBITMQ_PASSWORD}
    volumes:
      - rabbitmq-data:/var/lib/rabbitmq
    networks:
      - oncall-network
    restart: unless-stopped

volumes:
  grafana-data:
  postgres-data:
  redis-data:
  rabbitmq-data:

networks:
  oncall-network:
    driver: bridge

.env Template

.env
# Grafana OnCall
BASE_URL=http://localhost:8080
GRAFANA_PASSWORD=secure_grafana_password
DB_PASSWORD=secure_oncall_password
RABBITMQ_PASSWORD=secure_rabbitmq_password

# Generate with: openssl rand -hex 32
SECRET_KEY=your_secret_key_here
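
The placeholder values in the template should be replaced before first start. As a minimal sketch, the script below writes a .env file with random secrets, using the `openssl rand -hex` command the template already suggests. The helper names `rand_hex` and `write_env`, the `ENV_FILE` variable, and the 16-byte password length are illustrative assumptions, not part of the recipe.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print N random bytes as lowercase hex.
# Prefers openssl (as the template suggests); falls back to /dev/urandom.
rand_hex() {
  local bytes="$1"
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -hex "$bytes"
  else
    head -c "$bytes" /dev/urandom | od -An -tx1 | tr -d ' \n'
  fi
}

# Write an env file with freshly generated secrets (hypothetical helper).
# The subshell keeps the restrictive umask from leaking into the caller.
write_env() {
  local env_file="${1:-.env}"
  (
    umask 077
    cat > "$env_file" <<EOF
# Grafana OnCall
BASE_URL=http://localhost:8080
GRAFANA_PASSWORD=$(rand_hex 16)
DB_PASSWORD=$(rand_hex 16)
RABBITMQ_PASSWORD=$(rand_hex 16)

# Generated with: openssl rand -hex 32
SECRET_KEY=$(rand_hex 32)
EOF
  )
}

# Only write when the target does not exist yet, so existing secrets are kept.
target="${ENV_FILE:-.env}"
if [ ! -e "$target" ]; then
  write_env "$target"
fi
```

Hex-only values avoid characters that would need quoting in a .env file. The script never overwrites an existing file, because changing DB_PASSWORD after the postgres-data volume has been initialized would leave the database with the old password.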

Usage Notes

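  1. Open Grafana at http://localhost:3000 and log in as admin with the password set in GRAFANA_PASSWORD.
  2. Enable the OnCall plugin in Grafana.
  3. Create schedules and escalation policies.
  4. Connect notification channels such as Slack or Telegram.
  5. Use OnCall as a self-hosted alternative to PagerDuty or Opsgenie.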
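
Grafana can take some time to become reachable on first start, since GF_INSTALL_PLUGINS installs the plugin at startup. The sketch below polls a URL until it responds. `wait_for_url` is a hypothetical helper, the retry defaults are arbitrary, and the example assumes Grafana's `/api/health` endpoint on the published port 3000.

```shell
#!/usr/bin/env bash

# wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
# Returns 0 as soon as the URL answers with a successful HTTP status,
# or 1 after ATTEMPTS failed tries. (Hypothetical helper.)
wait_for_url() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i
  for ((i = 1; i <= attempts; i++)); do
    # -f: treat HTTP errors as failures, -s: no progress output
    if curl -fs -o /dev/null --max-time 5 "$url" 2>/dev/null; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}

# Example (run after `docker compose up -d`):
#   wait_for_url http://localhost:3000/api/health && echo "Grafana is up"
```

With the defaults this waits for roughly a minute before giving up. If it times out, `docker compose logs grafana` is the next place to look.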

Individual Services (6 services)

Copy individual services to mix and match with your existing compose files.

grafana
grafana:
  image: grafana/grafana:latest
  container_name: grafana
  environment:
    - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
    - GF_PLUGINS_ALLOW_LOADING_UNSIGNED_PLUGINS=grafana-oncall-app
    - GF_INSTALL_PLUGINS=grafana-oncall-app
  volumes:
    - grafana-data:/var/lib/grafana
  ports:
    - "3000:3000"
  networks:
    - oncall-network
  restart: unless-stopped
oncall-engine
oncall-engine:
  image: grafana/oncall:latest
  container_name: oncall-engine
  command: uwsgi --ini uwsgi.ini
  environment:
    - DATABASE_TYPE=postgresql
    - DATABASE_HOST=db
    - DATABASE_NAME=oncall
    - DATABASE_USER=oncall
    - DATABASE_PASSWORD=${DB_PASSWORD}
    - BROKER_TYPE=rabbitmq
    - RABBITMQ_HOST=rabbitmq
    - RABBITMQ_USERNAME=oncall
    - RABBITMQ_PASSWORD=${RABBITMQ_PASSWORD}
    - REDIS_URI=redis://redis:6379
    - SECRET_KEY=${SECRET_KEY}
    - BASE_URL=${BASE_URL}
    - GRAFANA_API_URL=http://grafana:3000
  depends_on:
    - db
    - redis
    - rabbitmq
  networks:
    - oncall-network
  restart: unless-stopped
oncall-celery
oncall-celery:
  image: grafana/oncall:latest
  container_name: oncall-celery
  command: celery -A engine worker -l info -B --scheduler=django_celery_beat.schedulers:DatabaseScheduler
  environment:
    - DATABASE_TYPE=postgresql
    - DATABASE_HOST=db
    - DATABASE_NAME=oncall
    - DATABASE_USER=oncall
    - DATABASE_PASSWORD=${DB_PASSWORD}
    - BROKER_TYPE=rabbitmq
    - RABBITMQ_HOST=rabbitmq
    - RABBITMQ_USERNAME=oncall
    - RABBITMQ_PASSWORD=${RABBITMQ_PASSWORD}
    - REDIS_URI=redis://redis:6379
    - SECRET_KEY=${SECRET_KEY}
  depends_on:
    - oncall-engine
  networks:
    - oncall-network
  restart: unless-stopped
db
db:
  image: postgres:15-alpine
  container_name: oncall-db
  environment:
    - POSTGRES_USER=oncall
    - POSTGRES_PASSWORD=${DB_PASSWORD}
    - POSTGRES_DB=oncall
  volumes:
    - postgres-data:/var/lib/postgresql/data
  networks:
    - oncall-network
  restart: unless-stopped
redis
redis:
  image: redis:7-alpine
  container_name: oncall-redis
  volumes:
    - redis-data:/data
  networks:
    - oncall-network
  restart: unless-stopped
rabbitmq
rabbitmq:
  image: rabbitmq:3-management-alpine
  container_name: oncall-rabbitmq
  environment:
    - RABBITMQ_DEFAULT_USER=oncall
    - RABBITMQ_DEFAULT_PASS=${RABBITMQ_PASSWORD}
  volumes:
    - rabbitmq-data:/var/lib/rabbitmq
  networks:
    - oncall-network
  restart: unless-stopped

Quick Start

terminal
# 1. Create the compose file
cat > docker-compose.yml << 'EOF'
services:
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
      - GF_PLUGINS_ALLOW_LOADING_UNSIGNED_PLUGINS=grafana-oncall-app
      - GF_INSTALL_PLUGINS=grafana-oncall-app
    volumes:
      - grafana-data:/var/lib/grafana
    ports:
      - "3000:3000"
    networks:
      - oncall-network
    restart: unless-stopped

  oncall-engine:
    image: grafana/oncall:latest
    container_name: oncall-engine
    command: uwsgi --ini uwsgi.ini
    environment:
      - DATABASE_TYPE=postgresql
      - DATABASE_HOST=db
      - DATABASE_NAME=oncall
      - DATABASE_USER=oncall
      - DATABASE_PASSWORD=${DB_PASSWORD}
      - BROKER_TYPE=rabbitmq
      - RABBITMQ_HOST=rabbitmq
      - RABBITMQ_USERNAME=oncall
      - RABBITMQ_PASSWORD=${RABBITMQ_PASSWORD}
      - REDIS_URI=redis://redis:6379
      - SECRET_KEY=${SECRET_KEY}
      - BASE_URL=${BASE_URL}
      - GRAFANA_API_URL=http://grafana:3000
    depends_on:
      - db
      - redis
      - rabbitmq
    networks:
      - oncall-network
    restart: unless-stopped

  oncall-celery:
    image: grafana/oncall:latest
    container_name: oncall-celery
    command: celery -A engine worker -l info -B --scheduler=django_celery_beat.schedulers:DatabaseScheduler
    environment:
      - DATABASE_TYPE=postgresql
      - DATABASE_HOST=db
      - DATABASE_NAME=oncall
      - DATABASE_USER=oncall
      - DATABASE_PASSWORD=${DB_PASSWORD}
      - BROKER_TYPE=rabbitmq
      - RABBITMQ_HOST=rabbitmq
      - RABBITMQ_USERNAME=oncall
      - RABBITMQ_PASSWORD=${RABBITMQ_PASSWORD}
      - REDIS_URI=redis://redis:6379
      - SECRET_KEY=${SECRET_KEY}
    depends_on:
      - oncall-engine
    networks:
      - oncall-network
    restart: unless-stopped

  db:
    image: postgres:15-alpine
    container_name: oncall-db
    environment:
      - POSTGRES_USER=oncall
      - POSTGRES_PASSWORD=${DB_PASSWORD}
      - POSTGRES_DB=oncall
    volumes:
      - postgres-data:/var/lib/postgresql/data
    networks:
      - oncall-network
    restart: unless-stopped

  redis:
    image: redis:7-alpine
    container_name: oncall-redis
    volumes:
      - redis-data:/data
    networks:
      - oncall-network
    restart: unless-stopped

  rabbitmq:
    image: rabbitmq:3-management-alpine
    container_name: oncall-rabbitmq
    environment:
      - RABBITMQ_DEFAULT_USER=oncall
      - RABBITMQ_DEFAULT_PASS=${RABBITMQ_PASSWORD}
    volumes:
      - rabbitmq-data:/var/lib/rabbitmq
    networks:
      - oncall-network
    restart: unless-stopped

volumes:
  grafana-data:
  postgres-data:
  redis-data:
  rabbitmq-data:

networks:
  oncall-network:
    driver: bridge
EOF

# 2. Create the .env file
cat > .env << 'EOF'
# Grafana OnCall
BASE_URL=http://localhost:8080
GRAFANA_PASSWORD=secure_grafana_password
DB_PASSWORD=secure_oncall_password
RABBITMQ_PASSWORD=secure_rabbitmq_password

# Generate with: openssl rand -hex 32
SECRET_KEY=your_secret_key_here
EOF

# 3. Start the services
docker compose up -d

# 4. View logs
docker compose logs -f

One-Liner

Run this command to download and set up the recipe in one step:

terminal
curl -fsSL https://docker.recipes/api/recipes/grafana-oncall/run | bash

Troubleshooting

  • OnCall plugin not visible in Grafana: Verify that GF_PLUGINS_ALLOW_LOADING_UNSIGNED_PLUGINS is set, then restart the grafana container
  • oncall-engine startup fails with database errors: Ensure the db container is fully initialized before oncall-engine starts, and check that DATABASE_PASSWORD matches the POSTGRES_PASSWORD of the db service
  • Celery workers not processing tasks: Verify that the rabbitmq container is running and that RABBITMQ_PASSWORD in oncall-celery matches RABBITMQ_DEFAULT_PASS in the rabbitmq service
  • Notifications not being delivered: Check the Celery worker logs in the oncall-celery container and verify the credentials of the external notification service
  • Redis connection timeouts: Increase Redis memory limits, or check whether memory pressure on the host is affecting the redis container
  • Grafana API integration errors: Verify that GRAFANA_API_URL points to the internal service name (http://grafana:3000) and that the admin credentials are correct
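
The database startup issue above comes from depends_on in its short form, which only waits for a container to start, not for the service inside it to be ready. One way to address it, sketched here rather than taken from the recipe, is an override file that adds healthchecks and makes oncall-engine wait for them. The intervals and retry counts are assumptions to tune, and the file name relies on Docker Compose loading docker-compose.override.yml automatically next to docker-compose.yml.

```shell
# Write an override file that Compose merges with docker-compose.yml.
cat > docker-compose.override.yml << 'EOF'
services:
  db:
    healthcheck:
      # pg_isready ships with the postgres image
      test: ["CMD-SHELL", "pg_isready -U oncall -d oncall"]
      interval: 5s
      timeout: 5s
      retries: 10
  redis:
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 5s
      retries: 10
  rabbitmq:
    healthcheck:
      test: ["CMD", "rabbitmq-diagnostics", "-q", "ping"]
      interval: 10s
      timeout: 10s
      retries: 10
  oncall-engine:
    # Long-form depends_on: start only after dependencies report healthy
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_healthy
      rabbitmq:
        condition: service_healthy
EOF
```

Running `docker compose config` afterwards prints the merged configuration, which is a quick way to confirm the override was picked up. If your Compose version rejects the mix of short-form and long-form depends_on across files, apply the same healthcheck and depends_on entries directly in docker-compose.yml instead.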

Components

grafana, oncall-engine, oncall-celery, postgresql, redis, rabbitmq

Tags

#oncall #alerting #grafana #incident-management #pagerduty-alternative

Category

Monitoring & Observability