Grafana OnCall
On-call management system integrated with Grafana alerting.
Overview
Grafana OnCall is an open-source incident response and on-call management system developed by Grafana Labs. It provides comprehensive alerting, escalation policies, and team collaboration features for managing critical incidents, serving as a cost-effective alternative to commercial solutions like PagerDuty and Opsgenie. The platform integrates natively with Grafana's monitoring ecosystem while supporting multiple notification channels including Slack, Telegram, SMS, and phone calls.
This deployment creates a complete on-call management infrastructure using six interconnected services: a Grafana instance with the OnCall plugin pre-installed, the main oncall-engine service running the web application via uWSGI, oncall-celery for background task processing and scheduled operations, PostgreSQL for persistent data storage, Redis for caching and session management, and RabbitMQ as the message broker for task distribution. The oncall-engine handles API requests and user interactions while oncall-celery processes notifications, escalations, and periodic maintenance tasks in the background.
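As a quick sanity check on the architecture described above, the sketch below compares a list of service names against the six services this recipe defines. `missing_services` is a hypothetical helper, not part of Docker; it reads names from standard input so it can be fed by `docker compose config --services`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every expected service that is absent from
# the newline-separated service names read on standard input.
missing_services() {
  local expected="grafana oncall-engine oncall-celery db redis rabbitmq"
  local found name
  found="$(cat)"
  for name in $expected; do
    # -x matches whole lines only, -F treats the name as a fixed string.
    if ! printf '%s\n' "$found" | grep -qxF -- "$name"; then
      echo "$name"
    fi
  done
}
```

Running `docker compose config --services | missing_services` in the recipe directory should print nothing when all six services are defined.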
This stack is ideal for DevOps teams, SREs, and organizations seeking enterprise-grade incident management capabilities without vendor lock-in or per-user pricing. The combination provides robust scheduling, intelligent alert grouping, escalation chains, and detailed incident tracking while maintaining complete control over sensitive operational data and notification workflows.
Key Features
- Native Grafana integration with pre-configured OnCall plugin for unified monitoring and alerting workflows
- Intelligent alert grouping and deduplication to reduce notification fatigue during incident storms
- Flexible escalation policies with multiple notification channels including Slack, Telegram, SMS, and voice calls
- Advanced scheduling system supporting rotations, overrides, and time-zone aware on-call management
- Real-time incident collaboration with acknowledgments, resolution tracking, and team communication
- Background task processing via Celery for reliable notification delivery and scheduled operations
- PostgreSQL-backed incident history and analytics with full audit trails and reporting capabilities
- Multi-tenancy support with team-based access controls and customizable notification preferences
Common Use Cases
- 24/7 production monitoring for SaaS applications requiring immediate incident response
- DevOps teams managing microservices architectures with complex alert routing and escalation needs
- Organizations migrating from expensive commercial on-call solutions like PagerDuty or Opsgenie
- Enterprise environments requiring on-premises incident management with strict data sovereignty requirements
- Multi-team engineering organizations needing centralized on-call scheduling and rotation management
- Startups building scalable incident response processes without per-user licensing costs
- Healthcare or financial services requiring compliant incident tracking and audit capabilities
Prerequisites
- Docker and Docker Compose installed, with at least 4 GB of RAM available for all services
- Port 3000 available for Grafana web interface access
- Valid SMTP configuration or external notification service credentials for alert delivery
- Environment variables configured: GRAFANA_PASSWORD, DB_PASSWORD, RABBITMQ_PASSWORD, SECRET_KEY, BASE_URL
- Understanding of Grafana dashboards and alerting rules for integration with existing monitoring
- Basic knowledge of on-call management concepts like escalation policies and rotation schedules
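One way to produce values for the five environment variables listed above is sketched below. It relies on the `openssl rand -hex 32` command that the .env template suggests for SECRET_KEY and, as an assumption of this sketch, reuses it for the three passwords. `gen_secret` and `print_env` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a 64-character hex string (32 random bytes).
gen_secret() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -hex 32
  else
    # Fallback when openssl is not installed: read 32 bytes from /dev/urandom.
    head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
    echo
  fi
}

# Hypothetical helper: emit .env lines for the five required variables.
# BASE_URL comes from the first argument; the default mirrors the template.
print_env() {
  local base_url="${1:-http://localhost:8080}"
  echo "BASE_URL=${base_url}"
  echo "GRAFANA_PASSWORD=$(gen_secret)"
  echo "DB_PASSWORD=$(gen_secret)"
  echo "RABBITMQ_PASSWORD=$(gen_secret)"
  echo "SECRET_KEY=$(gen_secret)"
}
```

`print_env > .env` would overwrite any existing .env file, so redirect to a different name first if one already exists. Hex-only values avoid characters that need quoting in env files.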
For development & testing. Review security settings, change default credentials, and test thoroughly before production use.
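To make the "change default credentials" step checkable, the following sketch lists any lines in an env file that still carry the placeholder values from this recipe's .env template. `find_placeholders` is a hypothetical helper; the placeholder strings are copied from the template.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list lines of an env file (default: .env) that still
# hold the template's placeholder values. Exits non-zero when none are found,
# which is grep's normal behaviour for "no match".
find_placeholders() {
  local file="${1:-.env}"
  grep -nE '=(secure_grafana_password|secure_oncall_password|secure_rabbitmq_password|your_secret_key_here)$' "$file"
}
```

A typical use is `if find_placeholders .env; then echo "replace the values listed above"; fi` before starting the stack.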
docker-compose.yml
services:
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
      - GF_PLUGINS_ALLOW_LOADING_UNSIGNED_PLUGINS=grafana-oncall-app
      - GF_INSTALL_PLUGINS=grafana-oncall-app
    volumes:
      - grafana-data:/var/lib/grafana
    ports:
      - "3000:3000"
    networks:
      - oncall-network
    restart: unless-stopped

  oncall-engine:
    image: grafana/oncall:latest
    container_name: oncall-engine
    command: uwsgi --ini uwsgi.ini
    environment:
      - DATABASE_TYPE=postgresql
      - DATABASE_HOST=db
      - DATABASE_NAME=oncall
      - DATABASE_USER=oncall
      - DATABASE_PASSWORD=${DB_PASSWORD}
      - BROKER_TYPE=rabbitmq
      - RABBITMQ_HOST=rabbitmq
      - RABBITMQ_USERNAME=oncall
      - RABBITMQ_PASSWORD=${RABBITMQ_PASSWORD}
      - REDIS_URI=redis://redis:6379
      - SECRET_KEY=${SECRET_KEY}
      - BASE_URL=${BASE_URL}
      - GRAFANA_API_URL=http://grafana:3000
    depends_on:
      - db
      - redis
      - rabbitmq
    networks:
      - oncall-network
    restart: unless-stopped

  oncall-celery:
    image: grafana/oncall:latest
    container_name: oncall-celery
    command: celery -A engine worker -l info -B --scheduler=django_celery_beat.schedulers:DatabaseScheduler
    environment:
      - DATABASE_TYPE=postgresql
      - DATABASE_HOST=db
      - DATABASE_NAME=oncall
      - DATABASE_USER=oncall
      - DATABASE_PASSWORD=${DB_PASSWORD}
      - BROKER_TYPE=rabbitmq
      - RABBITMQ_HOST=rabbitmq
      - RABBITMQ_USERNAME=oncall
      - RABBITMQ_PASSWORD=${RABBITMQ_PASSWORD}
      - REDIS_URI=redis://redis:6379
      - SECRET_KEY=${SECRET_KEY}
    depends_on:
      - oncall-engine
    networks:
      - oncall-network
    restart: unless-stopped

  db:
    image: postgres:15-alpine
    container_name: oncall-db
    environment:
      - POSTGRES_USER=oncall
      - POSTGRES_PASSWORD=${DB_PASSWORD}
      - POSTGRES_DB=oncall
    volumes:
      - postgres-data:/var/lib/postgresql/data
    networks:
      - oncall-network
    restart: unless-stopped

  redis:
    image: redis:7-alpine
    container_name: oncall-redis
    volumes:
      - redis-data:/data
    networks:
      - oncall-network
    restart: unless-stopped

  rabbitmq:
    image: rabbitmq:3-management-alpine
    container_name: oncall-rabbitmq
    environment:
      - RABBITMQ_DEFAULT_USER=oncall
      - RABBITMQ_DEFAULT_PASS=${RABBITMQ_PASSWORD}
    volumes:
      - rabbitmq-data:/var/lib/rabbitmq
    networks:
      - oncall-network
    restart: unless-stopped

volumes:
  grafana-data:
  postgres-data:
  redis-data:
  rabbitmq-data:

networks:
  oncall-network:
    driver: bridge
.env Template
.env
# Grafana OnCall
BASE_URL=http://localhost:8080
GRAFANA_PASSWORD=secure_grafana_password
DB_PASSWORD=secure_oncall_password
RABBITMQ_PASSWORD=secure_rabbitmq_password

# Generate with: openssl rand -hex 32
SECRET_KEY=your_secret_key_here
Usage Notes
- Grafana at http://localhost:3000
- Enable OnCall plugin in Grafana
- Create schedules and escalation policies
- Integrates with Slack, Telegram, etc.
- PagerDuty/Opsgenie alternative
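Before opening Grafana at http://localhost:3000 it can help to wait until the container actually answers. The sketch below polls a URL with curl; `wait_for_url` is a hypothetical helper, and the attempt count and delay defaults are arbitrary choices. Grafana exposes a health endpoint at `/api/health`, which is the URL assumed in the usage note.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll a URL until it answers successfully or the
# attempt budget runs out.
# Arguments: URL, max attempts (default 30), seconds between tries (default 2).
wait_for_url() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl return non-zero on HTTP error responses such as 503.
    if curl -fsS -o /dev/null "$url"; then
      echo "ready after ${i} attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not ready after ${attempts} attempt(s)" >&2
  return 1
}
```

For example, `wait_for_url http://localhost:3000/api/health && echo "Grafana is up"` after `docker compose up -d`.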
Individual Services (6 services)
Copy individual services to mix and match with your existing compose files.
grafana
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
      - GF_PLUGINS_ALLOW_LOADING_UNSIGNED_PLUGINS=grafana-oncall-app
      - GF_INSTALL_PLUGINS=grafana-oncall-app
    volumes:
      - grafana-data:/var/lib/grafana
    ports:
      - "3000:3000"
    networks:
      - oncall-network
    restart: unless-stopped
oncall-engine
  oncall-engine:
    image: grafana/oncall:latest
    container_name: oncall-engine
    command: uwsgi --ini uwsgi.ini
    environment:
      - DATABASE_TYPE=postgresql
      - DATABASE_HOST=db
      - DATABASE_NAME=oncall
      - DATABASE_USER=oncall
      - DATABASE_PASSWORD=${DB_PASSWORD}
      - BROKER_TYPE=rabbitmq
      - RABBITMQ_HOST=rabbitmq
      - RABBITMQ_USERNAME=oncall
      - RABBITMQ_PASSWORD=${RABBITMQ_PASSWORD}
      - REDIS_URI=redis://redis:6379
      - SECRET_KEY=${SECRET_KEY}
      - BASE_URL=${BASE_URL}
      - GRAFANA_API_URL=http://grafana:3000
    depends_on:
      - db
      - redis
      - rabbitmq
    networks:
      - oncall-network
    restart: unless-stopped
oncall-celery
  oncall-celery:
    image: grafana/oncall:latest
    container_name: oncall-celery
    command: celery -A engine worker -l info -B --scheduler=django_celery_beat.schedulers:DatabaseScheduler
    environment:
      - DATABASE_TYPE=postgresql
      - DATABASE_HOST=db
      - DATABASE_NAME=oncall
      - DATABASE_USER=oncall
      - DATABASE_PASSWORD=${DB_PASSWORD}
      - BROKER_TYPE=rabbitmq
      - RABBITMQ_HOST=rabbitmq
      - RABBITMQ_USERNAME=oncall
      - RABBITMQ_PASSWORD=${RABBITMQ_PASSWORD}
      - REDIS_URI=redis://redis:6379
      - SECRET_KEY=${SECRET_KEY}
    depends_on:
      - oncall-engine
    networks:
      - oncall-network
    restart: unless-stopped
db
  db:
    image: postgres:15-alpine
    container_name: oncall-db
    environment:
      - POSTGRES_USER=oncall
      - POSTGRES_PASSWORD=${DB_PASSWORD}
      - POSTGRES_DB=oncall
    volumes:
      - postgres-data:/var/lib/postgresql/data
    networks:
      - oncall-network
    restart: unless-stopped
redis
  redis:
    image: redis:7-alpine
    container_name: oncall-redis
    volumes:
      - redis-data:/data
    networks:
      - oncall-network
    restart: unless-stopped
rabbitmq
  rabbitmq:
    image: rabbitmq:3-management-alpine
    container_name: oncall-rabbitmq
    environment:
      - RABBITMQ_DEFAULT_USER=oncall
      - RABBITMQ_DEFAULT_PASS=${RABBITMQ_PASSWORD}
    volumes:
      - rabbitmq-data:/var/lib/rabbitmq
    networks:
      - oncall-network
    restart: unless-stopped
Quick Start
terminal
# 1. Create the compose file
cat > docker-compose.yml << 'EOF'
services:
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
      - GF_PLUGINS_ALLOW_LOADING_UNSIGNED_PLUGINS=grafana-oncall-app
      - GF_INSTALL_PLUGINS=grafana-oncall-app
    volumes:
      - grafana-data:/var/lib/grafana
    ports:
      - "3000:3000"
    networks:
      - oncall-network
    restart: unless-stopped

  oncall-engine:
    image: grafana/oncall:latest
    container_name: oncall-engine
    command: uwsgi --ini uwsgi.ini
    environment:
      - DATABASE_TYPE=postgresql
      - DATABASE_HOST=db
      - DATABASE_NAME=oncall
      - DATABASE_USER=oncall
      - DATABASE_PASSWORD=${DB_PASSWORD}
      - BROKER_TYPE=rabbitmq
      - RABBITMQ_HOST=rabbitmq
      - RABBITMQ_USERNAME=oncall
      - RABBITMQ_PASSWORD=${RABBITMQ_PASSWORD}
      - REDIS_URI=redis://redis:6379
      - SECRET_KEY=${SECRET_KEY}
      - BASE_URL=${BASE_URL}
      - GRAFANA_API_URL=http://grafana:3000
    depends_on:
      - db
      - redis
      - rabbitmq
    networks:
      - oncall-network
    restart: unless-stopped

  oncall-celery:
    image: grafana/oncall:latest
    container_name: oncall-celery
    command: celery -A engine worker -l info -B --scheduler=django_celery_beat.schedulers:DatabaseScheduler
    environment:
      - DATABASE_TYPE=postgresql
      - DATABASE_HOST=db
      - DATABASE_NAME=oncall
      - DATABASE_USER=oncall
      - DATABASE_PASSWORD=${DB_PASSWORD}
      - BROKER_TYPE=rabbitmq
      - RABBITMQ_HOST=rabbitmq
      - RABBITMQ_USERNAME=oncall
      - RABBITMQ_PASSWORD=${RABBITMQ_PASSWORD}
      - REDIS_URI=redis://redis:6379
      - SECRET_KEY=${SECRET_KEY}
    depends_on:
      - oncall-engine
    networks:
      - oncall-network
    restart: unless-stopped

  db:
    image: postgres:15-alpine
    container_name: oncall-db
    environment:
      - POSTGRES_USER=oncall
      - POSTGRES_PASSWORD=${DB_PASSWORD}
      - POSTGRES_DB=oncall
    volumes:
      - postgres-data:/var/lib/postgresql/data
    networks:
      - oncall-network
    restart: unless-stopped

  redis:
    image: redis:7-alpine
    container_name: oncall-redis
    volumes:
      - redis-data:/data
    networks:
      - oncall-network
    restart: unless-stopped

  rabbitmq:
    image: rabbitmq:3-management-alpine
    container_name: oncall-rabbitmq
    environment:
      - RABBITMQ_DEFAULT_USER=oncall
      - RABBITMQ_DEFAULT_PASS=${RABBITMQ_PASSWORD}
    volumes:
      - rabbitmq-data:/var/lib/rabbitmq
    networks:
      - oncall-network
    restart: unless-stopped

volumes:
  grafana-data:
  postgres-data:
  redis-data:
  rabbitmq-data:

networks:
  oncall-network:
    driver: bridge
EOF

# 2. Create the .env file
cat > .env << 'EOF'
# Grafana OnCall
BASE_URL=http://localhost:8080
GRAFANA_PASSWORD=secure_grafana_password
DB_PASSWORD=secure_oncall_password
RABBITMQ_PASSWORD=secure_rabbitmq_password

# Generate with: openssl rand -hex 32
SECRET_KEY=your_secret_key_here
EOF

# 3. Start the services
docker compose up -d

# 4. View logs
docker compose logs -f
One-Liner
Run this command to download and set up the recipe in one step:
terminal
curl -fsSL https://docker.recipes/api/recipes/grafana-oncall/run | bash
Troubleshooting
- OnCall plugin not visible in Grafana: Verify GF_PLUGINS_ALLOW_LOADING_UNSIGNED_PLUGINS is set and restart grafana container
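- oncall-engine startup fails with database errors: The depends_on entries in this file only order container startup and do not wait for PostgreSQL to accept connections, so make sure the db container is fully initialized (restarting oncall-engine afterwards may be enough) and check that DATABASE_PASSWORD matches POSTGRES_PASSWORD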
- Celery workers not processing tasks: Verify rabbitmq container is running and RABBITMQ_PASSWORD matches between oncall-celery and rabbitmq services
- Notifications not being delivered: Check celery worker logs in oncall-celery container and verify external notification service credentials
- Redis connection timeouts: Increase Redis memory limits or check for memory pressure on host system affecting redis container performance
- Grafana API integration errors: Verify GRAFANA_API_URL points to correct internal service name (http://grafana:3000) and admin credentials are correct
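Several of the checks above can be run in one pass. The sketch below probes each backing service with a tool that ships in its image (`pg_isready`, `redis-cli ping`, `rabbitmq-diagnostics ping`) and queries Grafana's `/api/health` endpoint from the host. `probe` and `check_stack` are hypothetical helper names; the service names and the `oncall` database user are taken from this recipe's compose file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command quietly and report ok/FAIL for a label.
probe() {
  local label="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "ok: ${label}"
  else
    echo "FAIL: ${label}"
    return 1
  fi
}

# Hypothetical helper: probe every backing service; returns non-zero if any
# probe fails. -T disables TTY allocation so this also works in scripts.
check_stack() {
  local failed=0
  probe "postgres" docker compose exec -T db pg_isready -U oncall || failed=1
  probe "redis" docker compose exec -T redis redis-cli ping || failed=1
  probe "rabbitmq" docker compose exec -T rabbitmq rabbitmq-diagnostics -q ping || failed=1
  probe "grafana" curl -fsS http://localhost:3000/api/health || failed=1
  return "$failed"
}
```

Run `check_stack` from the directory containing docker-compose.yml. For any service that reports FAIL, `docker compose logs --tail=50 <service>` is a reasonable next step, for example with `oncall-celery` when notifications are not delivered.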
Components
grafana, oncall-engine, oncall-celery, postgresql, redis, rabbitmq
Tags
#oncall #alerting #grafana #incident-management #pagerduty-alternative
Category
Monitoring & Observability