Apache Airflow with Celery
Apache Airflow workflow orchestration with Celery executor and Redis broker.
Overview
Apache Airflow is a platform for programmatically authoring, scheduling, and monitoring workflows using Python DAG definitions. Originally developed at Airbnb and later open-sourced, Airflow has become the industry standard for orchestrating complex data pipelines and ETL processes. The Celery executor enables distributed task execution across multiple worker nodes, making it suitable for high-throughput production environments where tasks need to be processed in parallel across different machines or containers.
This stack combines Airflow's web server, scheduler, and Celery workers with Redis as the message broker and PostgreSQL as the metadata database. Redis queues tasks between the scheduler and workers, while PostgreSQL serves as both the Celery result backend and the metadata store for DAG definitions, task history, and scheduling state (task logs are written to the mounted ./logs directory). The Celery executor allows tasks to be distributed across multiple worker processes, enabling horizontal scaling and fault tolerance that the default SequentialExecutor cannot provide.
Data engineers and DevOps teams running complex ETL pipelines, machine learning workflows, or batch processing jobs benefit from this configuration. The distributed architecture supports organizations processing hundreds of daily workflows with varying resource requirements. Companies migrating from cron jobs or legacy schedulers find this setup provides better visibility, dependency management, and failure recovery compared to traditional approaches.
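To make the Python-first authoring model concrete, the sketch below shows a minimal TaskFlow-style DAG; the file name and task logic are illustrative and not part of this recipe. Saved under ./dags, it is parsed by the scheduler and its tasks are queued through Redis to the Celery workers.
example_etl.py
from datetime import datetime

from airflow.decorators import dag, task


# A two-step pipeline: extract rows, then load them
@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False, tags=["example"])
def example_etl():
    @task
    def extract():
        # Stand-in for pulling rows from a source system
        return [1, 2, 3]

    @task
    def load(rows):
        print(f"loaded {len(rows)} rows")

    load(extract())


example_etl()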
Key Features
- Python DAG definition with rich scheduling expressions and cron syntax
- Distributed task execution using Celery workers across multiple containers
- Web-based UI for monitoring workflow status, logs, and task dependencies
- Redis message broker for fast task queuing between the scheduler and Celery workers
- PostgreSQL metadata database with ACID compliance for workflow state
- Flower monitoring interface for real-time Celery worker and queue visibility
- Task retry logic with exponential backoff and SLA monitoring
- Dynamic pipeline generation and templating with Jinja2 support (retry settings and templating are sketched in the example after this list)
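As a rough illustration of the last two features above, the classic-style DAG below (file name and command are made up for this example) configures retries, exponential backoff, and an SLA through default_args, and uses a Jinja2 template variable in a BashOperator.
retry_and_templating.py
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

default_args = {
    "retries": 3,                         # re-run a failed task up to 3 times
    "retry_delay": timedelta(minutes=1),  # base delay before the first retry
    "retry_exponential_backoff": True,    # delay roughly doubles on each attempt
    "sla": timedelta(hours=1),            # record an SLA miss if the task runs past one hour
}

with DAG(
    dag_id="retry_and_templating",
    start_date=datetime(2024, 1, 1),
    schedule="0 2 * * *",                 # cron expression: 02:00 every day
    catchup=False,
    default_args=default_args,
):
    # {{ ds }} is rendered by Jinja2 to the run's logical date at execution time
    BashOperator(
        task_id="export_partition",
        bash_command="echo exporting partition {{ ds }}",
    )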
Common Use Cases
- Daily ETL pipelines processing data warehouse loads from multiple sources
- Machine learning model training workflows with GPU resource scheduling
- Data quality monitoring pipelines with automated alerting and remediation
- Multi-cloud data synchronization between AWS S3, Google Cloud, and Azure
- Financial reporting automation with regulatory compliance requirements
- IoT sensor data processing with real-time aggregation and anomaly detection
- Marketing campaign automation with customer segmentation and email triggers
Prerequisites
- 8GB+ RAM recommended for production workloads (4GB minimum for development)
- Docker and Docker Compose with container orchestration knowledge
- Python programming experience for writing DAGs and custom operators
- Understanding of Celery distributed task queues and worker management
- PostgreSQL administration skills for database maintenance and backups
- Network access to ports 8080 (Airflow UI) and 5555 (Flower monitoring)
For development & testing. Review security settings, change default credentials, and test thoroughly before production use.
docker-compose.yml
services:
  postgres:
    image: postgres:16-alpine
    container_name: airflow-db
    environment:
      - POSTGRES_USER=airflow
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      - POSTGRES_DB=airflow
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U airflow"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - airflow-network

  redis:
    image: redis:7-alpine
    container_name: airflow-redis
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - airflow-network

  airflow-init:
    image: apache/airflow:2.8.0
    container_name: airflow-init
    entrypoint: /bin/bash
    command:
      - -c
      - |
        airflow db init
        airflow users create --username admin --password ${AIRFLOW_ADMIN_PASSWORD} --firstname Admin --lastname User --role Admin --email admin@example.com
    environment:
      - AIRFLOW__CORE__EXECUTOR=CeleryExecutor
      - AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:${POSTGRES_PASSWORD}@postgres/airflow
      - AIRFLOW__CELERY__BROKER_URL=redis://redis:6379/0
      - AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://airflow:${POSTGRES_PASSWORD}@postgres/airflow
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
    networks:
      - airflow-network

  airflow-webserver:
    image: apache/airflow:2.8.0
    container_name: airflow-webserver
    command: webserver
    environment:
      - AIRFLOW__CORE__EXECUTOR=CeleryExecutor
      - AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:${POSTGRES_PASSWORD}@postgres/airflow
      - AIRFLOW__CELERY__BROKER_URL=redis://redis:6379/0
      - AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://airflow:${POSTGRES_PASSWORD}@postgres/airflow
      - AIRFLOW__WEBSERVER__SECRET_KEY=${AIRFLOW_SECRET_KEY}
    ports:
      - "8080:8080"
    volumes:
      - ./dags:/opt/airflow/dags
      - ./logs:/opt/airflow/logs
      - ./plugins:/opt/airflow/plugins
    depends_on:
      airflow-init:
        condition: service_completed_successfully
    networks:
      - airflow-network

  airflow-scheduler:
    image: apache/airflow:2.8.0
    container_name: airflow-scheduler
    command: scheduler
    environment:
      - AIRFLOW__CORE__EXECUTOR=CeleryExecutor
      - AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:${POSTGRES_PASSWORD}@postgres/airflow
      - AIRFLOW__CELERY__BROKER_URL=redis://redis:6379/0
      - AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://airflow:${POSTGRES_PASSWORD}@postgres/airflow
    volumes:
      - ./dags:/opt/airflow/dags
      - ./logs:/opt/airflow/logs
      - ./plugins:/opt/airflow/plugins
    depends_on:
      airflow-init:
        condition: service_completed_successfully
    networks:
      - airflow-network

  airflow-worker:
    image: apache/airflow:2.8.0
    container_name: airflow-worker
    command: celery worker
    environment:
      - AIRFLOW__CORE__EXECUTOR=CeleryExecutor
      - AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:${POSTGRES_PASSWORD}@postgres/airflow
      - AIRFLOW__CELERY__BROKER_URL=redis://redis:6379/0
      - AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://airflow:${POSTGRES_PASSWORD}@postgres/airflow
    volumes:
      - ./dags:/opt/airflow/dags
      - ./logs:/opt/airflow/logs
      - ./plugins:/opt/airflow/plugins
    depends_on:
      airflow-init:
        condition: service_completed_successfully
    networks:
      - airflow-network

  flower:
    image: apache/airflow:2.8.0
    container_name: airflow-flower
    command: celery flower
    environment:
      - AIRFLOW__CORE__EXECUTOR=CeleryExecutor
      - AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:${POSTGRES_PASSWORD}@postgres/airflow
      - AIRFLOW__CELERY__BROKER_URL=redis://redis:6379/0
    ports:
      - "5555:5555"
    depends_on:
      airflow-init:
        condition: service_completed_successfully
    networks:
      - airflow-network

volumes:
  postgres_data:

networks:
  airflow-network:
    driver: bridge
.env Template
.env
# Apache Airflow
POSTGRES_PASSWORD=airflow_db_password
AIRFLOW_ADMIN_PASSWORD=admin123
AIRFLOW_SECRET_KEY=your-secret-key-here
Usage Notes
- Airflow UI at http://localhost:8080
- Flower (Celery monitor) at http://localhost:5555
- Place DAGs in the ./dags directory
- Scale workers with docker compose up -d --scale airflow-worker=3 (remove the container_name line from the airflow-worker service first, since a fixed name blocks multiple replicas); per-queue routing is sketched after these notes
- Create the dags, logs, and plugins directories before the first start (mkdir -p dags logs plugins)
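When some workloads need dedicated hardware, tasks can be routed to named Celery queues. The sketch below is a hypothetical example (the queue name, DAG, and commands are not part of this recipe); a matching worker would be started with the queues flag, for example by adding a second worker service whose command is airflow celery worker -q heavy.
queue_routing.py
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="queue_routing",
    start_date=datetime(2024, 1, 1),
    schedule=None,  # trigger manually from the UI
    catchup=False,
):
    # Picked up by any worker listening on the default queue
    BashOperator(task_id="light_task", bash_command="echo quick job")

    # Only consumed by workers started with: airflow celery worker -q heavy
    BashOperator(
        task_id="heavy_task",
        bash_command="echo long-running job",
        queue="heavy",
    )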
Individual Services (7 services)
Copy individual services to mix and match with your existing compose files.
postgres
postgres:
image: postgres:16-alpine
container_name: airflow-db
environment:
- POSTGRES_USER=airflow
- POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
- POSTGRES_DB=airflow
volumes:
- postgres_data:/var/lib/postgresql/data
healthcheck:
test:
- CMD-SHELL
- pg_isready -U airflow
interval: 10s
timeout: 5s
retries: 5
networks:
- airflow-network
redis
redis:
image: redis:7-alpine
container_name: airflow-redis
healthcheck:
test:
- CMD
- redis-cli
- ping
interval: 10s
timeout: 5s
retries: 5
networks:
- airflow-network
airflow-init
airflow-init:
image: apache/airflow:2.8.0
container_name: airflow-init
entrypoint: /bin/bash
command:
- "-c"
- |
airflow db init
airflow users create --username admin --password ${AIRFLOW_ADMIN_PASSWORD} --firstname Admin --lastname User --role Admin --email admin@example.com
environment:
- AIRFLOW__CORE__EXECUTOR=CeleryExecutor
- AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:${POSTGRES_PASSWORD}@postgres/airflow
- AIRFLOW__CELERY__BROKER_URL=redis://redis:6379/0
- AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://airflow:${POSTGRES_PASSWORD}@postgres/airflow
depends_on:
postgres:
condition: service_healthy
redis:
condition: service_healthy
networks:
- airflow-network
airflow-webserver
airflow-webserver:
image: apache/airflow:2.8.0
container_name: airflow-webserver
command: webserver
environment:
- AIRFLOW__CORE__EXECUTOR=CeleryExecutor
- AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:${POSTGRES_PASSWORD}@postgres/airflow
- AIRFLOW__CELERY__BROKER_URL=redis://redis:6379/0
- AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://airflow:${POSTGRES_PASSWORD}@postgres/airflow
- AIRFLOW__WEBSERVER__SECRET_KEY=${AIRFLOW_SECRET_KEY}
ports:
- "8080:8080"
volumes:
- ./dags:/opt/airflow/dags
- ./logs:/opt/airflow/logs
- ./plugins:/opt/airflow/plugins
depends_on:
airflow-init:
condition: service_completed_successfully
networks:
- airflow-network
airflow-scheduler
airflow-scheduler:
image: apache/airflow:2.8.0
container_name: airflow-scheduler
command: scheduler
environment:
- AIRFLOW__CORE__EXECUTOR=CeleryExecutor
- AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:${POSTGRES_PASSWORD}@postgres/airflow
- AIRFLOW__CELERY__BROKER_URL=redis://redis:6379/0
- AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://airflow:${POSTGRES_PASSWORD}@postgres/airflow
volumes:
- ./dags:/opt/airflow/dags
- ./logs:/opt/airflow/logs
- ./plugins:/opt/airflow/plugins
depends_on:
airflow-init:
condition: service_completed_successfully
networks:
- airflow-network
airflow-worker
airflow-worker:
image: apache/airflow:2.8.0
container_name: airflow-worker
command: celery worker
environment:
- AIRFLOW__CORE__EXECUTOR=CeleryExecutor
- AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:${POSTGRES_PASSWORD}@postgres/airflow
- AIRFLOW__CELERY__BROKER_URL=redis://redis:6379/0
- AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql://airflow:${POSTGRES_PASSWORD}@postgres/airflow
volumes:
- ./dags:/opt/airflow/dags
- ./logs:/opt/airflow/logs
- ./plugins:/opt/airflow/plugins
depends_on:
airflow-init:
condition: service_completed_successfully
networks:
- airflow-network
flower
flower:
image: apache/airflow:2.8.0
container_name: airflow-flower
command: celery flower
environment:
- AIRFLOW__CORE__EXECUTOR=CeleryExecutor
- AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:${POSTGRES_PASSWORD}@postgres/airflow
- AIRFLOW__CELERY__BROKER_URL=redis://redis:6379/0
ports:
- "5555:5555"
depends_on:
airflow-init:
condition: service_completed_successfully
networks:
- airflow-network
Quick Start
terminal
# 1. Save the docker-compose.yml and .env shown above into an empty project directory
#    (or fetch them with the one-liner below)

# 2. Create the directories that the containers bind-mount
mkdir -p dags logs plugins

# 3. Start the services
docker compose up -d

# 4. View logs
docker compose logs -f
One-Liner
Run this command to download and set up the recipe in one step:
terminal
curl -fsSL https://docker.recipes/api/recipes/airflow-celery-redis/run | bash
Troubleshooting
- ImportError when loading DAGs: Ensure custom Python packages are installed in all Airflow containers using requirements.txt or a custom Docker image (a DagBag check for pinpointing the failing files is sketched after this list)
- Tasks stuck in queued state: Check Redis connectivity and verify Celery workers are running with docker compose logs airflow-worker
- Scheduler not picking up new DAGs: Restart airflow-scheduler container and verify DAG files have correct Python syntax and are in /opt/airflow/dags
- Database connection errors: Verify PostgreSQL container is healthy and POSTGRES_PASSWORD environment variable matches across all services
- Worker memory issues: Increase Docker container memory limits or reduce AIRFLOW__CELERY__WORKER_CONCURRENCY to prevent OOM kills
- Web UI 403 Forbidden errors: Check that AIRFLOW__WEBSERVER__SECRET_KEY is set and consistent across restarts (and shared with the worker containers if fetching task logs fails); regenerate it if corrupted
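For the import and DAG-parsing issues above, a quick way to see exactly which files fail to load is to parse the DAG folder with DagBag inside a running container. The script below is a small sketch (the file name is illustrative); copy it into the mounted ./dags directory and run it with docker compose exec airflow-scheduler python /opt/airflow/dags/check_dag_imports.py.
check_dag_imports.py
from airflow.models import DagBag

# Parse everything under the mounted DAG folder, skipping Airflow's bundled examples
bag = DagBag(dag_folder="/opt/airflow/dags", include_examples=False)

if bag.import_errors:
    # Maps each offending file path to its import traceback
    for path, error in bag.import_errors.items():
        print(f"{path}\n{error}\n")
else:
    print(f"{len(bag.dags)} DAGs parsed with no import errors")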
Components
airflow-webserver, airflow-scheduler, airflow-worker, postgres, redis
Tags
#airflow #workflow #orchestration #celery #etl #scheduling
Category
DevOps & CI/CD