
Modern Data Stack (ELT)

advanced

Complete ELT pipeline with Airbyte, dbt, Postgres data warehouse, and Metabase BI

Overview

Airbyte is an open-source data integration platform, launched in 2020, that set out to democratize ELT (Extract, Load, Transform) pipelines by providing 300+ pre-built connectors for databases, APIs, and SaaS applications. Unlike traditional ETL tools that require expensive licensing and complex configuration, Airbyte lets organizations sync data from sources such as Stripe, Salesforce, PostgreSQL, and MongoDB into data warehouses with minimal technical overhead.

This modern data stack combines Airbyte's connector ecosystem with dbt's SQL-based transformation framework, PostgreSQL as a cost-effective data warehouse, and Metabase for self-service analytics. Together these tools form a complete ELT pipeline: Airbyte extracts raw data and loads it into PostgreSQL, dbt transforms it with SQL models, and Metabase visualizes the results in dashboards. The stack appeals to data teams at startups and mid-sized companies that need enterprise-grade analytics without the complexity and cost of platforms like Snowflake, Fivetran, and Tableau, and it delivers production-ready pipelines that can handle millions of records while retaining the flexibility to customize transformations and build interactive dashboards for business stakeholders.

Key Features

  • Airbyte's 300+ pre-built connectors including Stripe, Salesforce, PostgreSQL, MySQL, MongoDB, and REST APIs
  • dbt's SQL-based transformation layer with version control, testing, and documentation capabilities
  • PostgreSQL 16 data warehouse with JSONB support for semi-structured data from APIs (a query sketch follows this list)
  • Metabase's drag-and-drop dashboard builder with 40+ visualization types and SQL query interface
  • Incremental data synchronization with configurable sync frequencies from hourly to daily
  • dbt's data lineage tracking and model dependencies for complex transformation workflows
  • Airbyte's normalization feature that automatically converts nested JSON to relational tables
  • Metabase's self-service analytics allowing business users to create reports without SQL knowledge
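
The JSONB support matters because Airbyte's Postgres destination typically lands each record as a JSON payload in a raw table (in older layouts, a column commonly named _airbyte_data in tables like _airbyte_raw_<stream>; exact names vary by connector and Airbyte version). A quick, hypothetical query against such a table, assuming the default warehouse credentials from the .env template below and an imagined customers stream:

terminal
docker exec -i data-warehouse psql -U warehouse -d warehouse <<'SQL'
-- Hypothetical raw table and JSON keys; inspect your actual schema with \dt and \d first
SELECT _airbyte_data->>'email'                      AS email,
       (_airbyte_data->>'created_at')::timestamptz  AS created_at
FROM _airbyte_raw_customers
LIMIT 10;
SQL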

Common Use Cases

  • E-commerce companies consolidating data from Shopify, payment processors, and marketing platforms into unified customer analytics
  • SaaS startups building customer health dashboards by combining CRM data, product usage metrics, and billing information
  • Marketing teams creating attribution reports by syncing data from Google Ads, Facebook, email platforms, and web analytics
  • Financial teams automating monthly reporting by extracting data from accounting software, banks, and payment processors
  • Product teams analyzing user behavior by combining application databases with third-party analytics tools
  • Operations teams monitoring business KPIs by consolidating data from support tickets, inventory systems, and sales platforms
  • Data teams prototyping analytics solutions before investing in enterprise platforms like Snowflake and Looker

Prerequisites

  • Minimum 4GB RAM (8GB recommended) to handle Airbyte worker processes and PostgreSQL with concurrent transformations
  • Docker Engine 20.10+ with Docker Compose V2 for proper container orchestration and networking
  • Basic SQL knowledge for creating dbt models and debugging data transformation issues
  • Understanding of your source systems' API rate limits and authentication methods for Airbyte connectors
  • Available ports 3000 (Metabase), 5432 (PostgreSQL warehouse), and 8000 (Airbyte web UI); a quick availability check is sketched after this list
  • At least 20GB free disk space for data warehouse storage and Airbyte's temporary processing files
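
One way to confirm those ports are free before starting the stack, assuming a Linux host with ss and GNU awk/grep (on macOS, lsof -iTCP -sTCP:LISTEN serves the same purpose):

terminal
# Lists any listener already bound to 3000, 5432, or 8000
ss -ltn | awk 'NR>1 {print $4}' | grep -E ':(3000|5432|8000)$' \
  || echo "ports 3000/5432/8000 look free"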

This recipe is intended for development and testing. Review security settings, change the default credentials, and test thoroughly before any production use.

docker-compose.yml

docker-compose.yml
services:
  # Airbyte - Data Integration
  airbyte-server:
    image: airbyte/server:latest
    container_name: airbyte-server
    restart: unless-stopped
    ports:
      - "${AIRBYTE_PORT:-8000}:8000"
    environment:
      - DATABASE_URL=jdbc:postgresql://airbyte-db:5432/airbyte
      - DATABASE_USER=airbyte
      - DATABASE_PASSWORD=${AIRBYTE_DB_PASSWORD}
      - TRACKING_STRATEGY=segment
    depends_on:
      - airbyte-db

  airbyte-worker:
    image: airbyte/worker:latest
    container_name: airbyte-worker
    restart: unless-stopped
    environment:
      - DATABASE_URL=jdbc:postgresql://airbyte-db:5432/airbyte
      - DATABASE_USER=airbyte
      - DATABASE_PASSWORD=${AIRBYTE_DB_PASSWORD}
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    depends_on:
      - airbyte-db

  airbyte-db:
    image: postgres:13-alpine
    container_name: airbyte-db
    restart: unless-stopped
    environment:
      - POSTGRES_USER=airbyte
      - POSTGRES_PASSWORD=${AIRBYTE_DB_PASSWORD}
      - POSTGRES_DB=airbyte
    volumes:
      - airbyte_db_data:/var/lib/postgresql/data

  # Data Warehouse
  warehouse:
    image: postgres:16-alpine
    container_name: data-warehouse
    restart: unless-stopped
    ports:
      - "${WAREHOUSE_PORT:-5432}:5432"
    environment:
      - POSTGRES_USER=${WAREHOUSE_USER}
      - POSTGRES_PASSWORD=${WAREHOUSE_PASSWORD}
      - POSTGRES_DB=warehouse
    volumes:
      - warehouse_data:/var/lib/postgresql/data

  # dbt for transformations
  dbt:
    image: ghcr.io/dbt-labs/dbt-postgres:latest
    container_name: dbt
    volumes:
      - ./dbt:/usr/app
    working_dir: /usr/app
    environment:
      - DBT_PROFILES_DIR=/usr/app
    depends_on:
      - warehouse
    profiles:
      - tools

  # Metabase for BI
  metabase:
    image: metabase/metabase:latest
    container_name: metabase
    restart: unless-stopped
    ports:
      - "${METABASE_PORT:-3000}:3000"
    environment:
      - MB_DB_TYPE=postgres
      - MB_DB_DBNAME=metabase
      - MB_DB_PORT=5432
      - MB_DB_USER=metabase
      - MB_DB_PASS=${METABASE_DB_PASSWORD}
      - MB_DB_HOST=metabase-db
    depends_on:
      - metabase-db
      - warehouse

  metabase-db:
    image: postgres:15-alpine
    container_name: metabase-db
    restart: unless-stopped
    environment:
      - POSTGRES_USER=metabase
      - POSTGRES_PASSWORD=${METABASE_DB_PASSWORD}
      - POSTGRES_DB=metabase
    volumes:
      - metabase_db_data:/var/lib/postgresql/data

volumes:
  airbyte_db_data:
  warehouse_data:
  metabase_db_data:
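
One optional hardening step: plain depends_on only controls start order, not readiness, so Metabase can try to reach the warehouse before Postgres is accepting connections (see the "Database connection error" item under Troubleshooting). A sketch of Compose healthchecks to merge into the services above; this is illustrative, not part of the original recipe:

docker-compose.yml (optional additions)
  warehouse:
    # ...existing keys from above, plus:
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${WAREHOUSE_USER} -d warehouse"]
      interval: 10s
      timeout: 5s
      retries: 5

  metabase:
    # ...existing keys from above; the long depends_on form waits for health
    depends_on:
      metabase-db:
        condition: service_started
      warehouse:
        condition: service_healthy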

.env Template

.env
# Modern Data Stack
AIRBYTE_PORT=8000
WAREHOUSE_PORT=5432
METABASE_PORT=3000

# Airbyte
AIRBYTE_DB_PASSWORD=airbyte_password

# Data Warehouse
WAREHOUSE_USER=warehouse
WAREHOUSE_PASSWORD=warehouse_password

# Metabase
METABASE_DB_PASSWORD=metabase_password
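
The passwords above are placeholders; per the warning earlier, replace them before exposing anything. One quick way to generate values, assuming openssl is installed:

terminal
# Print fresh random values to paste into .env
for var in AIRBYTE_DB_PASSWORD WAREHOUSE_PASSWORD METABASE_DB_PASSWORD; do
  echo "$var=$(openssl rand -hex 16)"
done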

Usage Notes

  1. Airbyte at http://localhost:8000 - configure data sources
  2. Metabase at http://localhost:3000 - create dashboards
  3. Data warehouse accessible at localhost:5432
  4. Run dbt: docker compose --profile tools run dbt run
  5. Airbyte syncs data to the warehouse; dbt transforms it
  6. Create dbt models in the ./dbt/models directory (a minimal profiles.yml and starter model are sketched below)
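
The dbt service mounts ./dbt and points DBT_PROFILES_DIR at it, but the recipe does not ship a profiles.yml, so dbt run will fail until one exists. A minimal sketch, assuming the default credentials from the .env template and a profile named warehouse; all file contents below are illustrative, not part of the recipe:

terminal
mkdir -p dbt/models

cat > dbt/profiles.yml << 'EOF'
warehouse:
  target: dev
  outputs:
    dev:
      type: postgres
      host: warehouse              # service name on the compose network
      port: 5432
      user: warehouse              # must match WAREHOUSE_USER in .env
      password: warehouse_password # must match WAREHOUSE_PASSWORD in .env
      dbname: warehouse
      schema: analytics
      threads: 4
EOF

cat > dbt/dbt_project.yml << 'EOF'
name: warehouse_project
version: "1.0.0"
profile: warehouse
EOF

# Hypothetical starter model; replace with models over your Airbyte tables
cat > dbt/models/example_model.sql << 'EOF'
select 1 as id, current_timestamp as loaded_at
EOF

# Verify the connection, then build
docker compose --profile tools run dbt debug
docker compose --profile tools run dbt run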

Individual Services (7 services)

Copy individual services to mix and match with your existing compose files.

airbyte-server
airbyte-server:
  image: airbyte/server:latest
  container_name: airbyte-server
  restart: unless-stopped
  ports:
    - ${AIRBYTE_PORT:-8000}:8000
  environment:
    - DATABASE_URL=jdbc:postgresql://airbyte-db:5432/airbyte
    - DATABASE_USER=airbyte
    - DATABASE_PASSWORD=${AIRBYTE_DB_PASSWORD}
    - TRACKING_STRATEGY=segment
  depends_on:
    - airbyte-db
airbyte-worker
airbyte-worker:
  image: airbyte/worker:latest
  container_name: airbyte-worker
  restart: unless-stopped
  environment:
    - DATABASE_URL=jdbc:postgresql://airbyte-db:5432/airbyte
    - DATABASE_USER=airbyte
    - DATABASE_PASSWORD=${AIRBYTE_DB_PASSWORD}
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock
  depends_on:
    - airbyte-db
airbyte-db
airbyte-db:
  image: postgres:13-alpine
  container_name: airbyte-db
  restart: unless-stopped
  environment:
    - POSTGRES_USER=airbyte
    - POSTGRES_PASSWORD=${AIRBYTE_DB_PASSWORD}
    - POSTGRES_DB=airbyte
  volumes:
    - airbyte_db_data:/var/lib/postgresql/data
warehouse
warehouse:
  image: postgres:16-alpine
  container_name: data-warehouse
  restart: unless-stopped
  ports:
    - ${WAREHOUSE_PORT:-5432}:5432
  environment:
    - POSTGRES_USER=${WAREHOUSE_USER}
    - POSTGRES_PASSWORD=${WAREHOUSE_PASSWORD}
    - POSTGRES_DB=warehouse
  volumes:
    - warehouse_data:/var/lib/postgresql/data
dbt
dbt:
  image: ghcr.io/dbt-labs/dbt-postgres:latest
  container_name: dbt
  volumes:
    - ./dbt:/usr/app
  working_dir: /usr/app
  environment:
    - DBT_PROFILES_DIR=/usr/app
  depends_on:
    - warehouse
  profiles:
    - tools
metabase
metabase:
  image: metabase/metabase:latest
  container_name: metabase
  restart: unless-stopped
  ports:
    - ${METABASE_PORT:-3000}:3000
  environment:
    - MB_DB_TYPE=postgres
    - MB_DB_DBNAME=metabase
    - MB_DB_PORT=5432
    - MB_DB_USER=metabase
    - MB_DB_PASS=${METABASE_DB_PASSWORD}
    - MB_DB_HOST=metabase-db
  depends_on:
    - metabase-db
    - warehouse
metabase-db
metabase-db:
  image: postgres:15-alpine
  container_name: metabase-db
  restart: unless-stopped
  environment:
    - POSTGRES_USER=metabase
    - POSTGRES_PASSWORD=${METABASE_DB_PASSWORD}
    - POSTGRES_DB=metabase
  volumes:
    - metabase_db_data:/var/lib/postgresql/data

Quick Start

terminal
# 1. Create the compose file
cat > docker-compose.yml << 'EOF'
services:
  # Airbyte - Data Integration
  airbyte-server:
    image: airbyte/server:latest
    container_name: airbyte-server
    restart: unless-stopped
    ports:
      - "${AIRBYTE_PORT:-8000}:8000"
    environment:
      - DATABASE_URL=jdbc:postgresql://airbyte-db:5432/airbyte
      - DATABASE_USER=airbyte
      - DATABASE_PASSWORD=${AIRBYTE_DB_PASSWORD}
      - TRACKING_STRATEGY=segment
    depends_on:
      - airbyte-db

  airbyte-worker:
    image: airbyte/worker:latest
    container_name: airbyte-worker
    restart: unless-stopped
    environment:
      - DATABASE_URL=jdbc:postgresql://airbyte-db:5432/airbyte
      - DATABASE_USER=airbyte
      - DATABASE_PASSWORD=${AIRBYTE_DB_PASSWORD}
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    depends_on:
      - airbyte-db

  airbyte-db:
    image: postgres:13-alpine
    container_name: airbyte-db
    restart: unless-stopped
    environment:
      - POSTGRES_USER=airbyte
      - POSTGRES_PASSWORD=${AIRBYTE_DB_PASSWORD}
      - POSTGRES_DB=airbyte
    volumes:
      - airbyte_db_data:/var/lib/postgresql/data

  # Data Warehouse
  warehouse:
    image: postgres:16-alpine
    container_name: data-warehouse
    restart: unless-stopped
    ports:
      - "${WAREHOUSE_PORT:-5432}:5432"
    environment:
      - POSTGRES_USER=${WAREHOUSE_USER}
      - POSTGRES_PASSWORD=${WAREHOUSE_PASSWORD}
      - POSTGRES_DB=warehouse
    volumes:
      - warehouse_data:/var/lib/postgresql/data

  # dbt for transformations
  dbt:
    image: ghcr.io/dbt-labs/dbt-postgres:latest
    container_name: dbt
    volumes:
      - ./dbt:/usr/app
    working_dir: /usr/app
    environment:
      - DBT_PROFILES_DIR=/usr/app
    depends_on:
      - warehouse
    profiles:
      - tools

  # Metabase for BI
  metabase:
    image: metabase/metabase:latest
    container_name: metabase
    restart: unless-stopped
    ports:
      - "${METABASE_PORT:-3000}:3000"
    environment:
      - MB_DB_TYPE=postgres
      - MB_DB_DBNAME=metabase
      - MB_DB_PORT=5432
      - MB_DB_USER=metabase
      - MB_DB_PASS=${METABASE_DB_PASSWORD}
      - MB_DB_HOST=metabase-db
    depends_on:
      - metabase-db
      - warehouse

  metabase-db:
    image: postgres:15-alpine
    container_name: metabase-db
    restart: unless-stopped
    environment:
      - POSTGRES_USER=metabase
      - POSTGRES_PASSWORD=${METABASE_DB_PASSWORD}
      - POSTGRES_DB=metabase
    volumes:
      - metabase_db_data:/var/lib/postgresql/data

volumes:
  airbyte_db_data:
  warehouse_data:
  metabase_db_data:
EOF

# 2. Create the .env file
cat > .env << 'EOF'
# Modern Data Stack
AIRBYTE_PORT=8000
WAREHOUSE_PORT=5432
METABASE_PORT=3000

# Airbyte
AIRBYTE_DB_PASSWORD=airbyte_password

# Data Warehouse
WAREHOUSE_USER=warehouse
WAREHOUSE_PASSWORD=warehouse_password

# Metabase
METABASE_DB_PASSWORD=metabase_password
EOF

# 3. Start the services
docker compose up -d

# 4. View logs
docker compose logs -f
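
After step 4, it is worth confirming everything actually came up; Airbyte's first boot can take a few minutes. A few checks, assuming the default container names and ports from this recipe:

terminal
# All containers should show as running
docker compose ps

# pg_isready ships with the postgres image
docker exec data-warehouse pg_isready -U warehouse -d warehouse

# Should print 200 once the Airbyte UI is serving
curl -fsS -o /dev/null -w "%{http_code}\n" http://localhost:8000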

One-Liner

Run this command to download and set up the recipe in one step:

terminal
curl -fsSL https://docker.recipes/api/recipes/modern-data-stack-elt/run | bash

Troubleshooting

  • Airbyte sync fails with 'Connection timeout': Increase connector timeout settings in Airbyte UI and verify source system API limits
  • dbt run fails with 'relation does not exist': Check Airbyte normalization completed successfully and verify table names in PostgreSQL warehouse schema
  • Metabase shows 'Database connection error': Ensure warehouse container is fully started and credentials in MB_DB_* environment variables match PostgreSQL settings
  • Airbyte worker container exits with code 137: Increase Docker memory allocation to 6GB+ as Airbyte workers are memory-intensive during large syncs
  • dbt models fail with 'permission denied': Grant the dbt user the needed SELECT/CREATE permissions on the relevant schemas and source tables (a GRANT sketch follows this list)
  • High memory usage during syncs: Configure Airbyte connector batch sizes and enable incremental sync modes to reduce memory footprint
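
For the 'permission denied' case, grants have to be issued inside the warehouse. A hypothetical sketch, assuming a separate dbt_user role and Airbyte landing raw tables in a schema named airbyte_raw; neither name comes from this recipe, so adjust to your actual roles and schemas:

terminal
docker exec -i data-warehouse psql -U warehouse -d warehouse <<'SQL'
-- Hypothetical role and schema names
GRANT USAGE ON SCHEMA airbyte_raw TO dbt_user;
GRANT SELECT ON ALL TABLES IN SCHEMA airbyte_raw TO dbt_user;
ALTER DEFAULT PRIVILEGES IN SCHEMA airbyte_raw GRANT SELECT ON TABLES TO dbt_user;
GRANT USAGE, CREATE ON SCHEMA public TO dbt_user;
SQL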


