01 The Problem with depends_on
Docker Compose's depends_on only waits for a container to **start**, not for the application inside to be **ready**. Your web app might crash because it tried to connect to PostgreSQL before the database finished initializing. Healthchecks solve this by letting you define what 'ready' actually means.
Without healthchecks, depends_on only controls start order, which is rarely enough in production. A started container is not necessarily a ready service.
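02 Writing Effective Healthchecks
A healthcheck runs a command inside your container at regular intervals. If the command exits with 0, the check passes; if it exits with 1, the check fails. After the configured number of consecutive failures (retries), Docker marks the container unhealthy. Docker tracks this state, and Compose can use it to delay dependent services. Note that standalone Docker and Compose do not restart a container just because it is unhealthy: restart policies react to the container exiting, while orchestrators such as Docker Swarm do replace unhealthy tasks.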
```yaml
services:
  db:
    image: postgres:16-alpine
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s      # Check every 10 seconds
      timeout: 5s        # Wait max 5s for response
      retries: 5         # Fail after 5 consecutive failures
      start_period: 30s  # Grace period for slow startups
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
```

start_period gives slow-starting services time to initialize before healthchecks count as failures.
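To make the interplay of interval, retries, and start_period concrete, here is a small Python simulation of how the health status could evolve over a series of checks. It is a simplified model, not Docker's actual implementation: HealthConfig and simulate_health are names invented for this sketch, checks are assumed to finish instantly (so timeout is ignored), and the first check is assumed to run one interval after the container starts.

```python
from dataclasses import dataclass


@dataclass
class HealthConfig:
    interval: float = 10.0      # seconds between checks
    retries: int = 5            # consecutive failures before "unhealthy"
    start_period: float = 30.0  # grace period after container start


def simulate_health(results, cfg):
    """Return the status after each check for a list of pass/fail results.

    Simplified model: check N (1-based) runs at N * interval seconds and
    finishes instantly, so timeouts are ignored.
    """
    status = "starting"
    failures = 0
    in_grace = cfg.start_period > 0
    history = []
    for n, passed in enumerate(results, start=1):
        elapsed = n * cfg.interval
        if elapsed > cfg.start_period:
            in_grace = False
        if passed:
            status = "healthy"
            failures = 0
            in_grace = False  # the first success ends the grace period
        elif not in_grace:
            failures += 1
            if failures >= cfg.retries:
                status = "unhealthy"
        # Failures inside the grace period are ignored entirely
        history.append(status)
    return history


cfg = HealthConfig(interval=10, retries=3, start_period=30)
print(simulate_health([False] * 6, cfg))
```

With these assumed settings, the first three failures fall inside the grace period and are ignored, so the container only becomes unhealthy on the sixth check. A single passing check at any point resets the failure count.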
03 Using condition: service_healthy
The magic happens when you combine healthchecks with depends_on conditions. Instead of just starting containers in order, Compose waits until the dependency is actually healthy before starting the dependent service.
```yaml
services:
  app:
    image: myapp:latest
    depends_on:
      db:
        condition: service_healthy  # Wait for healthy
      redis:
        condition: service_started  # Just wait for start
      migrations:
        condition: service_completed_successfully  # Wait for exit 0

  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    healthcheck:
      # $$ stops Compose from interpolating the variable on the host, so the
      # shell inside the container expands it (falling back to "postgres")
      test: ["CMD-SHELL", "pg_isready -U $${POSTGRES_USER:-postgres}"]
      interval: 5s
      timeout: 5s
      retries: 5

  redis:
    image: redis:alpine
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 3

  migrations:
    image: myapp:latest
    command: python manage.py migrate
    depends_on:
      db:
        condition: service_healthy
```

04 Healthcheck Patterns for Common Services
Different services need different healthcheck strategies. Here are battle-tested patterns for the most common self-hosted services.
```yaml
# PostgreSQL
healthcheck:
  test: ["CMD-SHELL", "pg_isready -U postgres"]
  interval: 10s
  timeout: 5s
  retries: 5

# MariaDB (healthcheck.sh ships with the official mariadb image;
# for MySQL images a common alternative is `mysqladmin ping`)
healthcheck:
  test: ["CMD", "healthcheck.sh", "--connect", "--innodb_initialized"]
  interval: 10s
  timeout: 5s
  retries: 5

# Redis
healthcheck:
  test: ["CMD", "redis-cli", "ping"]
  interval: 5s
  timeout: 3s
  retries: 3

# MongoDB
healthcheck:
  test: ["CMD", "mongosh", "--eval", "db.adminCommand('ping')"]
  interval: 10s
  timeout: 5s
  retries: 5

# HTTP services (generic; curl must be installed in the image)
healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
  interval: 30s
  timeout: 10s
  retries: 3

# Nginx (/nginx-health is a location you define in your own config)
healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost/nginx-health"]
  interval: 30s
  timeout: 5s
  retries: 3
```

For HTTP healthchecks, create a lightweight /health endpoint that checks database connectivity and returns 200 OK.
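As an illustration of such a /health endpoint, the following sketch uses only Python's standard library. The names make_health_handler, serve_health, and check_db are hypothetical; check_db stands in for whatever cheap query your application would run (for example a SELECT 1) and is expected to raise when the database is unreachable. In a real Django or Flask app you would more likely add an equivalent route to the existing application.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def make_health_handler(check_db):
    """Build a request handler whose /health route calls check_db()."""

    class HealthHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/health":
                self.send_error(404)
                return
            try:
                check_db()  # should raise if the database is unreachable
                status, body = 200, {"status": "ok"}
            except Exception as exc:
                status, body = 503, {"status": "unavailable", "error": str(exc)}
            payload = json.dumps(body).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

        def log_message(self, format, *args):
            pass  # keep frequent healthcheck requests out of the logs

    return HealthHandler


def serve_health(check_db, port=8080):
    """Start the health server in a background thread and return it."""
    server = ThreadingHTTPServer(("127.0.0.1", port), make_health_handler(check_db))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Returning 503 when the dependency is down matters because curl -f exits non-zero on HTTP error statuses, which is what turns a failed database check into a failed healthcheck.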
05 Complex Startup Sequences
Real applications often have multi-stage startup: database first, then migrations, then cache warmup, finally the app. Model this as a dependency chain with appropriate conditions.
```yaml
services:
  # Stage 1: Database
  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    healthcheck:
      # Pass -U explicitly so the check does not probe as the OS user
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 5s
      retries: 5

  # Stage 2: Run migrations (one-time task)
  migrations:
    image: myapp:latest
    command: ["./manage.py", "migrate", "--noinput"]
    depends_on:
      postgres:
        condition: service_healthy
    restart: "no"  # Don't restart after success

  # Stage 3: Seed data (optional, one-time)
  seed:
    image: myapp:latest
    command: ["./manage.py", "loaddata", "initial_data.json"]
    depends_on:
      migrations:
        condition: service_completed_successfully
    restart: "no"

  # Stage 4: Application
  app:
    image: myapp:latest
    command: ["gunicorn", "app:application"]
    depends_on:
      postgres:
        condition: service_healthy
      migrations:
        condition: service_completed_successfully
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
```

service_completed_successfully only works for containers that exit. Don't use it for long-running services.
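One way to reason about a chain like this is as a dependency graph. The sketch below uses Python's graphlib (3.9+) to compute one valid start order for the four services above; DEPENDS_ON and startup_order are names made up for this illustration, and the code only models ordering. Compose additionally waits for each condition (healthy, completed successfully) before moving on.

```python
from graphlib import TopologicalSorter

# Each service maps to the services listed under its depends_on key.
DEPENDS_ON = {
    "postgres": [],
    "migrations": ["postgres"],
    "seed": ["migrations"],
    "app": ["postgres", "migrations"],
}


def startup_order(depends_on):
    """Return one valid start order; raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(depends_on).static_order())


print(startup_order(DEPENDS_ON))
```

Because seed and app both depend only on earlier stages, either may come first in the result. A circular dependency raises an error here, which is a quick way to see why such a chain could never start.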
06 Debugging Healthcheck Issues
When healthchecks fail mysteriously, use these commands to diagnose the problem.
```shell
# Check current health status
docker inspect --format='{{.State.Health.Status}}' container_name

# View healthcheck logs (last 5 checks)
docker inspect --format='{{json .State.Health}}' container_name | jq .

# Run the healthcheck manually
docker exec container_name pg_isready -U postgres

# Watch health status in real-time
watch -n 2 "docker ps --format 'table {{.Names}}\t{{.Status}}'"

# Check why a container is unhealthy
docker inspect container_name | jq '.[0].State.Health.Log'
```

If a healthcheck works manually but fails in Docker, check for missing tools in the container or PATH issues.
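A common case of a missing tool is curl in slim images. If the image already contains Python, a small standard-library probe can take its place. This is a sketch under that assumption; check_http and the file name healthcheck.py used below are hypothetical.

```python
import urllib.request


def check_http(url, timeout=5.0):
    """Return 0 if url answers with a 2xx status, otherwise 1."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 0 if 200 <= resp.status < 300 else 1
    except Exception:
        # Connection refused, timeout, HTTP 4xx/5xx, or a malformed URL:
        # for a healthcheck, every one of these simply means "not healthy"
        return 1
```

Saved as healthcheck.py in the container's working directory, it could be wired in with a test such as ["CMD", "python", "-c", "import sys, healthcheck; sys.exit(healthcheck.check_http('http://localhost:8000/health'))"], so the process exit code carries the result back to Docker.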
07 Healthcheck Best Practices
Follow these guidelines to write reliable healthchecks that improve your stack's resilience:
**1. Keep healthchecks fast** - They run frequently; slow checks waste resources
**2. Check what matters** - A database healthcheck should verify it can accept connections, not just that the process is running
**3. Use start_period** - Give slow services (Elasticsearch, Java apps) time to initialize
**4. Don't healthcheck everything** - Simple, stateless containers often don't need them
**5. Match intervals to service criticality** - Critical services: 5-10s, less critical: 30-60s
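To see how interval, timeout, and retries translate into reaction time, the following sketch estimates how long a failure can go unnoticed. worst_case_detection_seconds is a name invented here, and the formula is a rough upper bound under a simplified model: every failing check is assumed to run for the full timeout, the next check is assumed to start one interval after the previous one finishes, and start_period is ignored.

```python
def worst_case_detection_seconds(interval, timeout, retries):
    """Rough upper bound on seconds between a failure and "unhealthy".

    Assumes every failing check runs for the full timeout and the next
    check starts one interval after the previous one finishes.
    """
    return retries * (interval + timeout)


# Settings taken from the Redis and generic HTTP patterns above
critical = worst_case_detection_seconds(interval=5, timeout=3, retries=3)
relaxed = worst_case_detection_seconds(interval=30, timeout=10, retries=3)
print(critical, relaxed)  # 24 120
```

Under these assumptions, the tighter settings flag a dead service in well under half a minute, while the relaxed ones can take about two minutes. That trade-off against check overhead is what the interval guideline is balancing.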