docker.recipes

Label Studio + PostgreSQL + Redis

intermediate

Data labeling platform with enterprise features for ML training data.

Overview

Label Studio is an open-source data labeling platform developed by Heartex that enables teams to create high-quality training datasets for machine learning projects. Originally launched in 2019, Label Studio has become a leading solution for annotating text, images, audio, video, and time series data, supporting over 20 different ML tasks from text classification to object detection. The platform provides a web-based interface where teams can configure custom labeling templates, manage annotator workflows, and export labeled data in formats compatible with popular ML frameworks like TensorFlow, PyTorch, and scikit-learn.

This stack combines Label Studio with PostgreSQL for data persistence and Redis for caching and session management. PostgreSQL stores all project configurations, user data, annotations, and metadata with ACID compliance, ensuring data integrity even during concurrent labeling sessions. Redis caches frequently accessed data, supports real-time collaboration features, and handles background task queues for data import/export operations. NGINX serves as the reverse proxy, delivering static assets like images and documents and forwarding requests to the Label Studio application server.

Data science teams, ML engineers, and organizations building custom AI models should deploy this stack when they need a scalable annotation platform. Unlike basic labeling tools, this configuration supports features like user authentication, project-based access control, annotation quality metrics, and API integrations. The PostgreSQL backend enables complex queries for annotation analytics and audit trails, while Redis keeps the interface responsive with large datasets and multiple concurrent annotators working on the same projects.

Key Features

  • Multi-format data support with configurable labeling interfaces for text, images, audio, video, HTML, and time series data
  • PostgreSQL-backed project management with user roles, annotation versioning, and detailed audit trails
  • Redis-powered real-time collaboration enabling multiple annotators to work simultaneously without conflicts
  • NGINX-optimized file serving for large media assets with efficient caching and compression
  • Advanced annotation quality controls including inter-annotator agreement scoring and review workflows
  • API-first architecture with RESTful endpoints for automated data import, export, and integration with ML pipelines
  • Flexible labeling templates supporting custom taxonomies, multi-class classification, named entity recognition, and bounding box annotations
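  • LDAP, SSO, and role-based access control for secure multi-team environments; these are features of the commercial Label Studio Enterprise edition, so verify availability before relying on them with the open-source image used in this recipe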

Common Use Cases

  • Computer vision teams annotating large image datasets for object detection, semantic segmentation, and facial recognition models
  • NLP projects requiring text classification, named entity recognition, sentiment analysis, and document categorization
  • Healthcare organizations labeling medical images, patient records, and clinical trial data with strict compliance requirements
  • Autonomous vehicle companies creating training datasets for road sign detection, lane marking, and obstacle identification
  • Content moderation teams at social media platforms labeling text posts, images, and videos for harmful content detection
  • Research institutions conducting longitudinal studies requiring consistent data annotation across multiple researchers
  • E-commerce companies building recommendation systems by labeling product images, descriptions, and customer review sentiment

Prerequisites

  • Minimum 2GB RAM (4GB+ recommended) to support PostgreSQL database operations and Label Studio's web interface
  • Docker Engine 20.10+ and Docker Compose 2.0+ for container orchestration and volume management
  • Available ports 80, 8080, 5432, and 6379 for NGINX, Label Studio, PostgreSQL, and Redis respectively
  • Understanding of data annotation workflows and familiarity with ML training data requirements
  • Basic knowledge of PostgreSQL administration for database maintenance and backup procedures
  • Storage capacity planning based on dataset size - PostgreSQL data, Redis memory, and Label Studio file storage requirements
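
The version and port requirements above can be checked before starting anything. The following is a minimal sketch, not part of the recipe: the function names `version_ge`, `port_free`, and `check_prereqs` are hypothetical, `sort -V` assumes GNU or BusyBox coreutils, and the `/dev/tcp` port probe only works in bash.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; thresholds are taken from the prerequisites list.

# version_ge A B: succeeds when version A >= version B (relies on sort -V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# port_free PORT: succeeds when nothing on localhost accepts connections
# on PORT. Uses bash's /dev/tcp pseudo-device, so run this under bash.
port_free() {
  ! (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# check_prereqs: print one line per unmet requirement.
check_prereqs() {
  local engine compose port
  engine="$(docker version --format '{{.Server.Version}}' 2>/dev/null)"
  compose="$(docker compose version --short 2>/dev/null)"
  compose="${compose#v}"   # some releases print a leading "v"

  version_ge "${engine:-0}" 20.10 ||
    echo "Docker Engine 20.10+ required (found: ${engine:-none})"
  version_ge "${compose:-0}" 2.0 ||
    echo "Docker Compose 2.0+ required (found: ${compose:-none})"

  for port in 80 8080 5432 6379; do
    port_free "$port" || echo "Port $port is already in use"
  done
}
```

After sourcing the file, run `check_prereqs`; no output means every check passed. Note that the compose file in this recipe publishes only ports 80 and 8080 to the host, so a conflict on 5432 or 6379 matters only if you add port mappings for PostgreSQL or Redis.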

For development & testing. Review security settings, change default credentials, and test thoroughly before production use.

docker-compose.yml

services:
  label-studio:
    image: heartexlabs/label-studio:latest
    environment:
      - DJANGO_DB=default
      - POSTGRE_HOST=postgres
      - POSTGRE_PORT=5432
      - POSTGRE_NAME=labelstudio
      - POSTGRE_USER=${POSTGRES_USER}
      - POSTGRE_PASSWORD=${POSTGRES_PASSWORD}
      - REDIS_HOST=redis
      - REDIS_PORT=6379
      - LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true
      - LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=/label-studio/files
    volumes:
      - ls-data:/label-studio/data
      - ls-files:/label-studio/files
    ports:
      - "8080:8080"
    depends_on:
      - postgres
      - redis
    networks:
      - labelstudio-network
    restart: unless-stopped

  postgres:
    image: postgres:15
    environment:
      - POSTGRES_USER=${POSTGRES_USER}
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      - POSTGRES_DB=labelstudio
    volumes:
      - postgres-data:/var/lib/postgresql/data
    networks:
      - labelstudio-network
    restart: unless-stopped

  redis:
    image: redis:alpine
    volumes:
      - redis-data:/data
    networks:
      - labelstudio-network
    restart: unless-stopped

  nginx:
    image: nginx:alpine
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
      - ls-files:/files:ro
    ports:
      - "80:80"
    depends_on:
      - label-studio
    networks:
      - labelstudio-network
    restart: unless-stopped

volumes:
  ls-data:
  ls-files:
  postgres-data:
  redis-data:

networks:
  labelstudio-network:
    driver: bridge
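
The nginx service bind-mounts `./nginx.conf`, but the recipe does not show that file. If it is missing when the stack starts, Docker creates an empty directory at that path and NGINX fails to load its configuration. The sketch below writes one possible minimal configuration; everything in it is an assumption rather than the recipe's own file: it proxies all requests to `label-studio:8080`, sets a 200m upload limit, and does not configure TLS or make use of the `/files` mount.

```shell
# Write a minimal nginx.conf next to docker-compose.yml.
# Assumed values (not from the recipe): proxy target, 200m upload cap.
cat > nginx.conf << 'EOF'
events {}

http {
  server {
    listen 80;

    # nginx rejects request bodies over 1m by default; raise the cap
    # so larger media uploads are accepted.
    client_max_body_size 200m;

    location / {
      # "label-studio" resolves through the compose network's DNS.
      proxy_pass http://label-studio:8080;
      proxy_set_header Host $host;
      proxy_set_header X-Real-IP $remote_addr;
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
      proxy_set_header X-Forwarded-Proto $scheme;
    }
  }
}
EOF
```

The quoted `'EOF'` delimiter keeps the shell from expanding `$host` and the other NGINX variables. Adjust `client_max_body_size` to the largest file you expect annotators to upload.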

.env Template

.env
# Label Studio
POSTGRES_USER=labelstudio
POSTGRES_PASSWORD=secure_postgres_password

# Create superuser on first run (run from the directory containing docker-compose.yml):
# docker compose exec label-studio label-studio user create
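
Rather than keeping the placeholder password, the `.env` file can be generated with a random one. This is a small sketch under stated assumptions: `gen_env` is a hypothetical helper name, it overwrites the target file without asking, and it uses a hex-only password so that no character needs quoting or escaping.

```shell
# gen_env [FILE]: write a .env with a random 32-hex-character password.
# Hypothetical helper; overwrites FILE (default: .env) if it exists.
gen_env() {
  local file="${1:-.env}" password
  # 16 random bytes rendered as hex, with od's spacing stripped.
  password="$(od -An -N16 -tx1 /dev/urandom | tr -d ' \n')"
  (
    umask 077   # a newly created file is readable by the owner only
    printf 'POSTGRES_USER=labelstudio\nPOSTGRES_PASSWORD=%s\n' \
      "$password" > "$file"
  )
}
```

Run `gen_env` once before the first `docker compose up -d`. The postgres image reads `POSTGRES_USER` and `POSTGRES_PASSWORD` only when it initializes an empty data volume, so regenerating the file afterwards leaves Label Studio with credentials the existing database does not accept.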

Usage Notes

  1. Open Label Studio at http://localhost:8080
  2. Create the first user via the signup page
  3. Import data from local files or cloud storage
  4. Text, image, audio, and video data are supported
  5. Export annotations in multiple formats
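
Label Studio can take a while to become reachable after `docker compose up -d`, so scripted setups usually poll before calling it. The helper below is a sketch with a hypothetical name, `wait_for_http`; it assumes `curl` is installed on the host.

```shell
# wait_for_http URL [TRIES] [DELAY]: poll URL until it returns a
# successful HTTP status. Returns 0 on success, 1 after TRIES attempts.
wait_for_http() {
  local url="$1" tries="${2:-30}" delay="${3:-2}" i=0
  while [ "$i" -lt "$tries" ]; do
    # -f turns HTTP errors (4xx/5xx) into a non-zero exit status.
    if curl -fsS -o /dev/null "$url" 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, `wait_for_http http://localhost:8080/health 60 2` waits up to about two minutes; the `/health` path is an assumption about the Label Studio image, and the root URL can be used instead. Once the server is up and you have copied an access token from your account page, a request such as `curl -H "Authorization: Token $LS_TOKEN" http://localhost:8080/api/projects` should list your projects. `LS_TOKEN` is a hypothetical variable name, and the authorization scheme may differ between Label Studio versions.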

Individual Services (4 services)

Copy individual services to mix and match with your existing compose files.

label-studio
label-studio:
  image: heartexlabs/label-studio:latest
  environment:
    - DJANGO_DB=default
    - POSTGRE_HOST=postgres
    - POSTGRE_PORT=5432
    - POSTGRE_NAME=labelstudio
    - POSTGRE_USER=${POSTGRES_USER}
    - POSTGRE_PASSWORD=${POSTGRES_PASSWORD}
    - REDIS_HOST=redis
    - REDIS_PORT=6379
    - LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true
    - LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=/label-studio/files
  volumes:
    - ls-data:/label-studio/data
    - ls-files:/label-studio/files
  ports:
    - "8080:8080"
  depends_on:
    - postgres
    - redis
  networks:
    - labelstudio-network
  restart: unless-stopped
postgres
postgres:
  image: postgres:15
  environment:
    - POSTGRES_USER=${POSTGRES_USER}
    - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
    - POSTGRES_DB=labelstudio
  volumes:
    - postgres-data:/var/lib/postgresql/data
  networks:
    - labelstudio-network
  restart: unless-stopped
redis
redis:
  image: redis:alpine
  volumes:
    - redis-data:/data
  networks:
    - labelstudio-network
  restart: unless-stopped
nginx
nginx:
  image: nginx:alpine
  volumes:
    - ./nginx.conf:/etc/nginx/nginx.conf:ro
    - ls-files:/files:ro
  ports:
    - "80:80"
  depends_on:
    - label-studio
  networks:
    - labelstudio-network
  restart: unless-stopped

Quick Start

terminal
# 1. Create the compose file
cat > docker-compose.yml << 'EOF'
services:
  label-studio:
    image: heartexlabs/label-studio:latest
    environment:
      - DJANGO_DB=default
      - POSTGRE_HOST=postgres
      - POSTGRE_PORT=5432
      - POSTGRE_NAME=labelstudio
      - POSTGRE_USER=${POSTGRES_USER}
      - POSTGRE_PASSWORD=${POSTGRES_PASSWORD}
      - REDIS_HOST=redis
      - REDIS_PORT=6379
      - LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true
      - LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=/label-studio/files
    volumes:
      - ls-data:/label-studio/data
      - ls-files:/label-studio/files
    ports:
      - "8080:8080"
    depends_on:
      - postgres
      - redis
    networks:
      - labelstudio-network
    restart: unless-stopped

  postgres:
    image: postgres:15
    environment:
      - POSTGRES_USER=${POSTGRES_USER}
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      - POSTGRES_DB=labelstudio
    volumes:
      - postgres-data:/var/lib/postgresql/data
    networks:
      - labelstudio-network
    restart: unless-stopped

  redis:
    image: redis:alpine
    volumes:
      - redis-data:/data
    networks:
      - labelstudio-network
    restart: unless-stopped

  nginx:
    image: nginx:alpine
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
      - ls-files:/files:ro
    ports:
      - "80:80"
    depends_on:
      - label-studio
    networks:
      - labelstudio-network
    restart: unless-stopped

volumes:
  ls-data:
  ls-files:
  postgres-data:
  redis-data:

networks:
  labelstudio-network:
    driver: bridge
EOF

# 2. Create the .env file
cat > .env << 'EOF'
# Label Studio
POSTGRES_USER=labelstudio
POSTGRES_PASSWORD=secure_postgres_password

# Create superuser on first run (run from the directory containing docker-compose.yml):
# docker compose exec label-studio label-studio user create
EOF

# 3. Start the services
docker compose up -d

# 4. View logs
docker compose logs -f

One-Liner

Run this command to download and set up the recipe in one step:

terminal
curl -fsSL https://docker.recipes/api/recipes/label-studio-enterprise/run | bash

Troubleshooting

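  • Label Studio shows 'Database connection failed': Check that the POSTGRE_USER and POSTGRE_PASSWORD values passed to Label Studio match the POSTGRES_USER and POSTGRES_PASSWORD used by the postgres service. The postgres image applies these values only when it first initializes its data volume, so editing .env later does not change the credentials of an existing database
  • Redis connection timeout errors: Confirm the Redis container is running and reachable on the compose network, then check whether it has enough memory available
  • File upload failures with large media assets: Raise NGINX client_max_body_size (the default is 1m); for files served from local storage, also confirm LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED is set
  • PostgreSQL 'too many connections' error: Raise max_connections in the PostgreSQL config or put a connection pooler in front of PostgreSQL
  • Slow annotation interface loading: Verify that Label Studio can reach Redis and consider giving Redis more memory
  • NGINX 502 Bad Gateway errors: Ensure the Label Studio container has finished starting before NGINX proxies requests. The depends_on entries in this recipe only order container startup; they do not wait for the application to be ready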
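
Most of the items above start with the same few checks: container state, Redis reachability, PostgreSQL connection limits, and recent logs. The sketch below bundles them; `diagnose` is a hypothetical helper name, and it assumes you run it from the directory containing docker-compose.yml with the service names used in this recipe.

```shell
# diagnose: print the state that the troubleshooting items ask about.
# -T disables TTY allocation so the commands also work in scripts.
diagnose() {
  # Which containers are up, restarting, or exited?
  docker compose ps

  # Redis answers PONG when it is reachable.
  docker compose exec -T redis redis-cli ping

  # POSTGRES_USER and POSTGRES_DB are already set inside the container.
  docker compose exec -T postgres sh -c \
    'psql -U "$POSTGRES_USER" -d "$POSTGRES_DB" -c "SHOW max_connections;"'
  docker compose exec -T postgres sh -c \
    'psql -U "$POSTGRES_USER" -d "$POSTGRES_DB" -c "SELECT count(*) FROM pg_stat_activity;"'

  # Recent application and proxy logs (useful for 502 errors).
  docker compose logs --tail=50 label-studio nginx
}
```

Comparing the `pg_stat_activity` count with `max_connections` shows how close the database is to the 'too many connections' limit.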
