docker.recipes

Label Studio ML Annotation

intermediate

Label Studio for data labeling with an ML backend.

Overview

Label Studio is an open-source data labeling platform developed by HumanSignal that enables teams to annotate various data types including text, images, audio, video, and time series data. Originally created to address the bottleneck of high-quality data preparation in machine learning workflows, Label Studio has evolved into a comprehensive annotation platform that supports over 30 data types and integrates with popular ML frameworks. It provides a flexible interface for creating custom labeling configurations and supports collaborative annotation workflows with quality control features.

This deployment consists of three interconnected services: the main Label Studio application server, a PostgreSQL database for persistent data storage, and a dedicated ML backend service for automated pre-annotations. The Label Studio service runs the core web application and API, while the PostgreSQL database stores project configurations, user data, annotations, and metadata. The ML backend service operates independently to provide machine learning-powered suggestions and active learning capabilities, connecting to the main Label Studio instance via HTTP API calls.

This configuration is ideal for data science teams, AI researchers, and organizations looking to implement production-grade data labeling workflows with machine learning acceleration. The combination of Label Studio's versatile annotation interface with an integrated ML backend creates a powerful active learning environment where models can pre-annotate data and continuously improve based on human feedback, significantly reducing manual labeling effort while maintaining annotation quality.

Key Features

  • Multi-format data annotation supporting text, images, audio, video, and time series with customizable labeling interfaces
  • PostgreSQL-backed persistent storage for annotations, projects, and user management with ACID compliance
  • Integrated ML backend service providing automated pre-annotations and active learning capabilities
  • Export functionality to popular ML formats including COCO, YOLO, Pascal VOC, CONLL, and JSON
  • Web-based collaborative annotation interface with user roles, task assignment, and quality control
  • RESTful API for programmatic access to projects, tasks, and annotations
  • Custom labeling configuration templates using XML-based markup for domain-specific workflows
  • Real-time model training integration with the ability to retrain models based on new annotations
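
The RESTful API mentioned above can be exercised from a shell. The following is a minimal sketch, not an official client: it assumes the stack is reachable at http://localhost:8080 and that an access token copied from the Label Studio account settings is exported as LS_TOKEN. The names LS_URL, LS_TOKEN, and ls_api are illustrative, and the `Token` authorization scheme is an assumption that may not hold for every release (newer versions may issue personal access tokens with a different scheme), so check the API docs for your version.

```shell
# Hypothetical helper; LS_URL and LS_TOKEN are illustrative variable names.
LS_URL="${LS_URL:-http://localhost:8080}"

ls_api() {
  # $1: API path, for example /api/projects
  if [ -z "${LS_TOKEN:-}" ]; then
    echo "LS_TOKEN is not set; export your Label Studio access token first" >&2
    return 1
  fi
  # -f makes curl return a non-zero status on HTTP errors (401, 404, ...)
  curl -fsS -H "Authorization: Token ${LS_TOKEN}" "${LS_URL}$1"
}

# Usage, once the stack is running:
#   export LS_TOKEN=...      # copied from the account settings page
#   ls_api /api/projects     # list projects as JSON
```

Keeping the token in an environment variable rather than in the command line avoids leaving it in shell history.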

Common Use Cases

  • Computer vision projects requiring object detection, image segmentation, or classification annotations with YOLO/COCO export
  • Natural language processing tasks including named entity recognition, sentiment analysis, and text classification
  • Active learning workflows where ML models iteratively improve by suggesting labels for human verification
  • Multi-annotator projects requiring consensus building and inter-annotator agreement measurement
  • Audio and video content labeling for speech recognition, sound classification, or video analysis projects
  • Research environments where custom annotation schemas need to be rapidly prototyped and deployed
  • Production ML pipelines requiring continuous data labeling and model retraining cycles

Prerequisites

  • Docker and Docker Compose installed, with at least 2 GB of RAM available for the three-service stack
  • Port 8080 available for the Label Studio web interface and port 9090 for the ML backend API
  • A database password set as DB_PASSWORD in the .env file for PostgreSQL authentication
  • Basic understanding of data annotation workflows and machine learning model integration concepts
  • Familiarity with Label Studio's XML-based labeling configuration syntax for custom interfaces
  • Knowledge of target export formats (COCO, YOLO, etc.) if integrating with specific ML frameworks
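
For readers new to the XML-based labeling configuration listed above, here is a small sketch of a bounding-box setup for images, written to a file with a heredoc. The file name labeling-config.xml and the two label values are made up for illustration; the tags follow Label Studio's object-detection template style, and `$image` assumes each imported task has an `image` key.

```shell
# Write a minimal bounding-box labeling config (file name is illustrative).
cat > labeling-config.xml << 'EOF'
<View>
  <!-- $image is resolved from the "image" key of each imported task -->
  <Image name="image" value="$image"/>
  <!-- toName ties the labels to the Image tag above -->
  <RectangleLabels name="label" toName="image">
    <Label value="Car"/>
    <Label value="Pedestrian"/>
  </RectangleLabels>
</View>
EOF
```

The contents of the file can then be pasted into the code view of the project's labeling interface settings.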

For development & testing. Review security settings, change default credentials, and test thoroughly before production use.

docker-compose.yml

services:
  label-studio:
    image: heartexlabs/label-studio:latest
    container_name: label-studio
    restart: unless-stopped
    ports:
      - "${LS_PORT:-8080}:8080"
    environment:
      - DJANGO_DB=default
      - POSTGRE_HOST=label-studio-db
      - POSTGRE_NAME=labelstudio
      - POSTGRE_USER=labelstudio
      - POSTGRE_PASSWORD=${DB_PASSWORD}
    volumes:
      - ls_data:/label-studio/data
    depends_on:
      - label-studio-db

  label-studio-db:
    image: postgres:15-alpine
    container_name: label-studio-db
    restart: unless-stopped
    environment:
      - POSTGRES_USER=labelstudio
      - POSTGRES_PASSWORD=${DB_PASSWORD}
      - POSTGRES_DB=labelstudio
    volumes:
      - ls_db_data:/var/lib/postgresql/data

  ml-backend:
    image: heartexlabs/label-studio-ml-backend:latest
    container_name: ls-ml-backend
    restart: unless-stopped
    ports:
      - "${ML_PORT:-9090}:9090"
    environment:
      - LABEL_STUDIO_URL=http://label-studio:8080
    volumes:
      - ml_models:/models

volumes:
  ls_data:
  ls_db_data:
  ml_models:

.env Template

.env
# Label Studio
LS_PORT=8080
DB_PASSWORD=labelstudio_password
ML_PORT=9090
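
Since the template ships with a placeholder password, one option is to generate the .env file with a random value instead. This is a sketch under a few assumptions: a POSIX shell with /dev/urandom, od, and tr available. The function names gen_password and write_env are illustrative.

```shell
gen_password() {
  # 24 random bytes rendered as 48 lowercase hex characters
  od -An -N24 -tx1 /dev/urandom | tr -d ' \n'
}

write_env() {
  # $1: target file (defaults to .env in the current directory)
  target="${1:-.env}"
  # Subshell keeps the restrictive umask from leaking into the caller
  (
    umask 077
    cat > "$target" << EOF
# Label Studio
LS_PORT=8080
DB_PASSWORD=$(gen_password)
ML_PORT=9090
EOF
  )
}

# Usage:
#   write_env          # writes ./.env readable only by the current user
```

Note that PostgreSQL applies POSTGRES_PASSWORD only when its data volume is first initialized, so the password is best chosen before the first `docker compose up`.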

Usage Notes

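  1. Docs: https://labelstud.io/guide/
  2. Label Studio runs at http://localhost:8080; create an account on first access
  3. The ML backend listens at http://localhost:9090 and serves pre-annotations
  4. Connect the ML backend in Project Settings > Machine Learning. Label Studio calls the backend from inside its own container, so the URL to enter there is normally the Compose service address http://ml-backend:9090 rather than localhost
  5. Export to COCO, YOLO, Pascal VOC, CONLL, and JSON formats
  6. Active learning: the ML backend suggests labels and humans verify them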
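
A quick way to confirm that both endpoints from the notes above respond is to probe them from the host. The sketch below assumes that each service exposes a `/health` route, which may vary by image version. The function name check_health is illustrative.

```shell
check_health() {
  # $1: human-readable service name, $2: base URL without trailing slash
  if curl -fsS --max-time 5 "$2/health" > /dev/null 2>&1; then
    echo "$1: up"
  else
    echo "$1: not reachable at $2"
    return 1
  fi
}

# Usage, from the host after the stack has started:
#   check_health "Label Studio" "http://localhost:${LS_PORT:-8080}"
#   check_health "ML backend"   "http://localhost:${ML_PORT:-9090}"
```

These checks only prove the published ports work from the host; Label Studio itself reaches the backend over the Compose network by service name.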

Individual Services (3 services)

Copy individual services to mix and match with your existing compose files.

label-studio
label-studio:
  image: heartexlabs/label-studio:latest
  container_name: label-studio
  restart: unless-stopped
  ports:
    - "${LS_PORT:-8080}:8080"
  environment:
    - DJANGO_DB=default
    - POSTGRE_HOST=label-studio-db
    - POSTGRE_NAME=labelstudio
    - POSTGRE_USER=labelstudio
    - POSTGRE_PASSWORD=${DB_PASSWORD}
  volumes:
    - ls_data:/label-studio/data
  depends_on:
    - label-studio-db

label-studio-db
label-studio-db:
  image: postgres:15-alpine
  container_name: label-studio-db
  restart: unless-stopped
  environment:
    - POSTGRES_USER=labelstudio
    - POSTGRES_PASSWORD=${DB_PASSWORD}
    - POSTGRES_DB=labelstudio
  volumes:
    - ls_db_data:/var/lib/postgresql/data

ml-backend
ml-backend:
  image: heartexlabs/label-studio-ml-backend:latest
  container_name: ls-ml-backend
  restart: unless-stopped
  ports:
    - "${ML_PORT:-9090}:9090"
  environment:
    - LABEL_STUDIO_URL=http://label-studio:8080
  volumes:
    - ml_models:/models

Quick Start

terminal
# 1. Create the compose file
cat > docker-compose.yml << 'EOF'
services:
  label-studio:
    image: heartexlabs/label-studio:latest
    container_name: label-studio
    restart: unless-stopped
    ports:
      - "${LS_PORT:-8080}:8080"
    environment:
      - DJANGO_DB=default
      - POSTGRE_HOST=label-studio-db
      - POSTGRE_NAME=labelstudio
      - POSTGRE_USER=labelstudio
      - POSTGRE_PASSWORD=${DB_PASSWORD}
    volumes:
      - ls_data:/label-studio/data
    depends_on:
      - label-studio-db

  label-studio-db:
    image: postgres:15-alpine
    container_name: label-studio-db
    restart: unless-stopped
    environment:
      - POSTGRES_USER=labelstudio
      - POSTGRES_PASSWORD=${DB_PASSWORD}
      - POSTGRES_DB=labelstudio
    volumes:
      - ls_db_data:/var/lib/postgresql/data

  ml-backend:
    image: heartexlabs/label-studio-ml-backend:latest
    container_name: ls-ml-backend
    restart: unless-stopped
    ports:
      - "${ML_PORT:-9090}:9090"
    environment:
      - LABEL_STUDIO_URL=http://label-studio:8080
    volumes:
      - ml_models:/models

volumes:
  ls_data:
  ls_db_data:
  ml_models:
EOF

# 2. Create the .env file
cat > .env << 'EOF'
# Label Studio
LS_PORT=8080
DB_PASSWORD=labelstudio_password
ML_PORT=9090
EOF

# 3. Start the services
docker compose up -d

# 4. View logs
docker compose logs -f
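
Because `depends_on` in its short form only orders container startup and does not wait for PostgreSQL to accept connections, it can help to poll the database after `docker compose up -d`. The helper below is a generic retry sketch (the name retry is illustrative); the usage line assumes the service and user names from the compose file above and the `pg_isready` tool shipped in the postgres image.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND...
# Runs COMMAND until it succeeds or ATTEMPTS runs have failed.
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  n=1
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Usage, after `docker compose up -d`:
#   retry 30 2 docker compose exec -T label-studio-db \
#     pg_isready -U labelstudio -d labelstudio
```

With 30 attempts and a 2-second delay, this gives the database roughly a minute to come up before reporting failure.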

One-Liner

Run this command to download and set up the recipe in one step:

terminal
curl -fsSL https://docker.recipes/api/recipes/label-studio-ml/run | bash

Troubleshooting

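  • Label Studio shows database connection errors: Verify that DB_PASSWORD is set in .env and that the label-studio-db container is running. PostgreSQL applies POSTGRES_PASSWORD only when the ls_db_data volume is first initialized, so changing DB_PASSWORD later requires updating the password inside the database or recreating the volume
  • ML backend not appearing in Project Settings: Check that the ml-backend container is running, that the backend URL entered in Label Studio is reachable from the label-studio container (the Compose service address http://ml-backend:9090, not localhost), and that LABEL_STUDIO_URL is correctly configured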
  • Annotations not persisting after container restart: Ensure ls_data and ls_db_data volumes are properly mounted and have write permissions
  • ML backend models not loading or training: Verify ml_models volume has sufficient disk space and the ML backend has network access to Label Studio API
  • Web interface returns 500 errors on startup: Check label-studio container logs for Django database migration issues and ensure PostgreSQL is fully initialized
  • Export functions failing with format errors: Validate that annotation schema matches the requirements of target export format (COCO, YOLO, etc.)
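
Most of the checks above start with the same few commands. The function below bundles them as a sketch; it assumes it is run from the directory containing docker-compose.yml and uses the service, user, and volume names from this recipe. The name diagnose is illustrative.

```shell
diagnose() {
  # Container state for all three services
  docker compose ps
  # Recent application logs (Django migration errors show up here)
  docker compose logs --tail=100 label-studio
  # Is PostgreSQL accepting connections?
  docker compose exec -T label-studio-db pg_isready -U labelstudio -d labelstudio
  # Do the named volumes exist? Compose prefixes them with the project name,
  # so the filter matches on a substring of the full volume name.
  docker volume ls --filter name=ls_data --filter name=ls_db_data --filter name=ml_models
}

# Usage:
#   diagnose 2>&1 | tee diagnose.log
```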

