Label Studio ML Annotation
Label Studio for data labeling with ML backend.
Overview
Label Studio is an open-source data labeling platform developed by HumanSignal that enables teams to annotate various data types including text, images, audio, video, and time series data. Originally created to address the bottleneck of high-quality data preparation in machine learning workflows, Label Studio has evolved into a comprehensive annotation platform that supports over 30 data types and integrates with popular ML frameworks. It provides a flexible interface for creating custom labeling configurations and supports collaborative annotation workflows with quality control features.
This deployment consists of three interconnected services: the main Label Studio application server, a PostgreSQL database for persistent data storage, and a dedicated ML backend service for automated pre-annotations. The Label Studio service runs the core web application and API, while the PostgreSQL database stores project configurations, user data, annotations, and metadata. The ML backend service operates independently to provide machine learning-powered suggestions and active learning capabilities, connecting to the main Label Studio instance via HTTP API calls.
This configuration is ideal for data science teams, AI researchers, and organizations looking to implement production-grade data labeling workflows with machine learning acceleration. The combination of Label Studio's versatile annotation interface with an integrated ML backend creates a powerful active learning environment where models can pre-annotate data and continuously improve based on human feedback, significantly reducing manual labeling effort while maintaining annotation quality.
Key Features
- Multi-format data annotation supporting text, images, audio, video, and time series with customizable labeling interfaces
- PostgreSQL-backed persistent storage for annotations, projects, and user management with ACID compliance
- Integrated ML backend service providing automated pre-annotations and active learning capabilities
- Export functionality to popular ML formats including COCO, YOLO, Pascal VOC, CONLL, and JSON
- Web-based collaborative annotation interface; user roles, task assignment, and quality-control workflows depend on the Label Studio edition in use and may require the Enterprise edition
- RESTful API for programmatic access to projects, tasks, and annotations
- Custom labeling configuration templates using XML-based markup for domain-specific workflows
- Real-time model training integration with the ability to retrain models based on new annotations
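The XML-based labeling configuration mentioned in the list above can be quite small. The following is a minimal sketch that writes a single-choice sentiment template to a file from the shell. The file name `sentiment-config.xml` and the control names `sentiment` and `text` are illustrative assumptions, not part of this recipe. The heredoc delimiter is quoted so the shell leaves the `$text` task-field reference untouched.

```shell
# Write a minimal text-classification labeling config.
# The file name and the control names are hypothetical, chosen for illustration.
cat > sentiment-config.xml << 'EOF'
<View>
  <Text name="text" value="$text"/>
  <Choices name="sentiment" toName="text" choice="single">
    <Choice value="Positive"/>
    <Choice value="Negative"/>
    <Choice value="Neutral"/>
  </Choices>
</View>
EOF
```

The resulting XML can be pasted into the Code view of a project's Labeling Interface settings.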
Common Use Cases
- Computer vision projects requiring object detection, image segmentation, or classification annotations with YOLO/COCO export
- Natural language processing tasks including named entity recognition, sentiment analysis, and text classification
- Active learning workflows where ML models iteratively improve by suggesting labels for human verification
- Multi-annotator projects requiring consensus building and inter-annotator agreement measurement
- Audio and video content labeling for speech recognition, sound classification, or video analysis projects
- Research environments where custom annotation schemas need to be rapidly prototyped and deployed
- Production ML pipelines requiring continuous data labeling and model retraining cycles
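In the active learning use case, model suggestions reach annotators as predictions attached to tasks. As a rough sketch, the snippet below writes one pre-annotated task in Label Studio's JSON import format. It assumes a labeling config with a `Choices` control named `sentiment` attached to a `Text` object named `text`; those names, the file name `tasks.json`, the model version string, the sample text, and the score are all made up for illustration.

```shell
# Write one task with a model prediction attached (Label Studio JSON import format).
# from_name/to_name must match control names in the project's labeling config;
# "sentiment" and "text" are hypothetical names and the score is a made-up value.
cat > tasks.json << 'EOF'
[
  {
    "data": {"text": "The new release fixed every issue I reported."},
    "predictions": [
      {
        "model_version": "example-v1",
        "score": 0.91,
        "result": [
          {
            "from_name": "sentiment",
            "to_name": "text",
            "type": "choices",
            "value": {"choices": ["Positive"]}
          }
        ]
      }
    ]
  }
]
EOF
```

Importing a file like this through the project's Import dialog should show the suggested label pre-selected, ready for an annotator to accept or correct.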
Prerequisites
- Docker and Docker Compose installed, with at least 2 GB of RAM available for the three-service stack
- Port 8080 available for the Label Studio web interface and port 9090 for the ML backend API (both can be changed with LS_PORT and ML_PORT)
- DB_PASSWORD set in the .env file or the shell environment; the compose file defines no default for it, and both Label Studio and PostgreSQL read it
- Basic understanding of data annotation workflows and machine learning model integration concepts
- Familiarity with Label Studio's XML-based labeling configuration syntax for custom interfaces
- Knowledge of target export formats (COCO, YOLO, etc.) if integrating with specific ML frameworks
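Since the stack will not authenticate without a database password, it may help to generate one rather than reuse the sample value from the template. The helper below is a small sketch; the function name `gen_password` is an assumption made for this example. It reads random bytes from `/dev/urandom` and prints them as hex, so the value needs no quoting or escaping in a `.env` file.

```shell
# Print a random hex string built from N random bytes (default 16 bytes = 32 characters).
# Hex output contains only 0-9 and a-f, so it is safe to paste into .env unquoted.
gen_password() {
  od -An -tx1 -N "${1:-16}" /dev/urandom | tr -d ' \n'
}
```

One way to use it is `printf 'DB_PASSWORD=%s\n' "$(gen_password)" >> .env` before the first `docker compose up`. The PostgreSQL image applies POSTGRES_PASSWORD only when it initializes an empty data directory, so changing the value later does not change the password of an existing database.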
For development & testing. Review security settings, change default credentials, and test thoroughly before production use.
docker-compose.yml
services:
  label-studio:
    image: heartexlabs/label-studio:latest
    container_name: label-studio
    restart: unless-stopped
    ports:
      - "${LS_PORT:-8080}:8080"
    environment:
      - DJANGO_DB=default
      - POSTGRE_HOST=label-studio-db
      - POSTGRE_NAME=labelstudio
      - POSTGRE_USER=labelstudio
      - POSTGRE_PASSWORD=${DB_PASSWORD}
    volumes:
      - ls_data:/label-studio/data
    depends_on:
      - label-studio-db

  label-studio-db:
    image: postgres:15-alpine
    container_name: label-studio-db
    restart: unless-stopped
    environment:
      - POSTGRES_USER=labelstudio
      - POSTGRES_PASSWORD=${DB_PASSWORD}
      - POSTGRES_DB=labelstudio
    volumes:
      - ls_db_data:/var/lib/postgresql/data

  ml-backend:
    image: heartexlabs/label-studio-ml-backend:latest
    container_name: ls-ml-backend
    restart: unless-stopped
    ports:
      - "${ML_PORT:-9090}:9090"
    environment:
      - LABEL_STUDIO_URL=http://label-studio:8080
    volumes:
      - ml_models:/models

volumes:
  ls_data:
  ls_db_data:
  ml_models:

.env Template
.env
# Label Studio
LS_PORT=8080
DB_PASSWORD=labelstudio_password
ML_PORT=9090

Usage Notes
- Docs: https://labelstud.io/guide/
- Label Studio at http://localhost:8080 - create account on first access
- ML backend at http://localhost:9090 for pre-annotations
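- Configure the ML backend in Project Settings > Machine Learning. Label Studio calls the backend from inside its own container, so enter a URL that resolves on the Compose network (for example http://ml-backend:9090) rather than http://localhost:9090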
- Export to COCO, YOLO, Pascal VOC, CONLL, JSON formats
- Active learning: ML backend suggests labels, humans verify
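Both web endpoints listed above take some time to come up after the containers start. A small polling helper can confirm they are answering before the ML backend is configured. This is a sketch under stated assumptions: the function name `wait_for_http` and the `PROBE` override are hypothetical, and the `/health` paths used in the usage note are assumed to be served by both containers (substitute `/` if either returns 404).

```shell
# Poll a URL until it answers or the attempt budget runs out.
# Arguments: URL, number of attempts (default 30), delay in seconds (default 2).
# PROBE is a hypothetical override hook; by default curl performs the check.
wait_for_http() {
  url=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors (status 400 and above)
    if ${PROBE:-curl -fsS -o /dev/null} "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, `wait_for_http http://localhost:8080/health && wait_for_http http://localhost:9090/health` returns success only once both services respond.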
Individual Services (3 services)
Copy individual services to mix and match with your existing compose files.
label-studio
label-studio:
  image: heartexlabs/label-studio:latest
  container_name: label-studio
  restart: unless-stopped
  ports:
    - "${LS_PORT:-8080}:8080"
  environment:
    - DJANGO_DB=default
    - POSTGRE_HOST=label-studio-db
    - POSTGRE_NAME=labelstudio
    - POSTGRE_USER=labelstudio
    - POSTGRE_PASSWORD=${DB_PASSWORD}
  volumes:
    - ls_data:/label-studio/data
  depends_on:
    - label-studio-db
label-studio-db
label-studio-db:
  image: postgres:15-alpine
  container_name: label-studio-db
  restart: unless-stopped
  environment:
    - POSTGRES_USER=labelstudio
    - POSTGRES_PASSWORD=${DB_PASSWORD}
    - POSTGRES_DB=labelstudio
  volumes:
    - ls_db_data:/var/lib/postgresql/data
ml-backend
ml-backend:
  image: heartexlabs/label-studio-ml-backend:latest
  container_name: ls-ml-backend
  restart: unless-stopped
  ports:
    - "${ML_PORT:-9090}:9090"
  environment:
    - LABEL_STUDIO_URL=http://label-studio:8080
  volumes:
    - ml_models:/models
Quick Start
# 1. Create the compose file
cat > docker-compose.yml << 'EOF'
services:
  label-studio:
    image: heartexlabs/label-studio:latest
    container_name: label-studio
    restart: unless-stopped
    ports:
      - "${LS_PORT:-8080}:8080"
    environment:
      - DJANGO_DB=default
      - POSTGRE_HOST=label-studio-db
      - POSTGRE_NAME=labelstudio
      - POSTGRE_USER=labelstudio
      - POSTGRE_PASSWORD=${DB_PASSWORD}
    volumes:
      - ls_data:/label-studio/data
    depends_on:
      - label-studio-db

  label-studio-db:
    image: postgres:15-alpine
    container_name: label-studio-db
    restart: unless-stopped
    environment:
      - POSTGRES_USER=labelstudio
      - POSTGRES_PASSWORD=${DB_PASSWORD}
      - POSTGRES_DB=labelstudio
    volumes:
      - ls_db_data:/var/lib/postgresql/data

  ml-backend:
    image: heartexlabs/label-studio-ml-backend:latest
    container_name: ls-ml-backend
    restart: unless-stopped
    ports:
      - "${ML_PORT:-9090}:9090"
    environment:
      - LABEL_STUDIO_URL=http://label-studio:8080
    volumes:
      - ml_models:/models

volumes:
  ls_data:
  ls_db_data:
  ml_models:
EOF

# 2. Create the .env file
cat > .env << 'EOF'
# Label Studio
LS_PORT=8080
DB_PASSWORD=labelstudio_password
ML_PORT=9090
EOF

# 3. Start the services
docker compose up -d

# 4. View logs
docker compose logs -f

One-Liner
Run this command to download and set up the recipe in one step:
curl -fsSL https://docker.recipes/api/recipes/label-studio-ml/run | bash

Troubleshooting
- Label Studio shows database connection errors: Verify DB_PASSWORD environment variable is set and label-studio-db container is running
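- ML backend not appearing in Project Settings: Check that the ml-backend container is running and reachable from the label-studio container on port 9090 (http://ml-backend:9090 on the Compose network), and that LABEL_STUDIO_URL is correctly configured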
- Annotations not persisting after container restart: Ensure ls_data and ls_db_data volumes are properly mounted and have write permissions
- ML backend models not loading or training: Verify ml_models volume has sufficient disk space and the ML backend has network access to Label Studio API
- Web interface returns 500 errors on startup: Check label-studio container logs for Django database migration issues and ensure PostgreSQL is fully initialized
- Export functions failing with format errors: Validate that annotation schema matches the requirements of target export format (COCO, YOLO, etc.)
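Most of the checks above start with the same few commands. The function below gathers them as a first-pass sketch; the name `diagnose` is an assumption made for this example, and it should be run from the directory that contains docker-compose.yml.

```shell
# Collect first-pass diagnostics for the stack.
# Run from the directory that contains docker-compose.yml.
diagnose() {
  # Container state for all three services
  docker compose ps
  # Recent application logs (Django migrations, database connection errors)
  docker compose logs --tail=100 label-studio
  # Ask PostgreSQL whether it is accepting connections
  docker compose exec label-studio-db pg_isready -U labelstudio -d labelstudio
  # List volumes to confirm ls_data, ls_db_data, and ml_models were created
  docker volume ls
}
```

Calling `diagnose` prints the output of each command in turn. Compose prefixes volume names with the project name, so the volumes usually appear as something like `<project>_ls_data`.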
Components
label-studio, postgresql, ml-backend
Tags
#labeling #annotation #ml #data
Category
AI & Machine Learning