ClearML
ML experiment tracking and automation platform.
Overview
ClearML is an open-source MLOps platform that automates machine learning experiment tracking, model management, and workflow orchestration. Originally developed by Allegro AI, ClearML addresses the challenge of reproducibility in machine learning by automatically capturing experiment parameters, metrics, and artifacts with minimal code changes (typically a Task.init call at the top of the training script). It integrates with popular ML frameworks like PyTorch, TensorFlow, and scikit-learn to provide experiment lineage and model versioning.
This stack combines ClearML's three core services with a robust data infrastructure: MongoDB stores experiment metadata and project configurations, Elasticsearch indexes and searches through experiment logs and metrics for fast retrieval, and Redis provides high-speed caching for real-time experiment monitoring and task queuing. The architecture separates concerns with dedicated containers for the web interface, API server, and file storage, enabling scalable deployment and efficient resource utilization.
Data scientists and ML engineers working on teams will find this configuration invaluable for establishing experiment reproducibility, comparing model performance across iterations, and automating ML pipelines. Organizations transitioning from ad-hoc ML development to structured MLOps practices benefit from ClearML's ability to retrofit existing codebases with tracking capabilities, while the included data stack ensures reliable storage and fast querying of experiment history even as projects scale to thousands of experiments.
Key Features
- Automatic experiment tracking for PyTorch, TensorFlow, Keras, and scikit-learn once the SDK is initialized in the script, without further code modifications
- Web-based experiment comparison with interactive plots and hyperparameter visualization
- MongoDB-backed experiment metadata storage with flexible schema for diverse ML frameworks
- Elasticsearch-powered full-text search across experiment logs, parameters, and model descriptions
- Redis-accelerated real-time experiment monitoring and distributed task queue management
- File server for artifact storage including model checkpoints, datasets, and visualization assets
- REST API server enabling programmatic access to experiment data and remote task execution
- Agent-based remote execution system for distributed training and hyperparameter optimization
Common Use Cases
- ML research teams tracking hundreds of hyperparameter tuning experiments across multiple models
- Data science organizations implementing experiment reproducibility standards and audit trails
- Distributed machine learning workflows requiring remote agent execution on GPU clusters
- Model comparison and selection processes with automated metric tracking and visualization
- MLOps pipeline automation with programmatic experiment triggering and result analysis
- Academic research groups needing collaborative experiment sharing and progress tracking
- Production ML teams managing model versioning and deployment artifact lineage
Prerequisites
- Minimum 6GB RAM (2GB for Elasticsearch, 2GB for MongoDB, 1GB for ClearML services, 1GB system overhead)
- Docker and Docker Compose with support for multi-container networking and volume persistence
- Ports 8080, 8008, and 8081 available for ClearML web interface, API server, and file server
- Understanding of machine learning workflows and experiment tracking concepts
- Python development environment for ClearML SDK installation and configuration
- Sufficient disk space for experiment artifacts, model checkpoints, and database storage (recommend 50GB+)
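The port prerequisite can be checked before starting the stack. The following is a minimal sketch using only the Python standard library; the helper names `port_is_free` and `busy_ports` are hypothetical and not part of ClearML or Docker. It attempts to bind each port, which only gives a snapshot: another process could still claim a port between the check and `docker compose up`.

```python
import socket

# Host ports published by the compose file in this recipe.
REQUIRED_PORTS = {8080: "web interface", 8008: "API server", 8081: "file server"}


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use": something else holds the port.
            return False
    return True


def busy_ports(ports=REQUIRED_PORTS, host: str = "0.0.0.0") -> dict:
    """Return the subset of ports (with their labels) that are already taken."""
    return {port: name for port, name in ports.items() if not port_is_free(port, host)}
```

Calling `busy_ports()` on the Docker host returns an empty dictionary when all three ports are available, or a mapping such as port number to service label for any port that would conflict.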
For development and testing only. Review security settings, change default credentials, and test thoroughly before production use.
docker-compose.yml
services:
  clearml-webserver:
    image: allegroai/clearml:latest
    container_name: clearml-webserver
    restart: unless-stopped
    environment:
      CLEARML_HOST_IP: localhost
      CLEARML__apiserver__mongo__host: mongo
      CLEARML__apiserver__es__host: elasticsearch
    ports:
      - "8080:8080"
    depends_on:
      - mongo
      - elasticsearch
      - redis
    networks:
      - clearml

  clearml-apiserver:
    image: allegroai/clearml:latest
    container_name: clearml-apiserver
    restart: unless-stopped
    command: apiserver
    environment:
      CLEARML__apiserver__mongo__host: mongo
      CLEARML__apiserver__es__host: elasticsearch
      CLEARML__redis__host: redis
    ports:
      - "8008:8008"
    depends_on:
      - mongo
      - elasticsearch
      - redis
    networks:
      - clearml

  clearml-fileserver:
    image: allegroai/clearml:latest
    container_name: clearml-fileserver
    command: fileserver
    ports:
      - "8081:8081"
    volumes:
      - clearml_files:/mnt/fileserver
    networks:
      - clearml

  mongo:
    image: mongo:4.4
    container_name: clearml-mongo
    volumes:
      - mongo_data:/data/db
    networks:
      - clearml

  elasticsearch:
    image: elasticsearch:7.17.0
    container_name: clearml-es
    environment:
      - discovery.type=single-node
      - ES_JAVA_OPTS=-Xms512m -Xmx512m
    volumes:
      - es_data:/usr/share/elasticsearch/data
    networks:
      - clearml

  redis:
    image: redis:alpine
    container_name: clearml-redis
    networks:
      - clearml

volumes:
  clearml_files:
  mongo_data:
  es_data:

networks:
  clearml:
    driver: bridge

.env Template
.env
# Configure credentials via web UI

Usage Notes
- Docs: https://clear.ml/docs/latest/
- Web UI at http://localhost:8080 - configure credentials on first visit
- API server at http://localhost:8008, file server at http://localhost:8081
- Python SDK: pip install clearml, then clearml-init to configure
- Auto-logs PyTorch, TensorFlow, scikit-learn experiments
- Agent for remote execution: clearml-agent daemon --queue default
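As an alternative to the interactive clearml-init step, the SDK can be pointed at this stack through environment variables. The sketch below assumes the default localhost endpoints from this recipe and credentials generated in the web UI; the helper names `sdk_environment` and `run_demo_task`, the project name, and the parameter values are hypothetical and chosen only for illustration. The `clearml` import happens inside the function so the environment is set first and the module can be loaded without the SDK installed.

```python
import os

# Endpoints exposed by this recipe's compose file on the Docker host.
STACK_ENDPOINTS = {
    "CLEARML_WEB_HOST": "http://localhost:8080",
    "CLEARML_API_HOST": "http://localhost:8008",
    "CLEARML_FILES_HOST": "http://localhost:8081",
}


def sdk_environment(access_key: str, secret_key: str) -> dict:
    """Build the environment variables the ClearML SDK reads for this stack."""
    env = dict(STACK_ENDPOINTS)
    env["CLEARML_API_ACCESS_KEY"] = access_key
    env["CLEARML_API_SECRET_KEY"] = secret_key
    return env


def run_demo_task(access_key: str, secret_key: str) -> None:
    """Create one small tracked task (needs `pip install clearml` and a running stack)."""
    os.environ.update(sdk_environment(access_key, secret_key))
    from clearml import Task  # imported after the environment is configured

    # Task.init registers the run with the API server and enables auto-logging.
    task = Task.init(project_name="recipe-demo", task_name="smoke-test")
    params = task.connect({"learning_rate": 0.01, "epochs": 3})
    logger = task.get_logger()
    for epoch in range(params["epochs"]):
        # Placeholder metric so that a scalar plot appears in the web UI.
        logger.report_scalar(title="loss", series="train", value=1.0 / (epoch + 1), iteration=epoch)
    task.close()
```

After calling `run_demo_task` with real credentials, the task should be listed under the hypothetical "recipe-demo" project in the web UI at http://localhost:8080.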
Individual Services (6 services)
Copy individual services to mix and match with your existing compose files.
clearml-webserver
clearml-webserver:
  image: allegroai/clearml:latest
  container_name: clearml-webserver
  restart: unless-stopped
  environment:
    CLEARML_HOST_IP: localhost
    CLEARML__apiserver__mongo__host: mongo
    CLEARML__apiserver__es__host: elasticsearch
  ports:
    - "8080:8080"
  depends_on:
    - mongo
    - elasticsearch
    - redis
  networks:
    - clearml
clearml-apiserver
clearml-apiserver:
  image: allegroai/clearml:latest
  container_name: clearml-apiserver
  restart: unless-stopped
  command: apiserver
  environment:
    CLEARML__apiserver__mongo__host: mongo
    CLEARML__apiserver__es__host: elasticsearch
    CLEARML__redis__host: redis
  ports:
    - "8008:8008"
  depends_on:
    - mongo
    - elasticsearch
    - redis
  networks:
    - clearml
clearml-fileserver
clearml-fileserver:
  image: allegroai/clearml:latest
  container_name: clearml-fileserver
  command: fileserver
  ports:
    - "8081:8081"
  volumes:
    - clearml_files:/mnt/fileserver
  networks:
    - clearml
mongo
mongo:
  image: mongo:4.4
  container_name: clearml-mongo
  volumes:
    - mongo_data:/data/db
  networks:
    - clearml
elasticsearch
elasticsearch:
  image: elasticsearch:7.17.0
  container_name: clearml-es
  environment:
    - discovery.type=single-node
    - ES_JAVA_OPTS=-Xms512m -Xmx512m
  volumes:
    - es_data:/usr/share/elasticsearch/data
  networks:
    - clearml
redis
redis:
  image: redis:alpine
  container_name: clearml-redis
  networks:
    - clearml
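When copying only some of these services into another compose file, each service's depends_on entries must also exist there (along with the clearml network and any named volumes it mounts). The following sketch encodes the depends_on lists from this recipe and reports what a selection is missing; the function names `missing_dependencies` and `start_order` are hypothetical helpers, and the start order shown is only the ordering implied by depends_on, not a guarantee that a dependency is ready to accept connections.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# depends_on relationships exactly as declared in this recipe.
DEPENDS_ON = {
    "clearml-webserver": {"mongo", "elasticsearch", "redis"},
    "clearml-apiserver": {"mongo", "elasticsearch", "redis"},
    "clearml-fileserver": set(),
    "mongo": set(),
    "elasticsearch": set(),
    "redis": set(),
}


def missing_dependencies(selected) -> set:
    """Return services required by the selection but not included in it."""
    selected = set(selected)
    needed = set()
    for name in selected:
        needed |= DEPENDS_ON[name]
    return needed - selected


def start_order(selected) -> list:
    """Return one valid start order in which dependencies come first."""
    selected = set(selected)
    graph = {name: DEPENDS_ON[name] & selected for name in selected}
    return list(TopologicalSorter(graph).static_order())
```

For example, `missing_dependencies({"clearml-apiserver"})` reports that mongo, elasticsearch, and redis still need to be copied, while the file server can be copied on its own.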
Quick Start
terminal
# 1. Create the compose file
cat > docker-compose.yml << 'EOF'
services:
  clearml-webserver:
    image: allegroai/clearml:latest
    container_name: clearml-webserver
    restart: unless-stopped
    environment:
      CLEARML_HOST_IP: localhost
      CLEARML__apiserver__mongo__host: mongo
      CLEARML__apiserver__es__host: elasticsearch
    ports:
      - "8080:8080"
    depends_on:
      - mongo
      - elasticsearch
      - redis
    networks:
      - clearml

  clearml-apiserver:
    image: allegroai/clearml:latest
    container_name: clearml-apiserver
    restart: unless-stopped
    command: apiserver
    environment:
      CLEARML__apiserver__mongo__host: mongo
      CLEARML__apiserver__es__host: elasticsearch
      CLEARML__redis__host: redis
    ports:
      - "8008:8008"
    depends_on:
      - mongo
      - elasticsearch
      - redis
    networks:
      - clearml

  clearml-fileserver:
    image: allegroai/clearml:latest
    container_name: clearml-fileserver
    command: fileserver
    ports:
      - "8081:8081"
    volumes:
      - clearml_files:/mnt/fileserver
    networks:
      - clearml

  mongo:
    image: mongo:4.4
    container_name: clearml-mongo
    volumes:
      - mongo_data:/data/db
    networks:
      - clearml

  elasticsearch:
    image: elasticsearch:7.17.0
    container_name: clearml-es
    environment:
      - discovery.type=single-node
      - ES_JAVA_OPTS=-Xms512m -Xmx512m
    volumes:
      - es_data:/usr/share/elasticsearch/data
    networks:
      - clearml

  redis:
    image: redis:alpine
    container_name: clearml-redis
    networks:
      - clearml

volumes:
  clearml_files:
  mongo_data:
  es_data:

networks:
  clearml:
    driver: bridge
EOF

# 2. Create the .env file
cat > .env << 'EOF'
# Configure credentials via web UI
EOF

# 3. Start the services
docker compose up -d

# 4. View logs
docker compose logs -f

One-Liner
Run this command to download and set up the recipe in one step:
terminal
curl -fsSL https://docker.recipes/api/recipes/clearml/run | bash

Troubleshooting
- ClearML web interface shows 'Server not responding': Verify clearml-apiserver container is running and MongoDB connection is established
- Elasticsearch container exits with 'max virtual memory areas vm.max_map_count too low': Run 'sysctl -w vm.max_map_count=262144' on Docker host
- Experiment tracking fails with 'Could not connect to API server': Ensure port 8008 is accessible and run 'clearml-init' to configure SDK credentials
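- MongoDB connection errors in API server logs: depends_on only orders container startup and does not wait for MongoDB to accept connections; check that the mongo container is fully started, then restart clearml-apiserver if the errors persist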
- File upload failures to clearml-fileserver: Verify clearml_files volume has proper write permissions and sufficient disk space
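- Redis connection timeouts during high experiment load: Check the clearml-redis container's logs and memory usage and raise its memory limit if it is being constrained; enable Redis persistence separately if queued data must survive restarts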
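Several of the checks above come down to whether each service answers on its published port. The following is a minimal sketch of such a reachability check using only the Python standard library; the helper names `is_responding` and `wait_for_http` are hypothetical, and the URLs assume the default localhost port mappings from this recipe. Any HTTP response, including an error status, is treated as "reachable", so this confirms that a server is listening, not that it is healthy.

```python
import http.client
import time
import urllib.error
import urllib.request

# Default endpoints published by this recipe on the Docker host.
ENDPOINTS = {
    "web interface": "http://localhost:8080",
    "API server": "http://localhost:8008",
    "file server": "http://localhost:8081",
}

# Bypass any configured HTTP proxy, since the stack runs on the local host.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def is_responding(url: str, timeout: float = 2.0) -> bool:
    """Return True if anything answers HTTP at the URL, whatever the status."""
    try:
        with _OPENER.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # an error status still proves the server is reachable
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False


def wait_for_http(url: str, timeout: float = 60.0, interval: float = 2.0) -> bool:
    """Poll the URL until it responds or the timeout (in seconds) elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if is_responding(url):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Looping over `ENDPOINTS.items()` and calling `wait_for_http` on each URL after `docker compose up -d` shows which of the three services never became reachable, which narrows down whose logs to inspect first.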
Components
clearml, elasticsearch, mongo, redis
Tags
#clearml #mlops #tracking #experiments
Category
AI & Machine Learning