
Kubeflow Pipelines Standalone

advanced

Kubeflow Pipelines for ML workflow orchestration without full Kubernetes.

Overview

Kubeflow Pipelines is Google's open-source machine learning workflow orchestration platform, originally designed to run on Kubernetes clusters. It enables data scientists and ML engineers to build, deploy, and manage end-to-end ML workflows as code, with features like experiment tracking, pipeline versioning, and automated artifact management. This standalone deployment brings Kubeflow Pipelines capabilities to environments without full Kubernetes infrastructure.

This stack combines the Kubeflow Pipelines API server and frontend with MySQL for metadata storage and MinIO for artifact storage. MySQL handles pipeline definitions, execution history, and experiment metadata, while MinIO provides S3-compatible object storage for datasets, models, and pipeline artifacts. The API server orchestrates workflow execution, and the frontend provides a web interface for pipeline management and visualization.

This configuration targets ML teams, data scientists, and organizations that want to implement MLOps practices without the complexity of managing a full Kubernetes cluster. It is particularly valuable for smaller teams, development environments, edge deployments, or organizations with limited infrastructure resources that still need robust ML pipeline orchestration.

Key Features

  • Visual pipeline designer with drag-and-drop interface for building ML workflows
  • Experiment tracking with metric visualization and comparison across pipeline runs
  • Artifact lineage tracking showing data flow from raw datasets to trained models
  • Pipeline versioning and rollback capabilities with Git-like workflow management
  • S3-compatible artifact storage through MinIO for seamless cloud migration
  • RESTful API for programmatic pipeline creation and execution using the KFP SDK (see the sketch after this list)
  • MySQL-backed metadata store for reliable pipeline state and execution history
  • Component sharing and reuse across different pipeline implementations
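
As a quick illustration of the programmatic surface, here is a minimal sketch that connects the KFP SDK client to the standalone API server and lists registered pipelines. It assumes the kfp package is installed and this stack is running; the filename is illustrative.

list_pipelines.py
import kfp

# Point the SDK client at the standalone API server exposed on port 8888
client = kfp.Client(host="http://localhost:8888")

# The response shape differs between SDK major versions, so print it as-is
print(client.list_pipelines())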

Common Use Cases

  • ML model training pipelines with automated data preprocessing and validation
  • Batch inference workflows for processing large datasets on scheduled intervals
  • A/B testing frameworks for comparing model performance across different algorithms
  • Data pipeline orchestration for ETL processes in analytics workflows
  • Model retraining automation triggered by data drift detection or performance degradation
  • Multi-stage deployment pipelines with staging and production model validation
  • Research experimentation environments for data science teams to track experiments

Prerequisites

  • Docker and Docker Compose installed with minimum 4GB RAM available for containers
  • Python environment with the Kubeflow Pipelines SDK (kfp) installed for pipeline development (install sketch after this list)
  • Basic understanding of machine learning workflows and pipeline concepts
  • Ports 80, 8888, 9000, and 9001 available for UI, API server, and MinIO access
  • Environment variables configured for MySQL credentials and MinIO access keys
  • Familiarity with containerized ML workloads and artifact management practices
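
A quick pre-flight sketch for the SDK install and port checks, assuming pip and the Linux ss utility are available:

terminal
# Install the Kubeflow Pipelines SDK
pip install kfp

# Flag any required host port that is already bound
ss -tln | grep -E ':(80|8888|9000|9001)\b' || echo "required ports look free"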

For development and testing only. Review security settings, change the default credentials, and test thoroughly before any production use.

docker-compose.yml

docker-compose.yml
services:
  mysql:
    image: mysql:8.0
    environment:
      - MYSQL_ROOT_PASSWORD=${MYSQL_ROOT_PASSWORD}
      - MYSQL_DATABASE=mlpipeline
      - MYSQL_USER=${MYSQL_USER}
      - MYSQL_PASSWORD=${MYSQL_PASSWORD}
    volumes:
      - mysql-data:/var/lib/mysql
    networks:
      - kubeflow-network
    restart: unless-stopped

  minio:
    image: minio/minio:latest
    command: server /data --console-address ":9001"
    environment:
      - MINIO_ROOT_USER=${MINIO_ACCESS_KEY}
      - MINIO_ROOT_PASSWORD=${MINIO_SECRET_KEY}
    volumes:
      - minio-data:/data
    ports:
      - "9000:9000"
      - "9001:9001"
    networks:
      - kubeflow-network
    restart: unless-stopped

  api-server:
    image: gcr.io/ml-pipeline/api-server:latest
    environment:
      - OBJECTSTORECONFIG_BUCKETNAME=mlpipeline
      - OBJECTSTORECONFIG_ACCESSKEY=${MINIO_ACCESS_KEY}
      - OBJECTSTORECONFIG_SECRETACCESSKEY=${MINIO_SECRET_KEY}
      - OBJECTSTORECONFIG_HOST=minio
      - OBJECTSTORECONFIG_PORT=9000
      - DBCONFIG_HOST=mysql
      - DBCONFIG_USER=${MYSQL_USER}
      - DBCONFIG_PASSWORD=${MYSQL_PASSWORD}
    ports:
      - "8888:8888"
    depends_on:
      - mysql
      - minio
    networks:
      - kubeflow-network
    restart: unless-stopped

  ui:
    image: gcr.io/ml-pipeline/frontend:latest
    environment:
      - ML_PIPELINE_SERVICE_HOST=api-server
      - ML_PIPELINE_SERVICE_PORT=8888
    ports:
      - "80:3000"
    depends_on:
      - api-server
    networks:
      - kubeflow-network
    restart: unless-stopped

volumes:
  mysql-data:
  minio-data:

networks:
  kubeflow-network:
    driver: bridge
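
One caveat: depends_on as written only orders container startup; it does not wait for MySQL to accept connections. A hedged variant using a healthcheck (mysqladmin ships in the mysql:8.0 image) looks like this:

docker-compose.yml (fragment)
  mysql:
    # ...same as above, plus:
    healthcheck:
      test: ["CMD", "mysqladmin", "ping", "-h", "127.0.0.1", "-u", "root", "-p${MYSQL_ROOT_PASSWORD}"]
      interval: 10s
      timeout: 5s
      retries: 10

  api-server:
    # ...same as above, but gate startup on a healthy database:
    depends_on:
      mysql:
        condition: service_healthy
      minio:
        condition: service_started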

.env Template

.env
# Kubeflow Pipelines
MYSQL_ROOT_PASSWORD=secure_root_password
MYSQL_USER=mlpipeline
MYSQL_PASSWORD=secure_mysql_password

# MinIO
MINIO_ACCESS_KEY=minioaccesskey
MINIO_SECRET_KEY=secure_minio_secret
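
Replace the placeholder values before first start. One way to generate random secrets, assuming openssl is available:

terminal
# Print fresh secrets to paste into .env
for var in MYSQL_ROOT_PASSWORD MYSQL_PASSWORD MINIO_SECRET_KEY; do
  echo "$var=$(openssl rand -hex 16)"
done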

Usage Notes

  1. Pipelines UI at http://localhost
  2. API server at http://localhost:8888
  3. MinIO console at http://localhost:9001
  4. Standalone mode without Kubernetes
  5. Use the KFP SDK to create and submit pipelines (see the sketch below)
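
A minimal end-to-end sketch, assuming the KFP v2 SDK (pip install kfp) and that this standalone API server accepts run submissions; the component, pipeline, and file names are illustrative:

pipeline.py
import kfp
from kfp import dsl

@dsl.component(base_image="python:3.11")
def say_hello(name: str) -> str:
    return f"Hello, {name}!"

@dsl.pipeline(name="hello-pipeline")
def hello_pipeline(recipient: str = "world"):
    say_hello(name=recipient)

if __name__ == "__main__":
    # Submit the pipeline to the standalone API server
    client = kfp.Client(host="http://localhost:8888")
    run = client.create_run_from_pipeline_func(
        hello_pipeline, arguments={"recipient": "Kubeflow"}
    )
    print(f"Started run: {run.run_id}")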

Individual Services (4 services)

Copy individual services to mix and match with your existing compose files.

mysql
mysql:
  image: mysql:8.0
  environment:
    - MYSQL_ROOT_PASSWORD=${MYSQL_ROOT_PASSWORD}
    - MYSQL_DATABASE=mlpipeline
    - MYSQL_USER=${MYSQL_USER}
    - MYSQL_PASSWORD=${MYSQL_PASSWORD}
  volumes:
    - mysql-data:/var/lib/mysql
  networks:
    - kubeflow-network
  restart: unless-stopped
minio
minio:
  image: minio/minio:latest
  command: server /data --console-address ":9001"
  environment:
    - MINIO_ROOT_USER=${MINIO_ACCESS_KEY}
    - MINIO_ROOT_PASSWORD=${MINIO_SECRET_KEY}
  volumes:
    - minio-data:/data
  ports:
    - "9000:9000"
    - "9001:9001"
  networks:
    - kubeflow-network
  restart: unless-stopped
api-server
api-server:
  image: gcr.io/ml-pipeline/api-server:latest
  environment:
    - OBJECTSTORECONFIG_BUCKETNAME=mlpipeline
    - OBJECTSTORECONFIG_ACCESSKEY=${MINIO_ACCESS_KEY}
    - OBJECTSTORECONFIG_SECRETACCESSKEY=${MINIO_SECRET_KEY}
    - OBJECTSTORECONFIG_HOST=minio
    - OBJECTSTORECONFIG_PORT=9000
    - DBCONFIG_HOST=mysql
    - DBCONFIG_USER=${MYSQL_USER}
    - DBCONFIG_PASSWORD=${MYSQL_PASSWORD}
  ports:
    - "8888:8888"
  depends_on:
    - mysql
    - minio
  networks:
    - kubeflow-network
  restart: unless-stopped
ui
ui:
  image: gcr.io/ml-pipeline/frontend:latest
  environment:
    - ML_PIPELINE_SERVICE_HOST=api-server
    - ML_PIPELINE_SERVICE_PORT=8888
  ports:
    - "80:3000"
  depends_on:
    - api-server
  networks:
    - kubeflow-network
  restart: unless-stopped

Quick Start

terminal
# 1. Create the compose file
cat > docker-compose.yml << 'EOF'
services:
  mysql:
    image: mysql:8.0
    environment:
      - MYSQL_ROOT_PASSWORD=${MYSQL_ROOT_PASSWORD}
      - MYSQL_DATABASE=mlpipeline
      - MYSQL_USER=${MYSQL_USER}
      - MYSQL_PASSWORD=${MYSQL_PASSWORD}
    volumes:
      - mysql-data:/var/lib/mysql
    networks:
      - kubeflow-network
    restart: unless-stopped

  minio:
    image: minio/minio:latest
    command: server /data --console-address ":9001"
    environment:
      - MINIO_ROOT_USER=${MINIO_ACCESS_KEY}
      - MINIO_ROOT_PASSWORD=${MINIO_SECRET_KEY}
    volumes:
      - minio-data:/data
    ports:
      - "9000:9000"
      - "9001:9001"
    networks:
      - kubeflow-network
    restart: unless-stopped

  api-server:
    image: gcr.io/ml-pipeline/api-server:latest
    environment:
      - OBJECTSTORECONFIG_BUCKETNAME=mlpipeline
      - OBJECTSTORECONFIG_ACCESSKEY=${MINIO_ACCESS_KEY}
      - OBJECTSTORECONFIG_SECRETACCESSKEY=${MINIO_SECRET_KEY}
      - OBJECTSTORECONFIG_HOST=minio
      - OBJECTSTORECONFIG_PORT=9000
      - DBCONFIG_HOST=mysql
      - DBCONFIG_USER=${MYSQL_USER}
      - DBCONFIG_PASSWORD=${MYSQL_PASSWORD}
    ports:
      - "8888:8888"
    depends_on:
      - mysql
      - minio
    networks:
      - kubeflow-network
    restart: unless-stopped

  ui:
    image: gcr.io/ml-pipeline/frontend:latest
    environment:
      - ML_PIPELINE_SERVICE_HOST=api-server
      - ML_PIPELINE_SERVICE_PORT=8888
    ports:
      - "80:3000"
    depends_on:
      - api-server
    networks:
      - kubeflow-network
    restart: unless-stopped

volumes:
  mysql-data:
  minio-data:

networks:
  kubeflow-network:
    driver: bridge
EOF

# 2. Create the .env file
cat > .env << 'EOF'
# Kubeflow Pipelines
MYSQL_ROOT_PASSWORD=secure_root_password
MYSQL_USER=mlpipeline
MYSQL_PASSWORD=secure_mysql_password

# MinIO
MINIO_ACCESS_KEY=minioaccesskey
MINIO_SECRET_KEY=secure_minio_secret
EOF

# 3. Start the services
docker compose up -d

# 4. View logs
docker compose logs -f
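
# 5. Wait for the API server to report healthy
#    (healthz path assumed from the KFP v1beta1 REST API)
curl -fsS http://localhost:8888/apis/v1beta1/healthz && echo

# 6. Check MinIO liveness
curl -fsS http://localhost:9000/minio/health/live && echo "minio ok"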

One-Liner

Run this command to download and set up the recipe in one step:

terminal
curl -fsSL https://docker.recipes/api/recipes/kubeflow-pipelines/run | bash

Troubleshooting

  • Pipeline execution fails with 'ObjectStore connection error': Verify MINIO_ACCESS_KEY and MINIO_SECRET_KEY match between the api-server and minio services, and that the mlpipeline bucket exists (see the sketch after this list)
  • UI shows 'Failed to fetch pipelines' error: Check that api-server container is running and accessible on port 8888
  • MySQL connection refused during startup: Ensure the mysql container has enough time to initialize before api-server starts; add a healthcheck (see the fragment under docker-compose.yml) or a startup delay
  • MinIO console inaccessible at localhost:9001: Verify MinIO container started with console-address parameter and port 9001 is not blocked by firewall
  • Pipeline artifacts not persisting between restarts: Confirm minio-data and mysql-data volumes are properly mounted and have write permissions
  • KFP SDK cannot connect to API server: Verify kfp.Client() points to http://localhost:8888 and api-server container network configuration
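
If artifact errors persist even with matching credentials, the mlpipeline bucket may simply not exist yet; creating it manually is harmless. A sketch using the MinIO client (mc), assuming it is installed on the host and the .env values are exported into the shell:

terminal
# Register the local MinIO endpoint (alias name 'local' is arbitrary)
mc alias set local http://localhost:9000 "$MINIO_ACCESS_KEY" "$MINIO_SECRET_KEY"

# Create the bucket the API server is configured to use
mc mb --ignore-existing local/mlpipeline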


Download Recipe Kit

Get all files in a ready-to-deploy package

Includes docker-compose.yml, .env template, README, and license

Components

kubeflow-pipelines, mysql, minio, workflow-controller

Tags

#kubeflow #pipelines #ml-ops #workflow #orchestration

Category

AI & Machine Learning