Kubeflow Pipelines
ML workflow orchestration on Kubernetes.
Overview
Kubeflow Pipelines is a comprehensive platform for building and deploying portable, scalable machine learning workflows based on Docker containers. Originally developed by Google and now part of the Cloud Native Computing Foundation, Kubeflow enables data scientists and ML engineers to orchestrate complex machine learning experiments and production workflows on Kubernetes. The platform provides a web-based UI for managing pipelines, experiments, and runs, along with a Python SDK for programmatic pipeline creation.
This configuration establishes the essential storage infrastructure required for Kubeflow Pipelines development and testing. MinIO serves as the S3-compatible object storage backend for artifacts, models, and intermediate data, while MySQL provides the metadata store for pipeline definitions, experiment tracking, and run histories. These components form the foundational data layer that Kubeflow Pipelines requires for persistent storage of ML workflows and their associated artifacts.
This stack is ideal for ML engineers setting up local development environments, data science teams prototyping pipeline architectures, and organizations preparing infrastructure for Kubeflow deployment. While full Kubeflow requires Kubernetes orchestration, this foundation enables developers to build and test pipelines using the Kubeflow Pipelines SDK before deploying to production Kubernetes clusters. The combination provides a lightweight development environment that mirrors the storage patterns of full Kubeflow deployments.
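Pipelines themselves are authored in Python with the kfp SDK and compiled to a portable YAML package that a full Kubeflow deployment can execute later. A minimal sketch, assuming kfp v2 is installed (pip install kfp); the component, pipeline, and file names are illustrative:
pipeline.py
# Minimal KFP v2 pipeline sketch; names are illustrative, not part of this recipe.
from kfp import compiler, dsl

@dsl.component
def add(a: float, b: float) -> float:
    # Lightweight Python component; each invocation runs in its own container.
    return a + b

@dsl.pipeline(name="add-demo")
def add_pipeline(a: float = 1.0, b: float = 2.0):
    add(a=a, b=b)

if __name__ == "__main__":
    # Compile to an IR YAML package that can be uploaded to a full
    # Kubeflow Pipelines deployment later.
    compiler.Compiler().compile(add_pipeline, package_path="add_pipeline.yaml")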
Key Features
- S3-compatible object storage via MinIO for ML artifacts and model persistence
- MySQL metadata store for pipeline definitions and experiment tracking
- Compatible with Kubeflow Pipelines SDK for local pipeline development
- MinIO web interface for browsing and managing ML artifacts and datasets
- Persistent volume storage ensuring data survives container restarts
- Dedicated bridge network keeping the storage stack isolated from other local services
- Preconfigured mlpipeline MySQL database matching Kubeflow Pipelines' metadata conventions
- MinIO bucket structure supporting Kubeflow's artifact organization patterns (see the upload sketch after this list)
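As a taste of the S3-compatible API, the sketch below uploads an artifact to the local MinIO with boto3 (pip install boto3). The endpoint and credentials are this recipe's defaults; "mlpipeline" is the bucket name Kubeflow Pipelines conventionally uses, and "model.pkl" is a placeholder local file:
upload_artifact.py
# Hedged sketch: talk to MinIO through its S3 API using boto3.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:9000",   # MinIO S3 API from the compose file
    aws_access_key_id="minioadmin",         # compose defaults; change for real use
    aws_secret_access_key="minioadmin",
)

# Create the artifact bucket if it does not exist yet.
existing = [b["Name"] for b in s3.list_buckets().get("Buckets", [])]
if "mlpipeline" not in existing:
    s3.create_bucket(Bucket="mlpipeline")

# Upload a model artifact under an experiment-style prefix.
s3.upload_file("model.pkl", "mlpipeline", "experiments/run-1/model.pkl")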
Common Use Cases
- Local development environment for Kubeflow pipeline authors and data scientists
- ML artifact storage and versioning during model experimentation phases
- Prototype testing of pipeline storage patterns before Kubernetes deployment
- Educational environments for learning Kubeflow Pipelines concepts and workflows
- CI/CD pipeline testing for ML workflows requiring persistent storage backends
- Data science team collaboration with shared artifact storage and experiment tracking
- Migration testing when moving ML workflows from other platforms to Kubeflow
Prerequisites
- Docker and Docker Compose installed with at least 4GB available memory
- Ports 9000 (S3 API) and 9001 (web console) available for MinIO
- Python 3.7+ with kfp (Kubeflow Pipelines SDK) installed for pipeline development
- Basic understanding of ML workflows and pipeline orchestration concepts
- Familiarity with S3-compatible storage APIs for artifact management
- 8GB+ disk space for MySQL data and MinIO object storage
For development & testing. Review security settings, change default credentials, and test thoroughly before production use.
docker-compose.yml
# Kubeflow requires Kubernetes
# Use kind or minikube for local development
# This is a simplified standalone example
services:
  minio:
    image: minio/minio:latest
    container_name: kubeflow-minio
    command: server /data --console-address ":9001"
    environment:
      MINIO_ROOT_USER: minioadmin
      MINIO_ROOT_PASSWORD: minioadmin
    volumes:
      - minio_data:/data
    ports:
      - "9000:9000"
      - "9001:9001"
    networks:
      - kubeflow

  mysql:
    image: mysql:8.0
    container_name: kubeflow-mysql
    environment:
      MYSQL_ROOT_PASSWORD: root
      MYSQL_DATABASE: mlpipeline
    volumes:
      - mysql_data:/var/lib/mysql
    networks:
      - kubeflow

volumes:
  minio_data:
  mysql_data:

networks:
  kubeflow:
    driver: bridge
.env Template
.env
# Full Kubeflow requires Kubernetes
Usage Notes
- Docs: https://www.kubeflow.org/docs/
- Full Kubeflow requires Kubernetes - this stack provides the storage backend only
- Author pipelines with the kfp SDK: pip install kfp
- MinIO S3 API at http://localhost:9000, web console at http://localhost:9001 - credentials: minioadmin/minioadmin
- For local dev, use kind or minikube with the Kubeflow manifests
- Alternative: use standalone Kubeflow Pipelines v2 on K8s (a run-submission sketch follows)
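Once a full Kubeflow Pipelines deployment exists, compiled packages can be submitted programmatically. A sketch only: this stack has no KFP API server, so the host URL below is a placeholder (for example, a kubectl port-forward of the KFP UI service):
submit_run.py
# Hedged sketch: submit a compiled pipeline to a real KFP deployment.
from kfp import Client

client = Client(host="http://localhost:8080")  # placeholder endpoint, not part of this stack
run = client.create_run_from_pipeline_package(
    "add_pipeline.yaml",                # compiled earlier with the kfp compiler
    arguments={"a": 3.0, "b": 4.0},
)
print(run.run_id)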
Individual Services (2 services)
Copy individual services to mix and match with your existing compose files.
minio
minio:
  image: minio/minio:latest
  container_name: kubeflow-minio
  command: server /data --console-address ":9001"
  environment:
    MINIO_ROOT_USER: minioadmin
    MINIO_ROOT_PASSWORD: minioadmin
  volumes:
    - minio_data:/data
  ports:
    - "9000:9000"
    - "9001:9001"
  networks:
    - kubeflow
mysql
mysql:
  image: mysql:8.0
  container_name: kubeflow-mysql
  environment:
    MYSQL_ROOT_PASSWORD: root
    MYSQL_DATABASE: mlpipeline
  volumes:
    - mysql_data:/var/lib/mysql
  networks:
    - kubeflow
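To sanity-check the metadata database from the host, you can query it with PyMySQL (pip install pymysql). A hedged sketch: the service above publishes no host port, so this assumes you add ports: ["3306:3306"] to the mysql service, or run it from a container attached to the kubeflow network:
check_mysql.py
# Hedged sketch: verify the mlpipeline metadata database is reachable.
import pymysql

conn = pymysql.connect(
    host="127.0.0.1",
    port=3306,               # assumes a 3306:3306 port mapping was added
    user="root",
    password="root",         # compose default; change for real use
    database="mlpipeline",
)
with conn.cursor() as cur:
    cur.execute("SHOW TABLES")
    print(cur.fetchall())    # empty until KFP initializes its schema
conn.close()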
Quick Start
terminal
# 1. Create the compose file
cat > docker-compose.yml << 'EOF'
# Kubeflow requires Kubernetes
# Use kind or minikube for local development
# This is a simplified standalone example
services:
  minio:
    image: minio/minio:latest
    container_name: kubeflow-minio
    command: server /data --console-address ":9001"
    environment:
      MINIO_ROOT_USER: minioadmin
      MINIO_ROOT_PASSWORD: minioadmin
    volumes:
      - minio_data:/data
    ports:
      - "9000:9000"
      - "9001:9001"
    networks:
      - kubeflow

  mysql:
    image: mysql:8.0
    container_name: kubeflow-mysql
    environment:
      MYSQL_ROOT_PASSWORD: root
      MYSQL_DATABASE: mlpipeline
    volumes:
      - mysql_data:/var/lib/mysql
    networks:
      - kubeflow

volumes:
  minio_data:
  mysql_data:

networks:
  kubeflow:
    driver: bridge
EOF

# 2. Create the .env file
cat > .env << 'EOF'
# Full Kubeflow requires Kubernetes
EOF

# 3. Start the services
docker compose up -d

# 4. View logs
docker compose logs -f
One-Liner
Run this command to download and set up the recipe in one step:
terminal
curl -fsSL https://docker.recipes/api/recipes/kubeflow/run | bash
Troubleshooting
- MinIO 'Access Denied' errors: Verify credentials are minioadmin/minioadmin and check bucket policies in the web interface
- MySQL connection refused: Ensure MySQL container is fully started (check logs for 'ready for connections') before connecting applications
- KFP SDK artifact upload failures: Verify MinIO endpoint configuration and ensure bucket exists for the pipeline namespace
- MySQL 'table doesn't exist' errors: Run the pipeline metadata initialization scripts or ensure the mlpipeline database schema has been created
- MinIO console not accessible: Check that ports 9000 and 9001 are not blocked by a firewall and the container is running with the correct port mappings
- Pipeline metadata persistence issues: Verify MySQL volume mounts are correctly configured and database has sufficient disk space
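When diagnosing the MinIO items above, a quick probe of MinIO's documented health endpoints can rule out networking problems before you dig into credentials or bucket policies:
check_minio.py
# Probe MinIO's liveness/readiness endpoints (pip install requests).
import requests

for path in ("/minio/health/live", "/minio/health/ready"):
    r = requests.get(f"http://localhost:9000{path}", timeout=5)
    print(path, r.status_code)  # 200 means the server is up and serving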