
Kubeflow Pipelines

advanced

ML workflow orchestration on Kubernetes.

Overview

Kubeflow Pipelines is a comprehensive platform for building and deploying portable, scalable machine learning workflows based on Docker containers. Originally developed by Google and now part of the Cloud Native Computing Foundation, Kubeflow enables data scientists and ML engineers to orchestrate complex machine learning experiments and production workflows on Kubernetes. The platform provides a web-based UI for managing pipelines, experiments, and runs, along with a Python SDK for programmatic pipeline creation.

This configuration establishes the essential storage infrastructure required for Kubeflow Pipelines development and testing. MinIO serves as the S3-compatible object storage backend for artifacts, models, and intermediate data, while MySQL provides the metadata store for pipeline definitions, experiment tracking, and run histories. Together, these components form the foundational data layer that Kubeflow Pipelines requires for persistent storage of ML workflows and their associated artifacts.

This stack is ideal for ML engineers setting up local development environments, data science teams prototyping pipeline architectures, and organizations preparing infrastructure for Kubeflow deployment. While full Kubeflow requires Kubernetes orchestration, this foundation lets developers build and test pipelines with the Kubeflow Pipelines SDK before deploying to production Kubernetes clusters, in a lightweight development environment that mirrors the storage patterns of full Kubeflow deployments.
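
The authoring flow looks roughly like the following. This is a minimal sketch assuming kfp v2 is installed (pip install kfp); the component, pipeline, and file names are illustrative, not part of this recipe:

pipeline.py
from kfp import dsl, compiler

# A minimal component; when executed on a real cluster, each
# component runs in its own container.
@dsl.component
def train(learning_rate: float) -> str:
    # Placeholder training step, for illustration only.
    return f"trained with lr={learning_rate}"

# A pipeline wires components together; on a full KFP deployment,
# artifacts between steps flow through S3-style storage such as MinIO.
@dsl.pipeline(name="demo-pipeline")
def demo_pipeline(learning_rate: float = 0.01):
    train(learning_rate=learning_rate)

# Compile to the IR YAML that a Kubeflow Pipelines cluster can execute.
compiler.Compiler().compile(demo_pipeline, package_path="demo_pipeline.yaml")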

Key Features

  • S3-compatible object storage via MinIO for ML artifacts and model persistence (see the Python sketch after this list)
  • MySQL metadata store for pipeline definitions and experiment tracking
  • Compatible with the Kubeflow Pipelines SDK for local pipeline development
  • MinIO web console for browsing and managing ML artifacts and datasets
  • Persistent volume storage ensuring data survives container restarts
  • Dedicated bridge network isolating the storage stack from other local containers
  • MySQL preconfigured with the mlpipeline database for pipeline metadata
  • MinIO bucket structure supporting Kubeflow's artifact organization patterns
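
The S3 access pattern from the first feature above can be exercised directly against this stack. A minimal sketch assuming the minio Python package (pip install minio); the mlpipeline bucket name follows KFP's default artifact bucket convention, but is an assumption here:

minio_check.py
from minio import Minio

# Connect to the MinIO container from this compose file.
# secure=False because the local stack serves plain HTTP.
client = Minio(
    "localhost:9000",
    access_key="minioadmin",
    secret_key="minioadmin",
    secure=False,
)

# Create the artifact bucket if it does not exist yet.
if not client.bucket_exists("mlpipeline"):
    client.make_bucket("mlpipeline")

# Upload a local file as an artifact (paths are illustrative).
client.fput_object("mlpipeline", "artifacts/model.pkl", "model.pkl")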

Common Use Cases

  • Local development environment for Kubeflow pipeline authors and data scientists
  • ML artifact storage and versioning during model experimentation phases
  • Prototype testing of pipeline storage patterns before Kubernetes deployment
  • Educational environments for learning Kubeflow Pipelines concepts and workflows
  • CI/CD pipeline testing for ML workflows requiring persistent storage backends
  • Data science team collaboration with shared artifact storage and experiment tracking
  • Migration testing when moving ML workflows from other platforms to Kubeflow

Prerequisites

  • Docker and Docker Compose installed with at least 4GB available memory
  • Ports 9000 and 9001 available for the MinIO S3 API and web console
  • Python 3.7+ with kfp (Kubeflow Pipelines SDK) installed for pipeline development
  • Basic understanding of ML workflows and pipeline orchestration concepts
  • Familiarity with S3-compatible storage APIs for artifact management
  • 8GB+ disk space for MySQL data and MinIO object storage

For development & testing only. Review security settings, change default credentials, and test thoroughly before production use.

docker-compose.yml

docker-compose.yml
# Kubeflow requires Kubernetes
# Use kind or minikube for local development
# This is a simplified standalone example
services:
  minio:
    image: minio/minio:latest
    container_name: kubeflow-minio
    command: server /data --console-address ":9001"
    environment:
      MINIO_ROOT_USER: minioadmin
      MINIO_ROOT_PASSWORD: minioadmin
    volumes:
      - minio_data:/data
    ports:
      - "9000:9000"
      - "9001:9001"
    networks:
      - kubeflow

  mysql:
    image: mysql:8.0
    container_name: kubeflow-mysql
    environment:
      MYSQL_ROOT_PASSWORD: root
      MYSQL_DATABASE: mlpipeline
    volumes:
      - mysql_data:/var/lib/mysql
    networks:
      - kubeflow

volumes:
  minio_data:
  mysql_data:

networks:
  kubeflow:
    driver: bridge

.env Template

.env
# Full Kubeflow requires Kubernetes

Usage Notes

  1. Docs: https://www.kubeflow.org/docs/
  2. Full Kubeflow requires Kubernetes - this provides the storage backend only
  3. Author pipelines with the kfp SDK: pip install kfp (see the submission sketch below)
  4. MinIO S3 API at http://localhost:9000, web console at http://localhost:9001 - credentials: minioadmin/minioadmin
  5. For local dev, use kind or minikube with the Kubeflow manifests
  6. Alternative: use standalone Kubeflow Pipelines v2 on K8s
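
Expanding on notes 2 and 3: compiled pipelines are submitted to a full Kubeflow Pipelines deployment, which this stack does not include. A hedged sketch, assuming a KFP API server port-forwarded to localhost:8080 on a kind or minikube cluster:

submit_run.py
import kfp

# Connect to a running KFP API server (not part of this compose stack).
client = kfp.Client(host="http://localhost:8080")

# Submit the IR YAML compiled with the SDK; names and arguments
# are illustrative, matching the earlier pipeline sketch.
run = client.create_run_from_pipeline_package(
    "demo_pipeline.yaml",
    arguments={"learning_rate": 0.01},
)
print(f"Started run: {run.run_id}")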

Individual Services (2 services)

Copy individual services to mix and match with your existing compose files.

minio
minio:
  image: minio/minio:latest
  container_name: kubeflow-minio
  command: server /data --console-address ":9001"
  environment:
    MINIO_ROOT_USER: minioadmin
    MINIO_ROOT_PASSWORD: minioadmin
  volumes:
    - minio_data:/data
  ports:
    - "9000:9000"
    - "9001:9001"
  networks:
    - kubeflow
mysql
mysql:
  image: mysql:8.0
  container_name: kubeflow-mysql
  environment:
    MYSQL_ROOT_PASSWORD: root
    MYSQL_DATABASE: mlpipeline
  volumes:
    - mysql_data:/var/lib/mysql
  networks:
    - kubeflow

Quick Start

terminal
# 1. Create the compose file
cat > docker-compose.yml << 'EOF'
# Kubeflow requires Kubernetes
# Use kind or minikube for local development
# This is a simplified standalone example
services:
  minio:
    image: minio/minio:latest
    container_name: kubeflow-minio
    command: server /data --console-address ":9001"
    environment:
      MINIO_ROOT_USER: minioadmin
      MINIO_ROOT_PASSWORD: minioadmin
    volumes:
      - minio_data:/data
    ports:
      - "9000:9000"
      - "9001:9001"
    networks:
      - kubeflow

  mysql:
    image: mysql:8.0
    container_name: kubeflow-mysql
    environment:
      MYSQL_ROOT_PASSWORD: root
      MYSQL_DATABASE: mlpipeline
    volumes:
      - mysql_data:/var/lib/mysql
    networks:
      - kubeflow

volumes:
  minio_data:
  mysql_data:

networks:
  kubeflow:
    driver: bridge
EOF

# 2. Create the .env file
cat > .env << 'EOF'
# Full Kubeflow requires Kubernetes
EOF

# 3. Start the services
docker compose up -d

# 4. View logs
docker compose logs -f

One-Liner

Run this command to download and set up the recipe in one step:

terminal
curl -fsSL https://docker.recipes/api/recipes/kubeflow/run | bash

Troubleshooting

  • MinIO 'Access Denied' errors: Verify credentials are minioadmin/minioadmin and check bucket policies in the web interface
  • MySQL connection refused: Ensure MySQL container is fully started (check logs for 'ready for connections') before connecting applications
  • KFP SDK artifact upload failures: Verify MinIO endpoint configuration and ensure bucket exists for the pipeline namespace
  • MySQL 'table doesn't exist' errors: Run pipeline metadata initialization scripts or ensure the mlpipeline database schema is properly created
  • MinIO console not accessible: Check that port 9001 is mapped and not blocked by a firewall, and that the server was started with the --console-address flag
  • Pipeline metadata persistence issues: Verify MySQL volume mounts are correctly configured and database has sufficient disk space
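
For the first two issues above, a quick smoke test can confirm both backends accept connections. A minimal sketch assuming pymysql and minio are installed (pip install pymysql minio); note that this compose file does not publish MySQL to the host, so add a "3306:3306" port mapping to the mysql service (or run this inside the kubeflow network) first:

smoke_test.py
import pymysql
from minio import Minio

# Verify MySQL is up and the mlpipeline database is reachable
# (assumes a 3306:3306 port mapping has been added to the mysql service).
conn = pymysql.connect(host="localhost", user="root",
                       password="root", database="mlpipeline")
with conn.cursor() as cur:
    cur.execute("SELECT VERSION()")
    print("MySQL OK:", cur.fetchone()[0])
conn.close()

# Verify MinIO responds and list any existing buckets.
client = Minio("localhost:9000", access_key="minioadmin",
               secret_key="minioadmin", secure=False)
print("MinIO OK:", [b.name for b in client.list_buckets()])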

