Ollama with Open WebUI

beginner

Ollama local LLMs with Open WebUI chat.

Overview

Ollama lets you run large language models such as Llama 2, Mistral, and Code Llama directly on your own machine. Built to address the privacy concerns and API costs of cloud-based AI services, it provides a simple command-line interface for downloading, managing, and serving LLMs behind an OpenAI-compatible API, and its model quantization and GPU acceleration make it practical to run capable models on consumer hardware.

This stack pairs Ollama's local model serving with Open WebUI's chat interface to create a complete self-hosted AI chat solution. Open WebUI wraps Ollama in a familiar ChatGPT-like web interface with conversation history, model switching, and multi-user support, so you can work with locally hosted models through a polished UI while keeping all data on your own infrastructure and avoiding external API dependencies.

The combination suits privacy-conscious individuals, developers building AI applications, organizations with sensitive data requirements, and anyone who wants to experiment with large language models without recurring costs. Because all processing stays local, it works well for confidential document analysis, code generation in secure environments, and educational AI exploration, even without internet access.
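
As a quick illustration of the OpenAI-compatible API, the sketch below sends a chat completion request straight to Ollama. It assumes the stack is running on the default port and that llama2:7b has already been pulled; any installed model name works in its place.

terminal
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama2:7b",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}]
  }'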

Key Features

  • Local LLM serving with OpenAI-compatible API for complete data privacy
  • ChatGPT-like web interface with conversation history and user management
  • Simple model management through Ollama's pull and run commands
  • GPU acceleration support with NVIDIA CUDA for improved inference speed
  • Multi-model support including Llama 2, Mistral, Code Llama, and custom Modelfiles (see the Modelfile sketch after this list)
  • RAG document support for context-aware conversations with uploaded files
  • Prompt templates and custom personas for specialized AI interactions
  • Model quantization options to optimize memory usage and performance
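
The custom Modelfile and persona features above can be exercised from the command line. Below is a minimal sketch, assuming the stack is already running and the llama2:7b base model has been pulled; my-assistant is just an illustrative model name.

terminal
# Write a Modelfile on the host; SYSTEM sets the persona, PARAMETER tunes sampling
cat > Modelfile << 'EOF'
FROM llama2:7b
PARAMETER temperature 0.3
SYSTEM You are a concise assistant that answers in plain English.
EOF

# Copy it into the Ollama container and build the custom model from it
docker cp Modelfile ollama:/tmp/Modelfile
docker exec ollama ollama create my-assistant -f /tmp/Modelfile

# The new model should now appear alongside the others, including in Open WebUI's model picker
docker exec ollama ollama list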

Common Use Cases

  • Private AI chat for sensitive business communications and confidential document analysis
  • Development teams building AI-powered applications without external API dependencies
  • Educational institutions teaching AI concepts with hands-on local model experimentation
  • Healthcare organizations requiring HIPAA-compliant AI assistance for medical documentation
  • Legal firms using AI for document review while maintaining attorney-client privilege
  • Home lab enthusiasts exploring large language models without subscription costs
  • Offline AI applications in environments with limited or restricted internet access

Prerequisites

  • Minimum 8GB RAM (16GB+ recommended) for running 7B parameter models effectively
  • NVIDIA GPU with 6GB+ VRAM for optimal performance, or remove GPU configuration for CPU-only mode
  • Docker and Docker Compose installed with nvidia-container-toolkit for GPU support (a quick GPU check follows this list)
  • Ports 3000 and 11434 available for Open WebUI and Ollama API respectively
  • Sufficient disk space for model storage (models range from 4GB to 70GB+)
  • Basic understanding of LLM concepts and model parameter sizing
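
A quick way to confirm the GPU plumbing before starting the stack is to run nvidia-smi through Docker. This is a rough check, assuming nvidia-container-toolkit is installed; the specific nvidia/cuda tag below is just an example, any recent base tag works.

terminal
# Should print the GPU table if the NVIDIA runtime is wired into Docker
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi

# After the stack is up, Ollama's startup logs typically report the detected GPU
docker compose logs ollama | grep -iE 'gpu|cuda'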

For development & testing. Review security settings, change default credentials, and test thoroughly before production use.

docker-compose.yml

docker-compose.yml
services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    restart: unless-stopped
    ports:
      - "${OLLAMA_PORT:-11434}:11434"
    volumes:
      - ollama_data:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    restart: unless-stopped
    ports:
      - "${WEBUI_PORT:-3000}:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - open_webui_data:/app/backend/data
    depends_on:
      - ollama

volumes:
  ollama_data:
  open_webui_data:
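
For hosts without an NVIDIA GPU, the same ollama service runs in CPU-only mode once the deploy block is dropped, as noted in the usage notes below. A minimal variant looks like this; everything else in the file stays unchanged:

ollama:
  image: ollama/ollama:latest
  container_name: ollama
  restart: unless-stopped
  ports:
    - "${OLLAMA_PORT:-11434}:11434"
  volumes:
    - ollama_data:/root/.ollama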

.env Template

.env
# Ollama + Open WebUI
OLLAMA_PORT=11434
WEBUI_PORT=3000

Usage Notes

  1. Docs: https://docs.openwebui.com/, https://ollama.ai/docs
  2. Open WebUI at http://localhost:3000 - create an admin account on first visit
  3. Pull models: docker exec ollama ollama pull llama2:7b (or mistral, codellama) - see the model management sketch after this list
  4. GPU support requires nvidia-container-toolkit - remove the deploy section for CPU-only mode
  5. Chat history, multiple users, and custom personas are supported
  6. Connect additional OpenAI-compatible APIs in Settings > Connections
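
Model management from note 3 is done with docker exec against the running Ollama container. A minimal sketch, with mistral:7b as an example model name:

terminal
# Pull a model into the ollama_data volume (a 7B model is roughly 4GB)
docker exec ollama ollama pull mistral:7b

# List installed models and try a one-off prompt from the CLI
docker exec ollama ollama list
docker exec -it ollama ollama run mistral:7b "Summarise what a Dockerfile is."

# Remove a model you no longer need to reclaim disk space
docker exec ollama ollama rm mistral:7b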

Individual Services (2 services)

Copy individual services to mix and match with your existing compose files.

ollama
ollama:
  image: ollama/ollama:latest
  container_name: ollama
  restart: unless-stopped
  ports:
    - ${OLLAMA_PORT:-11434}:11434
  volumes:
    - ollama_data:/root/.ollama
  deploy:
    resources:
      reservations:
        devices:
          - driver: nvidia
            count: all
            capabilities:
              - gpu
open-webui
open-webui:
  image: ghcr.io/open-webui/open-webui:main
  container_name: open-webui
  restart: unless-stopped
  ports:
    - ${WEBUI_PORT:-3000}:8080
  environment:
    - OLLAMA_BASE_URL=http://ollama:11434
  volumes:
    - open_webui_data:/app/backend/data
  depends_on:
    - ollama
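
If Ollama already runs natively on the host instead of in the bundled container, the open-webui service can be pointed at it on its own. This is a sketch, assuming the host Ollama listens on port 11434; host.docker.internal is mapped via extra_hosts so the name also resolves on Linux:

open-webui:
  image: ghcr.io/open-webui/open-webui:main
  container_name: open-webui
  restart: unless-stopped
  ports:
    - ${WEBUI_PORT:-3000}:8080
  environment:
    - OLLAMA_BASE_URL=http://host.docker.internal:11434
  extra_hosts:
    - host.docker.internal:host-gateway
  volumes:
    - open_webui_data:/app/backend/data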

Quick Start

terminal
# 1. Create the compose file
cat > docker-compose.yml << 'EOF'
services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    restart: unless-stopped
    ports:
      - "${OLLAMA_PORT:-11434}:11434"
    volumes:
      - ollama_data:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    restart: unless-stopped
    ports:
      - "${WEBUI_PORT:-3000}:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - open_webui_data:/app/backend/data
    depends_on:
      - ollama

volumes:
  ollama_data:
  open_webui_data:
EOF

# 2. Create the .env file
cat > .env << 'EOF'
# Ollama + Open WebUI
OLLAMA_PORT=11434
WEBUI_PORT=3000
EOF

# 3. Start the services
docker compose up -d

# 4. View logs
docker compose logs -f

One-Liner

Run this command to download and set up the recipe in one step:

terminal
curl -fsSL https://docker.recipes/api/recipes/ollama-webui-stack/run | bash

Troubleshooting

  • Open WebUI shows 'Ollama not found' error: Verify the Ollama container is running and reachable at http://ollama:11434 from inside the Docker network (quick checks after this list)
  • Models fail to load with out of memory errors: Reduce model size or switch to CPU-only mode by removing the GPU deployment configuration
  • GPU not detected in Ollama container: Install nvidia-container-toolkit on host system and restart Docker daemon
  • Slow inference performance: Enable GPU acceleration or try smaller quantized models like 7B instead of 13B parameters
  • Cannot access Open WebUI interface: Check if port 3000 is available and not blocked by firewall
  • Model download fails or times out: Use docker exec ollama ollama pull command manually and check internet connectivity
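
A few quick checks cover most of the failures above; this assumes the stack was started with the compose file from this page and the default ports.

terminal
# Confirm both containers are running and inspect recent logs
docker compose ps
docker compose logs --tail=50 ollama open-webui

# Ollama should answer on the host port; /api/tags lists the installed models
curl http://localhost:11434/api/version
curl http://localhost:11434/api/tags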


