docker.recipes

Docspell Document Management

intermediate

Personal document organizer with OCR and full-text search.

Overview

Docspell is an open-source personal document management system that organizes, processes, and searches documents using OCR and machine learning. Written in Scala and developed as a privacy-focused, self-hosted alternative to cloud document services, it extracts metadata from documents, recognizes correspondents, and suggests tags. It accepts PDFs, images, and office documents and makes their text searchable.

This deployment runs four services. The restserver serves the web interface and API on port 7880. joex runs the background jobs: OCR, document analysis, and automated classification of uploaded documents. PostgreSQL stores the metadata, and Apache Solr provides the full-text search index.

The setup suits individuals, small businesses, or teams who need to digitize and organize large volumes of paper documents, automate invoice processing, or build searchable archives of correspondence. PostgreSQL provides durable, transactional storage and Solr provides fast full-text retrieval, which is useful for legal offices, accounting firms, and other document-heavy workflows.

Key Features

  • Advanced OCR processing with automatic text extraction from scanned documents and images
  • Machine learning-powered automatic tagging and correspondent recognition
  • Full-text search capabilities powered by Apache Solr with highlighting and relevance ranking
  • Multi-format document support including PDF, DOCX, images, and email attachments
  • Automated document classification with customizable rules and patterns
  • RESTful API for integration with external systems and workflow automation
  • Multi-user support with role-based access control and organization management
  • Background job processing system for handling large document batches without blocking the interface
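
As a rough illustration of the REST API mentioned above, the sketch below uploads a file through a Docspell "source" (an upload endpoint you create in the web UI). The path /api/v1/open/upload/item/<source-id> is an assumption based on Docspell's documented source-upload endpoint, and the function names are hypothetical; check the API documentation of your installed version before relying on it.

```shell
# Build the upload URL for a Docspell source.
# $1 = base URL of the restserver, $2 = source id created in the web UI.
# ASSUMPTION: the endpoint path below; verify it for your Docspell version.
upload_url() {
  local base="${1%/}" source_id="$2"
  printf '%s/api/v1/open/upload/item/%s\n' "$base" "$source_id"
}

# Upload one file as multipart form data; joex then processes it in the background.
# $1 = base URL, $2 = source id, $3 = path to the file.
upload_file() {
  curl -fsS -X POST -F "file=@$3" "$(upload_url "$1" "$2")"
}

# Usage (the source id is a placeholder):
#   upload_file http://localhost:7880 <source-id> ./invoice.pdf
```

Keeping the URL construction in its own function makes it easy to point the same helper at a different host or a reverse proxy.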

Common Use Cases

  • Small business invoice and receipt management with automatic vendor recognition
  • Legal document archiving with advanced search and metadata extraction
  • Personal paperwork organization for tax documents, contracts, and correspondence
  • Academic research paper collection with full-text search and citation management
  • Real estate document management for property records and transaction history
  • Medical practice patient record digitization with self-hosted storage (HIPAA compliance depends on how the deployment is secured and operated)
  • Non-profit organization grant application and compliance document tracking

Prerequisites

  • Docker and Docker Compose installed with at least 2GB available RAM for optimal OCR processing
  • Port 7880 available for Docspell web interface access
  • Environment variables configured for DB_PASSWORD, ADMIN_SECRET, and AUTH_SECRET
  • Basic understanding of document management workflows and OCR limitations
  • Sufficient disk space for document storage and PostgreSQL/Solr data growth
  • Network access for downloading OCR models and language packs during initial setup
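
The prerequisites above can be checked with a short preflight script. This is a minimal sketch, not part of the recipe: the function names are hypothetical, and it only covers the Docker tooling and the three required variables, not port availability or disk space.

```shell
# Succeeds if every required variable has a non-empty value in the given .env file.
check_env_file() {
  local file="$1" var missing=0
  for var in DB_PASSWORD ADMIN_SECRET AUTH_SECRET; do
    if ! grep -Eq "^${var}=.+" "$file" 2>/dev/null; then
      echo "missing or empty: ${var}" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Succeeds if the docker CLI and its compose plugin are both available.
check_docker() {
  command -v docker >/dev/null 2>&1 && docker compose version >/dev/null 2>&1
}

# Usage (run from the directory that holds docker-compose.yml and .env):
#   check_docker && check_env_file .env && echo "preflight ok"
```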

For development & testing. Review security settings, change default credentials, and test thoroughly before production use.

docker-compose.yml

services:
  restserver:
    image: docspell/restserver:latest
    container_name: docspell-restserver
    environment:
      # Docspell derives variable names from config keys: "." becomes "_" and "-" becomes "__".
      - DOCSPELL_SERVER_INTERNAL__URL=http://restserver:7880
      - DOCSPELL_SERVER_ADMIN__ENDPOINT_SECRET=${ADMIN_SECRET}
      - DOCSPELL_SERVER_AUTH_SERVER__SECRET=${AUTH_SECRET}
      - DOCSPELL_SERVER_BACKEND_JDBC_URL=jdbc:postgresql://db:5432/docspell
      - DOCSPELL_SERVER_BACKEND_JDBC_USER=docspell
      - DOCSPELL_SERVER_BACKEND_JDBC_PASSWORD=${DB_PASSWORD}
      - DOCSPELL_SERVER_FULL__TEXT__SEARCH_ENABLED=true
      - DOCSPELL_SERVER_FULL__TEXT__SEARCH_SOLR_URL=http://solr:8983/solr/docspell
    ports:
      - "7880:7880"
    depends_on:
      - db
      - solr
    networks:
      - docspell-network
    restart: unless-stopped

  joex:
    image: docspell/joex:latest
    container_name: docspell-joex
    environment:
      - DOCSPELL_JOEX_APP__ID=joex1
      - DOCSPELL_JOEX_BASE__URL=http://joex:7878
      - DOCSPELL_JOEX_JDBC_URL=jdbc:postgresql://db:5432/docspell
      - DOCSPELL_JOEX_JDBC_USER=docspell
      - DOCSPELL_JOEX_JDBC_PASSWORD=${DB_PASSWORD}
      - DOCSPELL_JOEX_FULL__TEXT__SEARCH_ENABLED=true
      - DOCSPELL_JOEX_FULL__TEXT__SEARCH_SOLR_URL=http://solr:8983/solr/docspell
    depends_on:
      - db
      - solr
    networks:
      - docspell-network
    restart: unless-stopped

  db:
    image: postgres:15-alpine
    container_name: docspell-db
    environment:
      - POSTGRES_USER=docspell
      - POSTGRES_PASSWORD=${DB_PASSWORD}
      - POSTGRES_DB=docspell
    volumes:
      - postgres-data:/var/lib/postgresql/data
    networks:
      - docspell-network
    restart: unless-stopped

  solr:
    image: solr:9
    container_name: docspell-solr
    command: solr-precreate docspell
    volumes:
      - solr-data:/var/solr
    networks:
      - docspell-network
    restart: unless-stopped

volumes:
  postgres-data:
  solr-data:

networks:
  docspell-network:
    driver: bridge

.env Template

.env
# Docspell
DB_PASSWORD=secure_docspell_password

# Generate with: openssl rand -hex 32
ADMIN_SECRET=your_admin_secret
AUTH_SECRET=your_auth_secret
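
Rather than editing the placeholders by hand, the three values can be generated. The sketch below follows the template's own hint (openssl rand -hex 32) and falls back to /dev/urandom if openssl is not installed. The function names gen_secret and write_env are hypothetical, and using a hex string for DB_PASSWORD is a choice made here to avoid characters that need escaping.

```shell
# Print a random 32-byte secret as 64 lowercase hex characters.
gen_secret() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -hex 32
  else
    # Fallback: read 32 random bytes and hex-encode them.
    head -c 32 /dev/urandom | od -An -v -tx1 | tr -d ' \n'
    echo
  fi
}

# Write a .env file with fresh values to the given path (default: ./.env).
write_env() {
  local target="${1:-.env}"
  (
    umask 077   # the file holds secrets, so keep it owner-readable only
    {
      echo "# Docspell"
      echo "DB_PASSWORD=$(gen_secret)"
      echo "ADMIN_SECRET=$(gen_secret)"
      echo "AUTH_SECRET=$(gen_secret)"
    } > "$target"
  )
}

# Usage:
#   write_env .env
```

Generate the file before the first start. The postgres image applies POSTGRES_PASSWORD only when it initializes an empty data directory, so changing DB_PASSWORD later does not change the password stored in an existing postgres-data volume.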

Usage Notes

  1. Web UI at http://localhost:7880
  2. Register the first account
  3. OCR for scanned documents
  4. Full-text search with Solr
  5. Automatic tagging and organization
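
The web UI may take a little while to answer after the containers start. A small retry helper, sketched below, can wait for it before you open the browser. The helper name retry and the attempt and delay values in the usage line are illustrative assumptions, not part of the recipe.

```shell
# Run a command repeatedly until it succeeds or the attempts run out.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Usage: poll the web UI for up to about a minute.
#   retry 30 2 curl -fsS -o /dev/null http://localhost:7880 && echo "Docspell is up"
```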

Individual Services (4 services)

Copy individual services to mix and match with your existing compose files.

restserver
restserver:
  image: docspell/restserver:latest
  container_name: docspell-restserver
  environment:
    - DOCSPELL_SERVER_INTERNAL__URL=http://restserver:7880
    - DOCSPELL_SERVER_ADMIN__ENDPOINT_SECRET=${ADMIN_SECRET}
    - DOCSPELL_SERVER_AUTH_SERVER__SECRET=${AUTH_SECRET}
    - DOCSPELL_SERVER_BACKEND_JDBC_URL=jdbc:postgresql://db:5432/docspell
    - DOCSPELL_SERVER_BACKEND_JDBC_USER=docspell
    - DOCSPELL_SERVER_BACKEND_JDBC_PASSWORD=${DB_PASSWORD}
    - DOCSPELL_SERVER_FULL__TEXT__SEARCH_ENABLED=true
    - DOCSPELL_SERVER_FULL__TEXT__SEARCH_SOLR_URL=http://solr:8983/solr/docspell
  ports:
    - "7880:7880"
  depends_on:
    - db
    - solr
  networks:
    - docspell-network
  restart: unless-stopped
joex
joex:
  image: docspell/joex:latest
  container_name: docspell-joex
  environment:
    - DOCSPELL_JOEX_APP__ID=joex1
    - DOCSPELL_JOEX_BASE__URL=http://joex:7878
    - DOCSPELL_JOEX_JDBC_URL=jdbc:postgresql://db:5432/docspell
    - DOCSPELL_JOEX_JDBC_USER=docspell
    - DOCSPELL_JOEX_JDBC_PASSWORD=${DB_PASSWORD}
    - DOCSPELL_JOEX_FULL__TEXT__SEARCH_ENABLED=true
    - DOCSPELL_JOEX_FULL__TEXT__SEARCH_SOLR_URL=http://solr:8983/solr/docspell
  depends_on:
    - db
    - solr
  networks:
    - docspell-network
  restart: unless-stopped
db
db:
  image: postgres:15-alpine
  container_name: docspell-db
  environment:
    - POSTGRES_USER=docspell
    - POSTGRES_PASSWORD=${DB_PASSWORD}
    - POSTGRES_DB=docspell
  volumes:
    - postgres-data:/var/lib/postgresql/data
  networks:
    - docspell-network
  restart: unless-stopped
solr
solr:
  image: solr:9
  container_name: docspell-solr
  command: solr-precreate docspell
  volumes:
    - solr-data:/var/solr
  networks:
    - docspell-network
  restart: unless-stopped

Quick Start

terminal
# 1. Create the compose file
cat > docker-compose.yml << 'EOF'
services:
  restserver:
    image: docspell/restserver:latest
    container_name: docspell-restserver
    environment:
      - DOCSPELL_SERVER_INTERNAL__URL=http://restserver:7880
      - DOCSPELL_SERVER_ADMIN__ENDPOINT_SECRET=${ADMIN_SECRET}
      - DOCSPELL_SERVER_AUTH_SERVER__SECRET=${AUTH_SECRET}
      - DOCSPELL_SERVER_BACKEND_JDBC_URL=jdbc:postgresql://db:5432/docspell
      - DOCSPELL_SERVER_BACKEND_JDBC_USER=docspell
      - DOCSPELL_SERVER_BACKEND_JDBC_PASSWORD=${DB_PASSWORD}
      - DOCSPELL_SERVER_FULL__TEXT__SEARCH_ENABLED=true
      - DOCSPELL_SERVER_FULL__TEXT__SEARCH_SOLR_URL=http://solr:8983/solr/docspell
    ports:
      - "7880:7880"
    depends_on:
      - db
      - solr
    networks:
      - docspell-network
    restart: unless-stopped

  joex:
    image: docspell/joex:latest
    container_name: docspell-joex
    environment:
      - DOCSPELL_JOEX_APP__ID=joex1
      - DOCSPELL_JOEX_BASE__URL=http://joex:7878
      - DOCSPELL_JOEX_JDBC_URL=jdbc:postgresql://db:5432/docspell
      - DOCSPELL_JOEX_JDBC_USER=docspell
      - DOCSPELL_JOEX_JDBC_PASSWORD=${DB_PASSWORD}
      - DOCSPELL_JOEX_FULL__TEXT__SEARCH_ENABLED=true
      - DOCSPELL_JOEX_FULL__TEXT__SEARCH_SOLR_URL=http://solr:8983/solr/docspell
    depends_on:
      - db
      - solr
    networks:
      - docspell-network
    restart: unless-stopped

  db:
    image: postgres:15-alpine
    container_name: docspell-db
    environment:
      - POSTGRES_USER=docspell
      - POSTGRES_PASSWORD=${DB_PASSWORD}
      - POSTGRES_DB=docspell
    volumes:
      - postgres-data:/var/lib/postgresql/data
    networks:
      - docspell-network
    restart: unless-stopped

  solr:
    image: solr:9
    container_name: docspell-solr
    command: solr-precreate docspell
    volumes:
      - solr-data:/var/solr
    networks:
      - docspell-network
    restart: unless-stopped

volumes:
  postgres-data:
  solr-data:

networks:
  docspell-network:
    driver: bridge
EOF

# 2. Create the .env file
cat > .env << 'EOF'
# Docspell
DB_PASSWORD=secure_docspell_password

# Generate with: openssl rand -hex 32
ADMIN_SECRET=your_admin_secret
AUTH_SECRET=your_auth_secret
EOF

# 3. Start the services
docker compose up -d

# 4. View logs
docker compose logs -f

One-Liner

Run this command to download and set up the recipe in one step:

terminal
curl -fsSL https://docker.recipes/api/recipes/docspell-dms/run | bash

Troubleshooting

  • OCR processing fails or produces poor results: Ensure uploaded documents have sufficient resolution (300+ DPI) and check the joex container logs for Tesseract errors
  • Solr search returns no results for existing documents: Verify the Solr core 'docspell' was created properly and check full-text search is enabled in both restserver and joex configurations
  • Database connection errors in restserver logs: Confirm the PostgreSQL container has fully started and that the DB_PASSWORD value is the same for the db, restserver, and joex services
  • Documents stuck in processing queue: Check joex container resources and restart the joex service, as OCR operations can be memory-intensive
  • Login fails after initial setup: Verify that ADMIN_SECRET and AUTH_SECRET are set and stay the same across restarts, then check the restserver logs for authentication errors
  • File uploads fail or timeout: Increase Docker container memory limits and check available disk space in the postgres-data and solr-data volumes
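
Most of the checks above can be run from the directory that holds docker-compose.yml. The following is a sketch under a few assumptions: the function names are hypothetical, the log filter pattern is only a starting point, and it assumes curl is available inside the solr container. The service, user, database, and core names come from the compose file.

```shell
# Build the Solr CoreAdmin STATUS URL for a core.
# $1 = Solr base URL without the /solr suffix, $2 = core name.
solr_status_url() {
  printf '%s/solr/admin/cores?action=STATUS&core=%s\n' "${1%/}" "$2"
}

# Run the basic health checks for this stack.
diagnose() {
  # Container state for all four services.
  docker compose ps

  # Is PostgreSQL accepting connections yet?
  docker compose exec -T db pg_isready -U docspell -d docspell

  # Was the 'docspell' core created? (ASSUMPTION: curl exists in the solr image.)
  docker compose exec -T solr \
    curl -fsS "$(solr_status_url http://localhost:8983 docspell)"

  # Recent joex output, filtered for OCR and memory problems.
  docker compose logs --tail=200 joex | grep -iE 'tesseract|error|outofmemory' || true

  # Disk usage per image, container, and volume.
  docker system df -v
}

# Usage:
#   diagnose
```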

