Docspell Document Management
Personal document organizer with OCR and full-text search.
Overview
Docspell is an open-source personal document management system designed to organize, process, and search through documents using advanced OCR and machine learning capabilities. Built with Scala and developed as a privacy-focused alternative to cloud-based document solutions, Docspell excels at automatically extracting metadata from documents, recognizing correspondents, and categorizing content through intelligent tagging. The system processes various document formats including PDFs, images, and office documents, making them fully searchable and organized.
This deployment creates a complete four-service Docspell environment with the restserver handling the web interface and API operations, joex managing background processing tasks like OCR and document analysis, a PostgreSQL database for metadata storage, and Apache Solr providing powerful full-text search capabilities. The restserver acts as the user-facing component on port 7880, while joex operates behind the scenes to process uploaded documents, extract text through OCR, and perform automated classification tasks.
This configuration suits individuals, small businesses, or teams who need to digitize and organize large volumes of paper documents, automate invoice processing, or create searchable archives of correspondence. PostgreSQL provides reliable metadata storage and Solr provides fast full-text retrieval, which makes the stack particularly useful for legal offices, accounting firms, or anyone dealing with document-heavy workflows.
Key Features
- Advanced OCR processing with automatic text extraction from scanned documents and images
- Machine learning-powered automatic tagging and correspondent recognition
- Full-text search capabilities powered by Apache Solr with highlighting and relevance ranking
- Multi-format document support including PDF, DOCX, images, and email attachments
- Automated document classification with customizable rules and patterns
- RESTful API for integration with external systems and workflow automation
- Multi-user support with role-based access control and organization management
- Background job processing system for handling large document batches without blocking the interface
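The REST API mentioned in the feature list can be exercised with plain curl. The following is a minimal sketch, not a definitive client: it assumes Docspell's source-based upload path `/api/v1/open/upload/item/<source-id>` and the multipart field name `file`, both of which should be checked against the API documentation of the version you run. The helper names `upload_url` and `upload_file` are hypothetical, and the source id is a placeholder for one you create in the web UI.

```shell
# Sketch: upload a file through a Docspell "source" URL.
# Assumptions: the endpoint path and the "file" field name match your
# Docspell version; upload_url and upload_file are illustrative helpers.

# Build the upload URL from a base URL and a source id.
upload_url() {
  local base="${1%/}"   # drop one trailing slash, if any
  local source_id="$2"
  printf '%s/api/v1/open/upload/item/%s\n' "$base" "$source_id"
}

# Upload one file and print the server's response body.
upload_file() {
  local base="$1"
  local source_id="$2"
  local file="$3"
  curl -fsS -X POST -F "file=@${file}" "$(upload_url "$base" "$source_id")"
}
```

With the stack running, a call might look like `upload_file http://localhost:7880 YOUR_SOURCE_ID invoice.pdf`. Keeping the URL construction in its own function makes it easy to check the target before any file leaves the machine.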
Common Use Cases
- Small business invoice and receipt management with automatic vendor recognition
- Legal document archiving with advanced search and metadata extraction
- Personal paperwork organization for tax documents, contracts, and correspondence
- Academic research paper collection with full-text search and citation management
- Real estate document management for property records and transaction history
- Medical practice patient record digitization and HIPAA-compliant storage
- Non-profit organization grant application and compliance document tracking
Prerequisites
- Docker and Docker Compose installed, with at least 2GB of RAM available for OCR processing
- Port 7880 available for Docspell web interface access
- Environment variables configured for DB_PASSWORD, ADMIN_SECRET, and AUTH_SECRET
- Basic understanding of document management workflows and OCR limitations
- Sufficient disk space for document storage and PostgreSQL/Solr data growth
- Network access for downloading OCR models and language packs during initial setup
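The three environment variables listed above can be generated rather than typed by hand. The sketch below writes a `.env` file with random values, using the same `openssl rand -hex 32` command the recipe's .env template suggests and falling back to `/dev/urandom` when openssl is not installed. The function names `random_hex` and `write_docspell_env` are hypothetical helpers, not part of Docspell.

```shell
# Sketch: write a .env file with random values for the three variables
# the compose file expects. Helper names are illustrative only.

# Print 32 random bytes as 64 hex characters.
random_hex() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -hex 32
  else
    # Fallback when openssl is missing.
    head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
    echo
  fi
}

# Write the file to the given path (default: ./.env).
# Refuses to overwrite an existing file so secrets are not rotated by accident.
write_docspell_env() {
  local target="${1:-.env}"
  if [ -e "$target" ]; then
    echo "refusing to overwrite $target" >&2
    return 1
  fi
  (
    umask 077  # keep the secrets readable by the owner only
    {
      echo "DB_PASSWORD=$(random_hex)"
      echo "ADMIN_SECRET=$(random_hex)"
      echo "AUTH_SECRET=$(random_hex)"
    } > "$target"
  )
}
```

Run `write_docspell_env` once in the directory that holds the compose file. The overwrite guard matters because the official postgres image applies `POSTGRES_PASSWORD` only when it initializes an empty data directory, so regenerating `DB_PASSWORD` after the first start would leave the other services unable to connect.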
For development & testing. Review security settings, change default credentials, and test thoroughly before production use.
docker-compose.yml
    services:
      restserver:
        image: docspell/restserver:latest
        container_name: docspell-restserver
        environment:
          - DOCSPELL_SERVER_INTERNAL__URL=http://restserver:7880
          - DOCSPELL_SERVER_ADMIN__ENDPOINT__SECRET=${ADMIN_SECRET}
          - DOCSPELL_SERVER_AUTH__SERVER__SECRET=${AUTH_SECRET}
          - DOCSPELL_SERVER_BACKEND__JDBC__URL=jdbc:postgresql://db:5432/docspell
          - DOCSPELL_SERVER_BACKEND__JDBC__USER=docspell
          - DOCSPELL_SERVER_BACKEND__JDBC__PASSWORD=${DB_PASSWORD}
          - DOCSPELL_SERVER_FULL__TEXT__SEARCH__ENABLED=true
          - DOCSPELL_SERVER_FULL__TEXT__SEARCH__SOLR__URL=http://solr:8983/solr/docspell
        ports:
          - "7880:7880"
        depends_on:
          - db
          - solr
        networks:
          - docspell-network
        restart: unless-stopped

      joex:
        image: docspell/joex:latest
        container_name: docspell-joex
        environment:
          - DOCSPELL_JOEX_APP__ID=joex1
          - DOCSPELL_JOEX_BASE__URL=http://joex:7878
          - DOCSPELL_JOEX_JDBC__URL=jdbc:postgresql://db:5432/docspell
          - DOCSPELL_JOEX_JDBC__USER=docspell
          - DOCSPELL_JOEX_JDBC__PASSWORD=${DB_PASSWORD}
          - DOCSPELL_JOEX_FULL__TEXT__SEARCH__ENABLED=true
          - DOCSPELL_JOEX_FULL__TEXT__SEARCH__SOLR__URL=http://solr:8983/solr/docspell
        depends_on:
          - db
          - solr
        networks:
          - docspell-network
        restart: unless-stopped

      db:
        image: postgres:15-alpine
        container_name: docspell-db
        environment:
          - POSTGRES_USER=docspell
          - POSTGRES_PASSWORD=${DB_PASSWORD}
          - POSTGRES_DB=docspell
        volumes:
          - postgres-data:/var/lib/postgresql/data
        networks:
          - docspell-network
        restart: unless-stopped

      solr:
        image: solr:9
        container_name: docspell-solr
        command: solr-precreate docspell
        volumes:
          - solr-data:/var/solr
        networks:
          - docspell-network
        restart: unless-stopped

    volumes:
      postgres-data:
      solr-data:

    networks:
      docspell-network:
        driver: bridge

.env Template
.env
    # Docspell
    DB_PASSWORD=secure_docspell_password

    # Generate with: openssl rand -hex 32
    ADMIN_SECRET=your_admin_secret
    AUTH_SECRET=your_auth_secret

Usage Notes
- Web UI at http://localhost:7880
- Register first account
- OCR for scanned documents
- Full-text search with Solr
- Automatic tagging and organization
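The web UI may take a while to answer after the containers start, since the restserver has to connect to PostgreSQL and Solr first. A small retry helper, sketched below, can wait for it before you try to register the first account. The function name `retry` is a hypothetical helper, and the only URL assumed is the `http://localhost:7880` address given in the notes above.

```shell
# retry ATTEMPTS DELAY COMMAND...
# Run COMMAND until it succeeds, at most ATTEMPTS times, sleeping
# DELAY seconds between tries. Returns 0 on the first success and
# 1 if every attempt failed.
retry() {
  local attempts="$1"
  local delay="$2"
  local i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

For example, `retry 30 2 curl -fsS -o /dev/null http://localhost:7880 && echo "Docspell is up"` polls the web UI for up to about a minute. Passing the probe as an argument keeps the helper reusable for other checks.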
Individual Services (4 services)
Copy individual services to mix and match with your existing compose files.
restserver
    restserver:
      image: docspell/restserver:latest
      container_name: docspell-restserver
      environment:
        - DOCSPELL_SERVER_INTERNAL__URL=http://restserver:7880
        - DOCSPELL_SERVER_ADMIN__ENDPOINT__SECRET=${ADMIN_SECRET}
        - DOCSPELL_SERVER_AUTH__SERVER__SECRET=${AUTH_SECRET}
        - DOCSPELL_SERVER_BACKEND__JDBC__URL=jdbc:postgresql://db:5432/docspell
        - DOCSPELL_SERVER_BACKEND__JDBC__USER=docspell
        - DOCSPELL_SERVER_BACKEND__JDBC__PASSWORD=${DB_PASSWORD}
        - DOCSPELL_SERVER_FULL__TEXT__SEARCH__ENABLED=true
        - DOCSPELL_SERVER_FULL__TEXT__SEARCH__SOLR__URL=http://solr:8983/solr/docspell
      ports:
        - "7880:7880"
      depends_on:
        - db
        - solr
      networks:
        - docspell-network
      restart: unless-stopped
joex
    joex:
      image: docspell/joex:latest
      container_name: docspell-joex
      environment:
        - DOCSPELL_JOEX_APP__ID=joex1
        - DOCSPELL_JOEX_BASE__URL=http://joex:7878
        - DOCSPELL_JOEX_JDBC__URL=jdbc:postgresql://db:5432/docspell
        - DOCSPELL_JOEX_JDBC__USER=docspell
        - DOCSPELL_JOEX_JDBC__PASSWORD=${DB_PASSWORD}
        - DOCSPELL_JOEX_FULL__TEXT__SEARCH__ENABLED=true
        - DOCSPELL_JOEX_FULL__TEXT__SEARCH__SOLR__URL=http://solr:8983/solr/docspell
      depends_on:
        - db
        - solr
      networks:
        - docspell-network
      restart: unless-stopped
db
    db:
      image: postgres:15-alpine
      container_name: docspell-db
      environment:
        - POSTGRES_USER=docspell
        - POSTGRES_PASSWORD=${DB_PASSWORD}
        - POSTGRES_DB=docspell
      volumes:
        - postgres-data:/var/lib/postgresql/data
      networks:
        - docspell-network
      restart: unless-stopped
solr
    solr:
      image: solr:9
      container_name: docspell-solr
      command: solr-precreate docspell
      volumes:
        - solr-data:/var/solr
      networks:
        - docspell-network
      restart: unless-stopped
Quick Start
    # 1. Create the compose file
    cat > docker-compose.yml << 'EOF'
    services:
      restserver:
        image: docspell/restserver:latest
        container_name: docspell-restserver
        environment:
          - DOCSPELL_SERVER_INTERNAL__URL=http://restserver:7880
          - DOCSPELL_SERVER_ADMIN__ENDPOINT__SECRET=${ADMIN_SECRET}
          - DOCSPELL_SERVER_AUTH__SERVER__SECRET=${AUTH_SECRET}
          - DOCSPELL_SERVER_BACKEND__JDBC__URL=jdbc:postgresql://db:5432/docspell
          - DOCSPELL_SERVER_BACKEND__JDBC__USER=docspell
          - DOCSPELL_SERVER_BACKEND__JDBC__PASSWORD=${DB_PASSWORD}
          - DOCSPELL_SERVER_FULL__TEXT__SEARCH__ENABLED=true
          - DOCSPELL_SERVER_FULL__TEXT__SEARCH__SOLR__URL=http://solr:8983/solr/docspell
        ports:
          - "7880:7880"
        depends_on:
          - db
          - solr
        networks:
          - docspell-network
        restart: unless-stopped

      joex:
        image: docspell/joex:latest
        container_name: docspell-joex
        environment:
          - DOCSPELL_JOEX_APP__ID=joex1
          - DOCSPELL_JOEX_BASE__URL=http://joex:7878
          - DOCSPELL_JOEX_JDBC__URL=jdbc:postgresql://db:5432/docspell
          - DOCSPELL_JOEX_JDBC__USER=docspell
          - DOCSPELL_JOEX_JDBC__PASSWORD=${DB_PASSWORD}
          - DOCSPELL_JOEX_FULL__TEXT__SEARCH__ENABLED=true
          - DOCSPELL_JOEX_FULL__TEXT__SEARCH__SOLR__URL=http://solr:8983/solr/docspell
        depends_on:
          - db
          - solr
        networks:
          - docspell-network
        restart: unless-stopped

      db:
        image: postgres:15-alpine
        container_name: docspell-db
        environment:
          - POSTGRES_USER=docspell
          - POSTGRES_PASSWORD=${DB_PASSWORD}
          - POSTGRES_DB=docspell
        volumes:
          - postgres-data:/var/lib/postgresql/data
        networks:
          - docspell-network
        restart: unless-stopped

      solr:
        image: solr:9
        container_name: docspell-solr
        command: solr-precreate docspell
        volumes:
          - solr-data:/var/solr
        networks:
          - docspell-network
        restart: unless-stopped

    volumes:
      postgres-data:
      solr-data:

    networks:
      docspell-network:
        driver: bridge
    EOF

    # 2. Create the .env file
    cat > .env << 'EOF'
    # Docspell
    DB_PASSWORD=secure_docspell_password

    # Generate with: openssl rand -hex 32
    ADMIN_SECRET=your_admin_secret
    AUTH_SECRET=your_auth_secret
    EOF

    # 3. Start the services
    docker compose up -d

    # 4. View logs
    docker compose logs -f

One-Liner
Run this command to download and set up the recipe in one step:
    curl -fsSL https://docker.recipes/api/recipes/docspell-dms/run | bash

Troubleshooting
- OCR processing fails or produces poor results: Ensure uploaded documents have sufficient resolution (300+ DPI) and check the joex container logs for Tesseract errors
- Solr search returns no results for existing documents: Verify the Solr core 'docspell' was created properly and check full-text search is enabled in both restserver and joex configurations
- Database connection errors in restserver logs: Confirm the PostgreSQL container is fully started and that the DB_PASSWORD environment variable matches across the db, restserver, and joex services
- Documents stuck in processing queue: Check joex container resources and restart the joex service, as OCR operations can be memory-intensive
- Login fails after initial setup: Verify ADMIN_SECRET and AUTH_SECRET are set and stay the same across restarts, and check the restserver logs for authentication errors
- File uploads fail or timeout: Increase Docker container memory limits and check available disk space in the postgres-data and solr-data volumes
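Several of the checks above can be gathered into one diagnostic pass. The sketch below is one way to do that under stated assumptions: `missing_services` and `diagnose` are hypothetical helper names, the service names come from the compose file, and the Solr check assumes `wget` is available inside the solr image. The log filter pattern is only a starting point, not a list of messages Docspell is known to emit.

```shell
# Services the compose file defines.
EXPECTED_SERVICES="restserver joex db solr"

# Read running service names on stdin (one per line) and print every
# expected service that is absent. Prints nothing when all are up.
missing_services() {
  local running
  local svc
  running="$(cat)"
  for svc in $EXPECTED_SERVICES; do
    if ! printf '%s\n' "$running" | grep -qx "$svc"; then
      printf '%s\n' "$svc"
    fi
  done
}

# Run the checks against the live stack (requires Docker).
diagnose() {
  echo "== services not running =="
  docker compose ps --services --filter status=running | missing_services

  echo "== PostgreSQL readiness =="
  docker compose exec -T db pg_isready -U docspell

  echo "== Solr core status =="
  # Assumption: wget exists in the solr image.
  docker compose exec -T solr wget -qO- \
    'http://localhost:8983/solr/admin/cores?action=STATUS&core=docspell'

  echo "== recent joex log lines worth a look =="
  docker compose logs --tail 200 joex | grep -iE 'error|tesseract|outofmemory' || true
}
```

Call `diagnose` from the directory that holds the compose file. The service comparison is kept separate from the Docker calls so it can be checked on its own, for example with `printf 'restserver\ndb\n' | missing_services`.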
Components
docspell-restserver, docspell-joex, solr, postgresql
Tags
#document-management #docspell #ocr #organization #dms
Category
Productivity & Collaboration