docker.recipes

Paperless-ngx

intermediate

Document management system with OCR.

Overview

Paperless-ngx is a document management system that digitizes and organizes paper documents through optical character recognition (OCR) and indexing. It continues the original Paperless project by way of the paperless-ng fork, and adds improved OCR processing, machine learning-based document classification, and a polished web interface aimed at home users and small offices. Scanned documents and PDFs become a searchable archive with automatic tagging and metadata extraction.

This Docker stack combines paperless-ngx with PostgreSQL and Redis. PostgreSQL is the primary database and stores document metadata, tags, and correspondents with ACID guarantees. Redis is the message broker for the background task queue, so OCR and other processing run outside the web request cycle and the interface stays responsive during heavy imports.

This configuration suits individuals moving to paperless workflows, small businesses that need document archival, and home lab users organizing years of accumulated paperwork. It offers email document consumption, mobile access, and API integration for custom workflows without the cost of commercial solutions.

Key Features

  • Tesseract OCR engine with multiple language support for text extraction from scanned documents
  • Machine learning-powered document classification and automatic tagging based on content patterns
  • Full-text search across the OCR-extracted contents of every document
  • Email consumption for automatic document import from configured email accounts
  • Correspondent and document type auto-detection using configurable matching rules
  • Web-based document viewer with annotation support and mobile-responsive interface
  • Bulk document processing with Redis-powered background task queuing
  • REST API for integration with external systems and custom automation workflows
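
The REST API item above can be made concrete with a few curl wrappers. This is a minimal sketch, not the project's official client: the function names and the PAPERLESS_URL and PAPERLESS_TOKEN variables are chosen for this example, and it assumes the stack is reachable at http://localhost:8000 as configured in this recipe.

```shell
# Hypothetical helpers for the paperless-ngx REST API.
# PAPERLESS_URL and PAPERLESS_TOKEN are names chosen for this sketch.
PAPERLESS_URL="${PAPERLESS_URL:-http://localhost:8000}"

# Exchange a username and password for an API token (POST /api/token/).
# Prints the JSON response, which contains the token.
paperless_token() {
  curl -fsS -X POST "$PAPERLESS_URL/api/token/" \
    --data-urlencode "username=$1" \
    --data-urlencode "password=$2"
}

# Upload one file for consumption (POST /api/documents/post_document/).
paperless_upload() {
  curl -fsS -X POST "$PAPERLESS_URL/api/documents/post_document/" \
    -H "Authorization: Token $PAPERLESS_TOKEN" \
    -F "document=@$1"
}

# Full-text search (GET /api/documents/?query=...). Prints the JSON result page.
paperless_search() {
  curl -fsS -G "$PAPERLESS_URL/api/documents/" \
    -H "Authorization: Token $PAPERLESS_TOKEN" \
    --data-urlencode "query=$1"
}
```

A typical session would call paperless_token once, export the returned token as PAPERLESS_TOKEN, and then run paperless_upload scan.pdf or paperless_search "invoice 2023".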

Common Use Cases

  • Home office digitization for tax documents, receipts, and important paperwork organization
  • Small business invoice and contract management with automatic vendor recognition
  • Legal practice document archival with full-text search across case files and correspondence
  • Property management company tenant file organization and lease document tracking
  • Medical office patient record digitization with self-hosted local storage
  • Academic research paper organization with full-text search and tagging
  • Personal finance management with receipt scanning and expense categorization

Prerequisites

  • Minimum 1GB RAM for OCR processing and PostgreSQL operations (2GB+ recommended for heavy usage)
  • At least 10GB free disk space for document storage and database growth
  • Port 8000 available for paperless-ngx web interface access
  • Understanding of document scanning workflows and file organization concepts
  • Basic knowledge of environment variable configuration for database credentials
  • Scanner or mobile app capability for document digitization input
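
The disk, memory, and port prerequisites above can be checked before starting the stack. The following is a rough preflight sketch for a Linux host; the function names are made up for this example, the RAM check reads /proc/meminfo, and the port check assumes the ss tool from iproute2 is installed.

```shell
# Preflight sketch for a Linux host; thresholds mirror the prerequisites list.
# All function names here are illustrative.

# Free disk space in whole GB for the filesystem holding the given path.
free_disk_gb() {
  df -Pk "${1:-.}" | awk 'NR==2 { print int($4 / 1048576) }'
}

# Total RAM in whole MB, read from /proc/meminfo (Linux only).
total_ram_mb() {
  awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo
}

# Succeeds when nothing is listening on the given TCP port.
# If ss is not installed, this reports the port as free.
port_is_free() {
  ! ss -ltn | awk '{ print $4 }' | grep -Eq "[:.]$1\$"
}

preflight() {
  [ "$(free_disk_gb .)" -ge 10 ] || echo "warning: less than 10 GB free disk space"
  # MemTotal is slightly below installed RAM, so compare against 900 MB.
  [ "$(total_ram_mb)" -ge 900 ] || echo "warning: less than 1 GB RAM"
  port_is_free 8000 || echo "warning: port 8000 is already in use"
}
```

Running preflight in the directory that will hold the compose file prints one warning per unmet prerequisite and nothing when all three pass.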

For development & testing. Review security settings, change default credentials, and test thoroughly before production use.

docker-compose.yml

docker-compose.yml
services:
  paperless:
    image: ghcr.io/paperless-ngx/paperless-ngx:latest
    container_name: paperless
    restart: unless-stopped
    environment:
      PAPERLESS_REDIS: redis://redis:6379
      PAPERLESS_DBHOST: postgres
      PAPERLESS_DBNAME: ${DB_NAME}
      PAPERLESS_DBUSER: ${DB_USER}
      PAPERLESS_DBPASS: ${DB_PASSWORD}
      PAPERLESS_ADMIN_USER: ${ADMIN_USER}
      PAPERLESS_ADMIN_PASSWORD: ${ADMIN_PASSWORD}
    volumes:
      - paperless_data:/usr/src/paperless/data
      - paperless_media:/usr/src/paperless/media
      - paperless_export:/usr/src/paperless/export
      - paperless_consume:/usr/src/paperless/consume
    ports:
      - "8000:8000"
    depends_on:
      - postgres
      - redis
    networks:
      - paperless

  postgres:
    image: postgres:16-alpine
    container_name: paperless-postgres
    environment:
      POSTGRES_DB: ${DB_NAME}
      POSTGRES_USER: ${DB_USER}
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    networks:
      - paperless

  redis:
    image: redis:alpine
    container_name: paperless-redis
    networks:
      - paperless

volumes:
  paperless_data:
  paperless_media:
  paperless_export:
  paperless_consume:
  postgres_data:

networks:
  paperless:
    driver: bridge

.env Template

.env
DB_NAME=paperless
DB_USER=paperless
DB_PASSWORD=changeme
ADMIN_USER=admin
ADMIN_PASSWORD=changeme
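
Because the template ships with changeme for both passwords, one option is to generate the file with random secrets instead of editing it by hand. This is a sketch under stated assumptions: ENV_FILE and random_secret are names invented for this example, and /dev/urandom is assumed to be available as the randomness source.

```shell
# Sketch: write a .env with random secrets instead of "changeme".
# ENV_FILE and random_secret are names chosen for this example.
ENV_FILE="${ENV_FILE:-.env}"

# 24 random bytes from /dev/urandom, hex-encoded (48 characters).
# Hex output avoids characters that would need quoting in a .env file.
random_secret() {
  head -c 24 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

(
  umask 077   # the new file is readable and writable by its owner only
  cat > "$ENV_FILE" <<EOF
DB_NAME=paperless
DB_USER=paperless
DB_PASSWORD=$(random_secret)
ADMIN_USER=admin
ADMIN_PASSWORD=$(random_secret)
EOF
)
```

Generate the file before the first docker compose up: the postgres image applies POSTGRES_PASSWORD only when it initializes an empty data volume, so changing DB_PASSWORD later does not change the password stored in an existing postgres_data volume.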

Usage Notes

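  1. Docs: https://docs.paperless-ngx.com/
  2. Access the web interface at http://localhost:8000 and log in with ADMIN_USER/ADMIN_PASSWORD
  3. Drop or scan files into the consume folder for automatic import. In this stack it is the paperless_consume named volume mounted at /usr/src/paperless/consume; bind-mount a host directory there instead if you want to drop files from the host
  4. OCR (Tesseract) extracts text from scanned documents
  5. Auto-tagging, correspondent detection, and document type classification run on imported documents
  6. Mobile apps are available; email import is supported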

Individual Services (3 services)

Copy individual services to mix and match with your existing compose files.

paperless
paperless:
  image: ghcr.io/paperless-ngx/paperless-ngx:latest
  container_name: paperless
  restart: unless-stopped
  environment:
    PAPERLESS_REDIS: redis://redis:6379
    PAPERLESS_DBHOST: postgres
    PAPERLESS_DBNAME: ${DB_NAME}
    PAPERLESS_DBUSER: ${DB_USER}
    PAPERLESS_DBPASS: ${DB_PASSWORD}
    PAPERLESS_ADMIN_USER: ${ADMIN_USER}
    PAPERLESS_ADMIN_PASSWORD: ${ADMIN_PASSWORD}
  volumes:
    - paperless_data:/usr/src/paperless/data
    - paperless_media:/usr/src/paperless/media
    - paperless_export:/usr/src/paperless/export
    - paperless_consume:/usr/src/paperless/consume
  ports:
    - "8000:8000"
  depends_on:
    - postgres
    - redis
  networks:
    - paperless
postgres
postgres:
  image: postgres:16-alpine
  container_name: paperless-postgres
  environment:
    POSTGRES_DB: ${DB_NAME}
    POSTGRES_USER: ${DB_USER}
    POSTGRES_PASSWORD: ${DB_PASSWORD}
  volumes:
    - postgres_data:/var/lib/postgresql/data
  networks:
    - paperless
redis
redis:
  image: redis:alpine
  container_name: paperless-redis
  networks:
    - paperless

Quick Start

terminal
# 1. Create the compose file
cat > docker-compose.yml << 'EOF'
services:
  paperless:
    image: ghcr.io/paperless-ngx/paperless-ngx:latest
    container_name: paperless
    restart: unless-stopped
    environment:
      PAPERLESS_REDIS: redis://redis:6379
      PAPERLESS_DBHOST: postgres
      PAPERLESS_DBNAME: ${DB_NAME}
      PAPERLESS_DBUSER: ${DB_USER}
      PAPERLESS_DBPASS: ${DB_PASSWORD}
      PAPERLESS_ADMIN_USER: ${ADMIN_USER}
      PAPERLESS_ADMIN_PASSWORD: ${ADMIN_PASSWORD}
    volumes:
      - paperless_data:/usr/src/paperless/data
      - paperless_media:/usr/src/paperless/media
      - paperless_export:/usr/src/paperless/export
      - paperless_consume:/usr/src/paperless/consume
    ports:
      - "8000:8000"
    depends_on:
      - postgres
      - redis
    networks:
      - paperless

  postgres:
    image: postgres:16-alpine
    container_name: paperless-postgres
    environment:
      POSTGRES_DB: ${DB_NAME}
      POSTGRES_USER: ${DB_USER}
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    networks:
      - paperless

  redis:
    image: redis:alpine
    container_name: paperless-redis
    networks:
      - paperless

volumes:
  paperless_data:
  paperless_media:
  paperless_export:
  paperless_consume:
  postgres_data:

networks:
  paperless:
    driver: bridge
EOF

# 2. Create the .env file
cat > .env << 'EOF'
DB_NAME=paperless
DB_USER=paperless
DB_PASSWORD=changeme
ADMIN_USER=admin
ADMIN_PASSWORD=changeme
EOF

# 3. Start the services
docker compose up -d

# 4. View logs
docker compose logs -f
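
After docker compose up -d returns, the web interface can take a while to answer because migrations and startup tasks run first. Instead of watching the logs, a script can poll the URL until it responds. This is a minimal sketch; wait_for_http is a hypothetical helper, and its default timeout and interval are arbitrary.

```shell
# Poll a URL until it answers or the timeout expires.
# wait_for_http is a hypothetical helper.
# Arguments: URL, timeout in seconds (default 120), poll interval in seconds (default 3).
wait_for_http() {
  url=$1; timeout=${2:-120}; interval=${3:-3}; waited=0
  # curl -f treats HTTP status 400 and above as failure; redirects count as success.
  until curl -fs -o /dev/null --max-time 5 "$url"; do
    waited=$((waited + interval))
    [ "$waited" -ge "$timeout" ] && return 1
    sleep "$interval"
  done
  return 0
}
```

For this recipe the call would be wait_for_http http://localhost:8000 180, followed by whatever should happen once the interface is up.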

One-Liner

Run this command to download and set up the recipe in one step:

terminal
curl -fsSL https://docker.recipes/api/recipes/paperless-ngx/run | bash
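
Piping a download straight into bash runs the script before anyone has read it. A more cautious variant saves the script to a file first so it can be reviewed. This is a small sketch; fetch_for_review and the default output filename are made up for this example.

```shell
# Sketch: download the setup script to a file so it can be read before running.
# fetch_for_review and the default filename are made-up names.
fetch_for_review() {
  url=$1; out=${2:-paperless-ngx-setup.sh}
  curl -fsSL "$url" -o "$out" || return 1
  echo "Saved to $out; read it, then run: bash $out"
}

# Usage:
#   fetch_for_review https://docker.recipes/api/recipes/paperless-ngx/run
```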

Troubleshooting

  • OCR processing stuck or failing: Verify sufficient memory allocation and check Redis connectivity for task queue operations
  • Documents not appearing after upload: Check consume folder permissions and monitor paperless container logs for processing errors
  • PostgreSQL connection refused: Ensure the database container is fully initialized before paperless-ngx starts; a healthcheck on postgres combined with a depends_on condition of service_healthy enforces this
  • Slow document search performance: Monitor PostgreSQL memory usage and consider increasing shared_buffers so that more of the database stays cached
  • Email consumption not working: Verify email server credentials and check firewall rules for IMAP connectivity
  • Web interface loading slowly: Check Redis memory usage and restart redis container if memory is exhausted from caching
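
The healthcheck dependencies mentioned for the connection-refused case can be added without editing the main file by writing a Compose override. This is a sketch, not a tested drop-in: the intervals and retry counts are arbitrary, and it assumes your Compose version merges the long-form depends_on in the override with the short list in docker-compose.yml. If it does not, apply the same keys to the base file directly.

```shell
# Sketch: write a Compose override that makes paperless wait for healthy dependencies.
# Interval, timeout, and retry values are arbitrary starting points.
cat > docker-compose.override.yml <<'EOF'
services:
  paperless:
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy

  postgres:
    healthcheck:
      # $$ stops Compose from interpolating; the container shell expands the variables.
      test: ["CMD-SHELL", "pg_isready -U $${POSTGRES_USER} -d $${POSTGRES_DB}"]
      interval: 5s
      timeout: 5s
      retries: 10

  redis:
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 10
EOF
```

Running docker compose config afterwards prints the merged configuration, which is a quick way to confirm that the depends_on conditions and healthchecks were picked up before restarting the stack.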
