Paperless-ngx
Document management system with OCR.
Overview
Paperless-ngx is a modern document management system designed to digitize and organize physical documents through optical character recognition (OCR) and intelligent indexing. Born from the original Paperless project, paperless-ngx adds enhanced features like improved OCR processing, machine learning-based document classification, and a polished web interface that makes going paperless accessible to both home users and small offices. The system transforms scanned documents and PDFs into a searchable digital archive with automatic tagging and metadata extraction.
This Docker stack combines paperless-ngx with PostgreSQL and Redis. PostgreSQL serves as the primary database, storing document metadata, tags, and correspondents with ACID guarantees; paperless-ngx keeps its full-text search index separately, on disk in its data volume. Redis acts as the message broker for the background task queue that handles OCR and other processing, so the web interface stays responsive during heavy document processing workloads.
This configuration is ideal for individuals transitioning to paperless workflows, small businesses needing document archival solutions, and home lab enthusiasts wanting to organize years of accumulated paperwork. The stack provides enterprise-grade document management capabilities without the complexity and cost of commercial solutions, offering features like email document consumption, mobile access, and API integration for custom workflows.
Key Features
- Tesseract OCR engine with multiple language support for text extraction from scanned documents
- Machine learning-powered document classification and automatic tagging based on content patterns
- Full-text search across document contents, with filtering by tags, correspondents, and document types
- Email consumption for automatic document import from configured email accounts
- Correspondent and document type auto-detection using configurable matching rules
- Web-based document viewer with annotation support and mobile-responsive interface
- Bulk document processing with Redis-powered background task queuing
- REST API for integration with external systems and custom automation workflows
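The multi-language OCR support listed above is controlled through environment variables on the paperless service. The fragment below is a minimal sketch, assuming the PAPERLESS_OCR_LANGUAGE and PAPERLESS_OCR_LANGUAGES variables described in the paperless-ngx documentation; the language codes (tur, eng) are illustrative choices and not part of this recipe.

```yaml
services:
  paperless:
    environment:
      # Extra Tesseract language packs to install when the container starts
      # (space-separated). Only needed for languages the image does not bundle.
      PAPERLESS_OCR_LANGUAGES: tur
      # Languages Tesseract should try on each document, joined with "+".
      # Illustrative value; replace with the languages your documents use.
      PAPERLESS_OCR_LANGUAGE: tur+eng
```

Merge these keys into the existing environment block of the paperless service instead of adding a second block. Check the current documentation for the list of bundled languages before installing extra packs.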
Common Use Cases
- Home office digitization for tax documents, receipts, and important paperwork organization
- Small business invoice and contract management with automatic vendor recognition
- Legal practice document archival with full-text search across case files and correspondence
- Property management company tenant file organization and lease document tracking
- Medical office patient record digitization with storage kept on local infrastructure
- Academic research paper organization with full-text search and tagging by author or topic
- Personal finance management with receipt scanning and expense categorization
Prerequisites
- Minimum 1GB RAM for OCR processing and PostgreSQL operations (2GB+ recommended for heavy usage)
- At least 10GB free disk space for document storage and database growth
- Port 8000 available for paperless-ngx web interface access
- Understanding of document scanning workflows and file organization concepts
- Basic knowledge of environment variable configuration for database credentials
- Scanner or mobile app capability for document digitization input
For development & testing. Review security settings, change default credentials, and test thoroughly before production use.
docker-compose.yml
services:
  paperless:
    image: ghcr.io/paperless-ngx/paperless-ngx:latest
    container_name: paperless
    restart: unless-stopped
    environment:
      PAPERLESS_REDIS: redis://redis:6379
      PAPERLESS_DBHOST: postgres
      PAPERLESS_DBNAME: ${DB_NAME}
      PAPERLESS_DBUSER: ${DB_USER}
      PAPERLESS_DBPASS: ${DB_PASSWORD}
      PAPERLESS_ADMIN_USER: ${ADMIN_USER}
      PAPERLESS_ADMIN_PASSWORD: ${ADMIN_PASSWORD}
    volumes:
      - paperless_data:/usr/src/paperless/data
      - paperless_media:/usr/src/paperless/media
      - paperless_export:/usr/src/paperless/export
      - paperless_consume:/usr/src/paperless/consume
    ports:
      - "8000:8000"
    depends_on:
      - postgres
      - redis
    networks:
      - paperless

  postgres:
    image: postgres:16-alpine
    container_name: paperless-postgres
    environment:
      POSTGRES_DB: ${DB_NAME}
      POSTGRES_USER: ${DB_USER}
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    networks:
      - paperless

  redis:
    image: redis:alpine
    container_name: paperless-redis
    networks:
      - paperless

volumes:
  paperless_data:
  paperless_media:
  paperless_export:
  paperless_consume:
  postgres_data:

networks:
  paperless:
    driver: bridge
.env Template
.env
DB_NAME=paperless
DB_USER=paperless
DB_PASSWORD=changeme
ADMIN_USER=admin
ADMIN_PASSWORD=changeme
Usage Notes
- Docs: https://docs.paperless-ngx.com/
- Access at http://localhost:8000 - login with ADMIN_USER/ADMIN_PASSWORD
- Drop/scan files into consume folder for automatic import
- OCR extracts text from scanned documents (Tesseract)
- Auto-tagging, correspondent detection, document type classification
- Mobile app available; email import supported
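In this recipe the consume folder is a named Docker volume (paperless_consume), which is awkward for a scanner or file manager on the host to write into. One possible adjustment, sketched below, is to bind-mount a host directory instead. The ./consume path and the UID/GID values are hypothetical; USERMAP_UID and USERMAP_GID are assumed to behave as described in the paperless-ngx Docker documentation.

```yaml
services:
  paperless:
    environment:
      # Assumed to map the container user to a host user so that files
      # dropped into the bind mount are readable; use the output of
      # `id -u` and `id -g` for the host account that owns ./consume.
      USERMAP_UID: 1000
      USERMAP_GID: 1000
    volumes:
      # Hypothetical host directory next to docker-compose.yml; this line
      # replaces the paperless_consume named-volume entry.
      - ./consume:/usr/src/paperless/consume
```

With this in place, files copied into ./consume on the host should be picked up by the consumer and imported. The other named volumes stay unchanged.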
Individual Services (3 services)
Copy individual services to mix and match with your existing compose files.
paperless
paperless:
  image: ghcr.io/paperless-ngx/paperless-ngx:latest
  container_name: paperless
  restart: unless-stopped
  environment:
    PAPERLESS_REDIS: redis://redis:6379
    PAPERLESS_DBHOST: postgres
    PAPERLESS_DBNAME: ${DB_NAME}
    PAPERLESS_DBUSER: ${DB_USER}
    PAPERLESS_DBPASS: ${DB_PASSWORD}
    PAPERLESS_ADMIN_USER: ${ADMIN_USER}
    PAPERLESS_ADMIN_PASSWORD: ${ADMIN_PASSWORD}
  volumes:
    - paperless_data:/usr/src/paperless/data
    - paperless_media:/usr/src/paperless/media
    - paperless_export:/usr/src/paperless/export
    - paperless_consume:/usr/src/paperless/consume
  ports:
    - "8000:8000"
  depends_on:
    - postgres
    - redis
  networks:
    - paperless
postgres
postgres:
  image: postgres:16-alpine
  container_name: paperless-postgres
  environment:
    POSTGRES_DB: ${DB_NAME}
    POSTGRES_USER: ${DB_USER}
    POSTGRES_PASSWORD: ${DB_PASSWORD}
  volumes:
    - postgres_data:/var/lib/postgresql/data
  networks:
    - paperless
redis
redis:
  image: redis:alpine
  container_name: paperless-redis
  networks:
    - paperless
Quick Start
terminal
# 1. Create the compose file
cat > docker-compose.yml << 'EOF'
services:
  paperless:
    image: ghcr.io/paperless-ngx/paperless-ngx:latest
    container_name: paperless
    restart: unless-stopped
    environment:
      PAPERLESS_REDIS: redis://redis:6379
      PAPERLESS_DBHOST: postgres
      PAPERLESS_DBNAME: ${DB_NAME}
      PAPERLESS_DBUSER: ${DB_USER}
      PAPERLESS_DBPASS: ${DB_PASSWORD}
      PAPERLESS_ADMIN_USER: ${ADMIN_USER}
      PAPERLESS_ADMIN_PASSWORD: ${ADMIN_PASSWORD}
    volumes:
      - paperless_data:/usr/src/paperless/data
      - paperless_media:/usr/src/paperless/media
      - paperless_export:/usr/src/paperless/export
      - paperless_consume:/usr/src/paperless/consume
    ports:
      - "8000:8000"
    depends_on:
      - postgres
      - redis
    networks:
      - paperless

  postgres:
    image: postgres:16-alpine
    container_name: paperless-postgres
    environment:
      POSTGRES_DB: ${DB_NAME}
      POSTGRES_USER: ${DB_USER}
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    networks:
      - paperless

  redis:
    image: redis:alpine
    container_name: paperless-redis
    networks:
      - paperless

volumes:
  paperless_data:
  paperless_media:
  paperless_export:
  paperless_consume:
  postgres_data:

networks:
  paperless:
    driver: bridge
EOF

# 2. Create the .env file
cat > .env << 'EOF'
DB_NAME=paperless
DB_USER=paperless
DB_PASSWORD=changeme
ADMIN_USER=admin
ADMIN_PASSWORD=changeme
EOF

# 3. Start the services
docker compose up -d

# 4. View logs
docker compose logs -f
One-Liner
Run this command to download and set up the recipe in one step:
terminal
curl -fsSL https://docker.recipes/api/recipes/paperless-ngx/run | bash
Troubleshooting
- OCR processing stuck or failing: Verify sufficient memory allocation and check Redis connectivity for task queue operations
- Documents not appearing after upload: Check consume folder permissions and monitor paperless container logs for processing errors
- PostgreSQL connection refused: Ensure the database container is fully initialized before paperless-ngx starts; a healthcheck on postgres combined with a health-based depends_on condition enforces this ordering
- Slow document list or filter performance: Monitor PostgreSQL memory usage and consider increasing shared_buffers so more metadata queries are served from cache
- Email consumption not working: Verify email server credentials and check firewall rules for IMAP/POP3 connectivity
- Web interface loading slowly: Check Redis memory usage and restart redis container if memory is exhausted from caching
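The healthcheck dependency suggested for the "PostgreSQL connection refused" case can be sketched as the Compose fragment below. It assumes the pg_isready utility shipped in the official postgres image and the long-form depends_on syntax of Docker Compose; the interval, timeout, and retry values are illustrative and may need tuning.

```yaml
services:
  postgres:
    healthcheck:
      # pg_isready exits 0 once the server accepts connections.
      # ${DB_USER} and ${DB_NAME} are substituted by Compose from .env.
      test: ["CMD-SHELL", "pg_isready -U ${DB_USER} -d ${DB_NAME}"]
      interval: 10s
      timeout: 5s
      retries: 5

  paperless:
    # Long-form depends_on; this replaces the short list in the recipe.
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_started
```

With this change, `docker compose up -d` holds back the paperless container until the postgres healthcheck passes, rather than only waiting for the postgres container to be created.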
Components
paperless, postgres, redis
Tags
#paperless #documents #ocr #archive
Category
Productivity & Collaboration