Logstash
Server-side data processing pipeline.
Overview
Logstash is a server-side data processing pipeline developed by Elastic that ingests data from multiple sources simultaneously, transforms it, and sends it to your favorite stash. Originally created in 2009 by Jordan Sissel, Logstash has become the de facto standard for real-time data collection and ETL operations, capable of processing everything from application logs to metrics and events. Its plugin-based architecture supports over 200 integrations, making it incredibly versatile for heterogeneous data environments.
This Logstash deployment creates a centralized data processing hub that can ingest logs from Beats agents, transform unstructured data into structured formats using Grok patterns, and route processed data to various outputs like Elasticsearch, databases, or monitoring systems. The configuration emphasizes flexibility with external pipeline definitions and persistent data storage for reliable processing state management.
This setup is ideal for organizations building observability infrastructure, data teams implementing real-time ETL pipelines, and developers who need sophisticated log processing without the complexity of custom streaming applications. The containerized deployment simplifies scaling and maintenance while preserving Logstash's powerful transformation capabilities and extensive plugin ecosystem.
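As a concrete sketch of the flow described above, here is the kind of pipeline definition this recipe expects in ./pipeline/. It is a minimal example, not part of the recipe itself: the Elasticsearch hostname, index pattern, and file name are assumptions to adapt.
pipeline/logstash.conf
input {
  # Accept events from Beats agents on the port this recipe exposes
  beats {
    port => 5044
  }
}

filter {
  # Parse combined-format web access logs into structured fields
  grok {
    match => { "message" => "%{HTTPD_COMBINEDLOG}" }
  }
}

output {
  # Placeholder target: point hosts at your own Elasticsearch instance
  elasticsearch {
    hosts => ["http://elasticsearch:9200"]
    index => "logs-%{+YYYY.MM.dd}"
  }
}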
Key Features
- Plugin-based architecture with 200+ input, filter, and output plugins
- Grok pattern matching for parsing unstructured log data into structured fields
- Real-time data transformation with conditional logic and field manipulation
- Beats protocol support on port 5044 for lightweight data shipping
- HTTP monitoring API on port 9600 for pipeline metrics and health checks
- External pipeline configuration directory for hot-reloading processing rules
- Memory-efficient event processing with configurable batch sizes
- Dead letter queue support for handling processing failures (batch and DLQ settings are sketched after this list)
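Batch sizing and the dead letter queue live in Logstash settings rather than in pipeline code. A hedged logstash.yml sketch follows; the values are illustrative, not recommendations, and note that the stock dead letter queue only captures events rejected by the Elasticsearch output.
logstash.yml
# Events per worker per batch; larger batches trade memory for throughput
pipeline.batch.size: 250
# Milliseconds to wait for a full batch before flushing a partial one
pipeline.batch.delay: 50
# Keep failed events on disk instead of dropping them
dead_letter_queue.enable: true
path.dead_letter_queue: /usr/share/logstash/data/dlq
To apply a custom logstash.yml with this recipe you would also need to mount it, e.g. ./logstash.yml:/usr/share/logstash/config/logstash.yml (path assumed from the official image layout).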
Common Use Cases
- Centralizing application logs from microservices architectures
- Building ETL pipelines for streaming data from databases to analytics platforms
- Processing web server access logs for security monitoring and analytics
- Transforming CSV files and structured data feeds for data warehouses (see the filter sketch after this list)
- Collecting and enriching infrastructure metrics from multiple sources
- Parsing and correlating security logs from firewalls, IDS, and applications
- Processing IoT sensor data streams for real-time monitoring dashboards
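As an illustration of the CSV use case, a hedged pipeline sketch; the watch path and column names are invented for the example:
pipeline/csv-example.conf
input {
  # Tail CSV files dropped into a directory inside the container (path assumed)
  file {
    path => "/usr/share/logstash/data/feeds/*.csv"
    start_position => "beginning"
  }
}

filter {
  # Split each line into named fields
  csv {
    separator => ","
    columns => ["timestamp", "user_id", "amount"]
  }
  # Cast the numeric column so downstream stores index it correctly
  mutate {
    convert => { "amount" => "float" }
  }
}

output {
  # Inspect the structured events; swap for a warehouse or Elasticsearch output
  stdout {
    codec => rubydebug
  }
}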
Prerequisites
- Minimum 2GB RAM allocated to Docker (Logstash JVM requires significant memory)
- Port 5044 available for Beats input and port 9600 for monitoring API
- Understanding of Logstash pipeline syntax (input, filter, output blocks)
- Basic knowledge of Grok patterns for log parsing and field extraction
- Local ./pipeline directory with valid .conf files containing pipeline definitions (a minimal stub is sketched after this list)
- Java runtime knowledge helpful for JVM tuning and heap size optimization
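Logstash will not start with an empty pipeline directory, so create the stub before the first run; main.conf is an arbitrary name and the stdout output is just for smoke-testing:
terminal
mkdir -p pipeline
cat > pipeline/main.conf << 'EOF'
# Minimal valid pipeline: accept Beats events and print them to the container log
input {
  beats {
    port => 5044
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
EOF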
For development & testing. Review security settings, change default credentials, and test thoroughly before production use.
docker-compose.yml
docker-compose.yml
services:
  logstash:
    image: docker.elastic.co/logstash/logstash:8.11.0
    container_name: logstash
    restart: unless-stopped
    volumes:
      - ./pipeline:/usr/share/logstash/pipeline
      - logstash_data:/usr/share/logstash/data
    ports:
      - "5044:5044"
      - "9600:9600"

volumes:
  logstash_data:
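If the default heap overshoots the 2GB Docker allowance from the prerequisites, a hedged addition under the logstash service above caps it; the 1g figure is an assumption to tune, not a recommendation:
docker-compose.yml (optional addition)
    environment:
      # Pin min and max JVM heap; leave headroom below the container limit
      LS_JAVA_OPTS: "-Xms1g -Xmx1g"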
.env Template
.env
# Create pipeline config in ./pipeline
Usage Notes
- Docs: https://www.elastic.co/guide/en/logstash/current/
- Beats input on port 5044, monitoring API at http://localhost:9600 (curl examples after this list)
- Pipeline config in ./pipeline/*.conf: input -> filter -> output
- Grok patterns for parsing unstructured logs
- Part of the ELK stack - pairs with Elasticsearch and Kibana
- Rich plugin ecosystem for inputs, filters, and outputs
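The monitoring API runs unauthenticated in this setup; two hedged checks once the container is up:
terminal
# Node info: confirms the API is listening
curl -s "http://localhost:9600/?pretty"

# Per-pipeline event counts and plugin throughput
curl -s "http://localhost:9600/_node/stats/pipelines?pretty"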
Quick Start
terminal
# 1. Create the compose file
cat > docker-compose.yml << 'EOF'
services:
  logstash:
    image: docker.elastic.co/logstash/logstash:8.11.0
    container_name: logstash
    restart: unless-stopped
    volumes:
      - ./pipeline:/usr/share/logstash/pipeline
      - logstash_data:/usr/share/logstash/data
    ports:
      - "5044:5044"
      - "9600:9600"

volumes:
  logstash_data:
EOF

# 2. Create the .env file
cat > .env << 'EOF'
# Create pipeline config in ./pipeline
EOF

# 3. Start the services
docker compose up -d

# 4. View logs
docker compose logs -f
One-Liner
Run this command to download and set up the recipe in one step:
terminal
curl -fsSL https://docker.recipes/api/recipes/logstash/run | bash
Troubleshooting
- Pipeline failed to start: Check ./pipeline/*.conf files for syntax errors using logstash --config.test_and_exit (see the command sketch after this list)
- High memory usage or OOM errors: Tune JVM heap settings via LS_JAVA_OPTS environment variable
- Grok parse failures: Test patterns using Kibana Dev Tools or online Grok debugger before deployment
- Beats connection refused on port 5044: Verify beats input plugin is configured in pipeline and container port mapping
- No data flowing through pipeline: Check input plugin configuration and ensure data sources are sending to correct port
- Plugin installation failures: Use logstash-plugin install command inside container or build custom image with required plugins
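Hedged command-line equivalents for the syntax check and plugin installation above, assuming the compose file from this recipe (the heap fix is the LS_JAVA_OPTS sketch under docker-compose.yml):
terminal
# Validate pipeline syntax in a throwaway container, then exit
docker compose run --rm logstash logstash --config.test_and_exit \
  --path.config /usr/share/logstash/pipeline

# Install an extra plugin into the running container (plugin name is an example)
docker compose exec logstash bin/logstash-plugin install logstash-filter-translate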