Logstash
Server-side data processing pipeline.
Overview
Logstash is a server-side data processing pipeline developed by Elastic that ingests data from multiple sources simultaneously, transforms it, and sends it to your favorite stash. Originally created in 2009 by Jordan Sissel, Logstash has become the de facto standard for real-time data collection and ETL operations, capable of processing everything from application logs to metrics and events. Its plugin-based architecture supports over 200 integrations, making it incredibly versatile for heterogeneous data environments.
This Logstash deployment creates a centralized data processing hub that can ingest logs from Beats agents, transform unstructured data into structured formats using Grok patterns, and route processed data to various outputs like Elasticsearch, databases, or monitoring systems. The configuration emphasizes flexibility with external pipeline definitions and persistent data storage for reliable processing state management.
This setup is ideal for organizations building observability infrastructure, data teams implementing real-time ETL pipelines, and developers who need sophisticated log processing without the complexity of custom streaming applications. The containerized deployment simplifies scaling and maintenance while preserving Logstash's powerful transformation capabilities and extensive plugin ecosystem.
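As a concrete sketch of the flow described above, here is the kind of pipeline definition this recipe expects in ./pipeline/. It is a minimal example, not part of the recipe itself: the Elasticsearch hostname, index pattern, and file name are assumptions to adapt.
pipeline/logstash.conf
input {
  # Accept events from Beats agents on the port this recipe exposes
  beats {
    port => 5044
  }
}

filter {
  # Parse combined-format web access logs into structured fields
  grok {
    match => { "message" => "%{HTTPD_COMBINEDLOG}" }
  }
}

output {
  # Placeholder target: point hosts at your own Elasticsearch instance
  elasticsearch {
    hosts => ["http://elasticsearch:9200"]
    index => "logs-%{+YYYY.MM.dd}"
  }
}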
Key Features
- Plugin-based architecture with 200+ input, filter, and output plugins
- Grok pattern matching for parsing unstructured log data into structured fields
- Real-time data transformation with conditional logic and field manipulation
- Beats protocol support on port 5044 for lightweight data shipping
- HTTP monitoring API on port 9600 for pipeline metrics and health checks
- External pipeline configuration directory for hot-reloading processing rules
- Memory-efficient event processing with configurable batch sizes
- Dead letter queue support for handling processing failures (batch and DLQ settings are sketched after this list)
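Batch sizing and the dead letter queue live in Logstash settings rather than in pipeline code. A hedged logstash.yml sketch follows; the values are illustrative, not recommendations, and note that the stock dead letter queue only captures events rejected by the Elasticsearch output.
logstash.yml
# Events per worker per batch; larger batches trade memory for throughput
pipeline.batch.size: 250
# Milliseconds to wait for a full batch before flushing a partial one
pipeline.batch.delay: 50
# Keep failed events on disk instead of dropping them
dead_letter_queue.enable: true
path.dead_letter_queue: /usr/share/logstash/data/dlq
To apply a custom logstash.yml with this recipe you would also need to mount it, e.g. ./logstash.yml:/usr/share/logstash/config/logstash.yml (path assumed from the official image layout).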
Common Use Cases
- Centralizing application logs from microservices architectures
- Building ETL pipelines for streaming data from databases to analytics platforms
- Processing web server access logs for security monitoring and analytics
- Transforming CSV files and structured data feeds for data warehouses (see the filter sketch after this list)
- Collecting and enriching infrastructure metrics from multiple sources
- Parsing and correlating security logs from firewalls, IDS, and applications
- Processing IoT sensor data streams for real-time monitoring dashboards
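As an illustration of the CSV use case, a hedged pipeline sketch; the watch path and column names are invented for the example:
pipeline/csv-example.conf
input {
  # Tail CSV files dropped into a directory inside the container (path assumed)
  file {
    path => "/usr/share/logstash/data/feeds/*.csv"
    start_position => "beginning"
  }
}

filter {
  # Split each line into named fields
  csv {
    separator => ","
    columns => ["timestamp", "user_id", "amount"]
  }
  # Cast the numeric column so downstream stores index it correctly
  mutate {
    convert => { "amount" => "float" }
  }
}

output {
  # Inspect the structured events; swap for a warehouse or Elasticsearch output
  stdout {
    codec => rubydebug
  }
}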
Prerequisites
- Minimum 2GB RAM allocated to Docker (Logstash JVM requires significant memory)
- Port 5044 available for Beats input and port 9600 for monitoring API
- Understanding of Logstash pipeline syntax (input, filter, output blocks)
- Basic knowledge of Grok patterns for log parsing and field extraction
- Local ./pipeline directory with valid .conf files containing pipeline definitions (a minimal stub is sketched after this list)
- Java runtime knowledge helpful for JVM tuning and heap size optimization
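Logstash will not start with an empty pipeline directory, so create the stub before the first run; main.conf is an arbitrary name and the stdout output is just for smoke-testing:
terminal
mkdir -p pipeline
cat > pipeline/main.conf << 'EOF'
# Minimal valid pipeline: accept Beats events and print them to the container log
input {
  beats {
    port => 5044
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
EOF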
For development & testing. Review security settings, change default credentials, and test thoroughly before production use.
docker-compose.yml
docker-compose.yml
services:
  logstash:
    image: docker.elastic.co/logstash/logstash:8.11.0
    container_name: logstash
    restart: unless-stopped
    volumes:
      - ./pipeline:/usr/share/logstash/pipeline
      - logstash_data:/usr/share/logstash/data
    ports:
      - "5044:5044"
      - "9600:9600"

volumes:
  logstash_data:
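If the default heap overshoots the 2GB Docker allowance from the prerequisites, a hedged addition under the logstash service above caps it; the 1g figure is an assumption to tune, not a recommendation:
docker-compose.yml (optional addition)
    environment:
      # Pin min and max JVM heap; leave headroom below the container limit
      LS_JAVA_OPTS: "-Xms1g -Xmx1g"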
.env Template
.env
# Create pipeline config in ./pipeline
Usage Notes
- Docs: https://www.elastic.co/guide/en/logstash/current/
- Beats input on port 5044, monitoring API at http://localhost:9600 (curl examples after this list)
- Pipeline config in ./pipeline/*.conf: input -> filter -> output
- Grok patterns for parsing unstructured logs
- Part of the ELK stack - pairs with Elasticsearch and Kibana
- Rich plugin ecosystem for inputs, filters, and outputs
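The monitoring API runs unauthenticated in this setup; two hedged checks once the container is up:
terminal
# Node info: confirms the API is listening
curl -s "http://localhost:9600/?pretty"

# Per-pipeline event counts and plugin throughput
curl -s "http://localhost:9600/_node/stats/pipelines?pretty"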
Quick Start
terminal
# 1. Create the compose file
cat > docker-compose.yml << 'EOF'
services:
  logstash:
    image: docker.elastic.co/logstash/logstash:8.11.0
    container_name: logstash
    restart: unless-stopped
    volumes:
      - ./pipeline:/usr/share/logstash/pipeline
      - logstash_data:/usr/share/logstash/data
    ports:
      - "5044:5044"
      - "9600:9600"

volumes:
  logstash_data:
EOF

# 2. Create the .env file
cat > .env << 'EOF'
# Create pipeline config in ./pipeline
EOF

# 3. Start the services
docker compose up -d

# 4. View logs
docker compose logs -f
One-Liner
Run this command to download and set up the recipe in one step:
terminal
curl -fsSL https://docker.recipes/api/recipes/logstash/run | bash
Troubleshooting
- Pipeline failed to start: Check ./pipeline/*.conf files for syntax errors using logstash --config.test_and_exit (see the command sketch after this list)
- High memory usage or OOM errors: Tune JVM heap settings via LS_JAVA_OPTS environment variable
- Grok parse failures: Test patterns using Kibana Dev Tools or online Grok debugger before deployment
- Beats connection refused on port 5044: Verify beats input plugin is configured in pipeline and container port mapping
- No data flowing through pipeline: Check input plugin configuration and ensure data sources are sending to correct port
- Plugin installation failures: Use logstash-plugin install command inside container or build custom image with required plugins
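Hedged command-line equivalents for the syntax check and plugin installation above, assuming the compose file from this recipe (the heap fix is the LS_JAVA_OPTS sketch under docker-compose.yml):
terminal
# Validate pipeline syntax in a throwaway container, then exit
docker compose run --rm logstash logstash --config.test_and_exit \
  --path.config /usr/share/logstash/pipeline

# Install an extra plugin into the running container (plugin name is an example)
docker compose exec logstash bin/logstash-plugin install logstash-filter-translate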