docker.recipes

Benthos

intermediate

Stream processing with declarative config.

Overview

Benthos is a high-performance stream processing framework, created by Ashley Jeffs, that transforms and routes data in real time through declarative YAML configuration. Originally built to solve ETL problems at high data volumes, it has grown into a general-purpose tool that handles everything from simple message routing to multi-stage pipelines with complex transformation logic. More than 60 input and output connectors are available, including Kafka, AWS services, Google Cloud Platform, various databases, HTTP endpoints, and file systems.

This Docker deployment runs a single containerized Benthos instance that processes streaming data according to the pipeline defined in benthos.yaml, following an input -> processors -> output flow. The setup exposes Benthos's built-in HTTP API on port 4195 for metrics and health checks. Benthos can also apply configuration changes without a restart; depending on the version this may need to be enabled explicitly, so check the documentation for the image you run.

This stack suits DevOps teams implementing real-time data pipelines, data engineers building ETL workflows, and organizations that need reliable message transformation between systems. Benthos is a good fit where high throughput, low latency, and complex transformations written in its built-in Bloblang language are required, for example in event streaming, data synchronization, and real-time analytics.

Key Features

  • Built-in Bloblang expression language for complex data transformation and conditional processing
  • Over 60 native connectors for popular systems including Kafka, Redis, PostgreSQL, AWS S3, and Google Pub/Sub
  • Hot-reloading of configuration changes without a service restart, for zero-downtime pipeline updates
  • HTTP API on port 4195 with real-time metrics, health checks, and pipeline status monitoring
  • Memory-efficient single binary architecture with minimal resource footprint and fast startup times
  • Advanced error handling with configurable retry policies, dead letter queues, and circuit breaker patterns
  • Built-in rate limiting, batching, and parallel processing capabilities for optimal throughput control
  • Native support for structured logging with detailed pipeline execution tracing and debugging information
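
The error-handling feature above can be made concrete with a small configuration. The following is a minimal sketch rather than part of this recipe: it assumes the `generate` input, the `fallback` output, and the `http_client` and `file` outputs as found in Benthos 4.x, and the file name, URL, and dead-letter path are all hypothetical placeholders.

```shell
# Write a hypothetical example config; nothing here is required by the recipe.
cat > benthos-dlq-example.yaml << 'EOF'
input:
  generate:
    interval: 1s
    mapping: 'root.message = "hello"'

output:
  fallback:
    # First choice: POST each message to an HTTP endpoint (placeholder URL)
    - http_client:
        url: http://example.invalid/ingest
        verb: POST
    # If delivery fails, append the message to a local dead-letter file
    - file:
        path: /tmp/dead-letters.jsonl
        codec: lines
EOF
```

With this layout, a message that cannot be delivered to the first output is passed to the next one in the list, which is one way to approximate a dead letter queue.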

Common Use Cases

  • Real-time ETL pipelines transforming data between different cloud platforms and databases
  • Message routing and transformation in event-driven microservices architectures
  • Log aggregation and enrichment from multiple sources before shipping to centralized logging systems
  • API data synchronization between legacy systems and modern cloud-native applications
  • Stream processing for IoT sensor data with filtering, aggregation, and alerting capabilities
  • Data lake ingestion with format conversion and schema validation from various source systems
  • Real-time data replication and change data capture (CDC) processing for database synchronization

Prerequisites

  • Docker Engine 20.10+ and Docker Compose V2 for container orchestration support
  • At least 512 MB of RAM available for the Benthos container, plus additional memory if large messages are buffered
  • Port 4195 available on the host for the Benthos HTTP API and monitoring endpoints
  • Valid benthos.yaml configuration file defining input sources, processors, and output destinations
  • Basic understanding of YAML syntax and stream processing concepts for pipeline configuration
  • Network connectivity to source and destination systems specified in your Benthos pipeline configuration

For development & testing. Review security settings, change default credentials, and test thoroughly before production use.

docker-compose.yml

services:
  benthos:
    image: jeffail/benthos:latest
    container_name: benthos
    restart: unless-stopped
    # Load the pipeline definition mounted below
    command: -c /benthos.yaml
    volumes:
      # benthos.yaml must exist next to this file before starting
      - ./benthos.yaml:/benthos.yaml
    ports:
      # Benthos HTTP API (health checks, metrics)
      - "4195:4195"

.env Template

.env
# Create benthos.yaml config

Usage Notes

  1. Docs: https://www.benthos.dev/docs/
  2. HTTP API at http://localhost:4195, readiness check at /ready
  3. Define pipelines in benthos.yaml: input -> processors -> output
  4. 60+ connectors: Kafka, AWS, GCP, databases, HTTP, files
  5. Bloblang for powerful data transformation
  6. Single binary, no external dependencies, very fast
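
The recipe expects a benthos.yaml but does not ship one, so a starter file may help. This is a minimal sketch under assumptions, not the recipe's official config: it assumes the `generate` input, the `mapping` processor, and the `stdout` output from Benthos 4.x, and the message fields (`id`, `message`) are made up for illustration.

```shell
# Write a starter benthos.yaml next to docker-compose.yml.
# Note: this overwrites an existing benthos.yaml in the current directory.
cat > benthos.yaml << 'EOF'
http:
  # Matches the port published by the compose file
  address: 0.0.0.0:4195

input:
  # Emits one synthetic message per second
  generate:
    interval: 1s
    mapping: |
      root.id = uuid_v4()
      root.message = "hello world"

pipeline:
  processors:
    # Bloblang: copy the message and upper-case one field
    - mapping: |
        root = this
        root.message = this.message.uppercase()

output:
  # Printed messages show up in `docker compose logs`
  stdout: {}
EOF
```

The three top-level sections mirror the input -> processors -> output flow; swapping `generate` and `stdout` for real connectors is the usual next step.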

Quick Start

terminal
# 1. Create the compose file
cat > docker-compose.yml << 'EOF'
services:
  benthos:
    image: jeffail/benthos:latest
    container_name: benthos
    restart: unless-stopped
    command: -c /benthos.yaml
    volumes:
      - ./benthos.yaml:/benthos.yaml
    ports:
      - "4195:4195"
EOF

# 2. Create the .env file
cat > .env << 'EOF'
# Create benthos.yaml config
EOF

# 3. Start the services (benthos.yaml must already exist; the compose file bind-mounts it)
if [ -f benthos.yaml ]; then
  docker compose up -d
else
  echo "Create benthos.yaml first; the compose file bind-mounts it" >&2
fi

# 4. View logs
docker compose logs -f
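
After starting the stack it can be useful to wait until Benthos reports ready before sending traffic. The helper below is a hedged sketch: `wait_for_benthos` is a hypothetical function name, it assumes `curl` is installed on the host, and it relies only on the /ready endpoint mentioned in the usage notes.

```shell
# Poll the readiness endpoint until Benthos reports ready or we give up.
# Arguments: base URL (default http://localhost:4195), attempts (default 30).
wait_for_benthos() {
  base_url="${1:-http://localhost:4195}"
  attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP error statuses
    if curl -fsS --max-time 2 "$base_url/ready" > /dev/null 2>&1; then
      echo "ready"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "not ready"
  return 1
}
```

Calling `wait_for_benthos http://localhost:4195 30` polls about once per second for up to 30 attempts and returns a non-zero status if the endpoint never answers successfully.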

One-Liner

Run this command to download and set up the recipe in one step:

terminal
curl -fsSL https://docker.recipes/api/recipes/benthos/run | bash

Troubleshooting

  • Container exits with a 'no such file or directory' or 'is a directory' error: Make sure benthos.yaml exists in the current directory before starting and contains valid YAML. If the file was missing on first start, Docker may have created an empty directory named benthos.yaml, which has to be removed first
  • /ready endpoint does not report ready: Benthos is still starting up, has not connected its input and output yet, or the configuration failed validation. Check the container logs for parsing errors
  • Pipeline processing stops with 'connection refused' errors: Verify network connectivity to the input and output systems and check firewall rules
  • High memory usage or OOM kills: Reduce batch sizes in the configuration or raise the container memory limit when processing large messages
  • Bloblang transformation errors in logs: Validate Bloblang expressions with the Benthos online editor or local testing before deployment
  • Configuration hot-reload not working: Ensure benthos.yaml has proper permissions and is not being edited by multiple processes at once. Editors that save by replacing the file can also break a single-file bind mount, so the container keeps seeing the old content until it is restarted
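
Several of the problems above can be caught before the stack starts by checking the config file up front. The function below is a sketch under assumptions: `lint_benthos_config` is a hypothetical helper name, it expects a path relative to the current directory, and it assumes the image's entrypoint is the Benthos binary so that the `lint` subcommand can be passed directly.

```shell
# Validate a Benthos config with the image's own linter before starting.
# Argument: config path relative to the current directory (default benthos.yaml).
lint_benthos_config() {
  config="${1:-benthos.yaml}"
  if [ -d "$config" ]; then
    # Docker creates a directory when a bind-mount source is missing
    echo "$config is a directory, not a file; remove it and create the config" >&2
    return 2
  fi
  if [ ! -f "$config" ]; then
    echo "$config not found" >&2
    return 1
  fi
  # Mount the file read-only and run the linter in a throwaway container
  docker run --rm -v "$(pwd)/$config:/benthos.yaml:ro" \
    jeffail/benthos:latest lint /benthos.yaml
}
```

Running `lint_benthos_config` in the recipe directory reports a missing or mis-created config immediately and otherwise hands the file to Benthos for validation.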

