docker.recipes

Kafka Data Streaming Platform

advanced

Apache Kafka with Zookeeper, Schema Registry, Connect, and AKHQ management UI

Overview

Apache Kafka is a distributed event streaming platform originally developed at LinkedIn and open-sourced in 2011, designed to handle high-throughput, fault-tolerant streaming of real-time data feeds. Built as a distributed commit log, Kafka excels at ingesting millions of events per second while providing durable storage and replay capabilities that traditional message queues cannot match. Unlike conventional messaging systems, Kafka treats data as an immutable sequence of events, making it ideal for event sourcing, real-time analytics, and reactive architectures.

This stack combines Kafka with Apache ZooKeeper for distributed coordination, Confluent Schema Registry for data governance, Kafka Connect for data integration, and AKHQ as a web-based management interface. Together, these components form a complete event-driven platform that connects databases, applications, and microservices through standardized data pipelines.

The stack is particularly valuable for organizations implementing event-driven microservices, real-time analytics platforms, or change data capture (CDC) workflows where data consistency, replay capability, and horizontal scalability are critical requirements.

Key Features

  • High-throughput event processing capable of handling millions of messages per second with low latency
  • Durable message persistence with configurable retention policies and log compaction for event sourcing
  • Consumer groups enabling parallel processing and automatic load balancing across multiple consumers (see the example after this list)
  • Schema evolution and compatibility checking through Confluent Schema Registry with Avro support
  • Pre-configured Kafka Connect workers for building data pipelines between Kafka and external systems
  • AKHQ web interface providing topic management, consumer group monitoring, and message browsing
  • Exactly-once processing semantics (via idempotent producers and transactions) for data integrity in mission-critical applications
  • Horizontal partitioning across brokers for linear scalability and fault tolerance
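
A minimal smoke test of topics and consumer groups, assuming the stack defined below is already running; the topic and group names here are illustrative:

terminal
# Create a topic with three partitions (illustrative names throughout):
docker exec kafka kafka-topics --bootstrap-server kafka:29092 \
  --create --topic demo-events --partitions 3 --replication-factor 1

# Produce one message:
echo 'hello-kafka' | docker exec -i kafka kafka-console-producer \
  --bootstrap-server kafka:29092 --topic demo-events

# Consume it; consumers sharing the same --group id split the partitions
# between themselves, which is what enables parallel processing:
docker exec kafka kafka-console-consumer --bootstrap-server kafka:29092 \
  --topic demo-events --group demo-group --from-beginning --max-messages 1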

Common Use Cases

  • Real-time data pipeline orchestration between microservices, databases, and analytics platforms
  • Change Data Capture (CDC) implementation to stream database changes to downstream applications
  • Event sourcing architectures where business events need to be stored and replayed chronologically
  • Log aggregation from distributed applications and infrastructure components for centralized monitoring
  • IoT data ingestion platforms handling sensor data streams from thousands of connected devices
  • Financial trading systems requiring ultra-low latency order processing and market data distribution
  • E-commerce platforms tracking user behavior, inventory changes, and order fulfillment events in real-time

Prerequisites

  • Minimum 6GB available RAM (1GB for Kafka, 512MB for ZooKeeper, remaining for other components)
  • Docker Engine 20.10+ and Docker Compose V2 for proper container orchestration support
  • Available ports 8080 (AKHQ), 8081 (Schema Registry), 8083 (Connect), and 9092 (Kafka broker); a quick availability check follows this list
  • Basic understanding of distributed systems concepts and event-driven architecture patterns
  • Sufficient disk space for persistent volumes as Kafka stores all messages until retention period expires
  • Network configuration allowing inter-container communication on default Docker bridge network
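
One hedged way to confirm the four host ports are free before starting, assuming a Linux host with the ss utility available:

terminal
# Report whether each required host port is already bound:
for port in 8080 8081 8083 9092; do
  if ss -tln | grep -q ":$port "; then
    echo "port $port is already in use"
  else
    echo "port $port is free"
  fi
done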

For development & testing. This stack exposes unauthenticated PLAINTEXT listeners, so review security settings and test thoroughly before production use.

docker-compose.yml

docker-compose.yml
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    container_name: zookeeper
    restart: unless-stopped
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    volumes:
      - zookeeper_data:/var/lib/zookeeper/data
      - zookeeper_logs:/var/lib/zookeeper/log

  kafka:
    image: confluentinc/cp-kafka:latest
    container_name: kafka
    restart: unless-stopped
    ports:
      - "${KAFKA_PORT:-9092}:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:${KAFKA_PORT:-9092}
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    volumes:
      - kafka_data:/var/lib/kafka/data
    depends_on:
      - zookeeper

  schema-registry:
    image: confluentinc/cp-schema-registry:latest
    container_name: schema-registry
    restart: unless-stopped
    ports:
      - "${SCHEMA_PORT:-8081}:8081"
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: kafka:29092
    depends_on:
      - kafka

  kafka-connect:
    image: confluentinc/cp-kafka-connect:latest
    container_name: kafka-connect
    restart: unless-stopped
    ports:
      - "${CONNECT_PORT:-8083}:8083"
    environment:
      CONNECT_BOOTSTRAP_SERVERS: kafka:29092
      CONNECT_REST_ADVERTISED_HOST_NAME: kafka-connect
      CONNECT_GROUP_ID: compose-connect-group
      CONNECT_CONFIG_STORAGE_TOPIC: docker-connect-configs
      CONNECT_OFFSET_STORAGE_TOPIC: docker-connect-offsets
      CONNECT_STATUS_STORAGE_TOPIC: docker-connect-status
      CONNECT_KEY_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
      CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1
    depends_on:
      - kafka
      - schema-registry

  akhq:
    image: tchiotludo/akhq:latest
    container_name: akhq
    restart: unless-stopped
    ports:
      - "${AKHQ_PORT:-8080}:8080"
    environment:
      AKHQ_CONFIGURATION: |
        akhq:
          connections:
            docker-kafka-server:
              properties:
                bootstrap.servers: "kafka:29092"
              schema-registry:
                url: "http://schema-registry:8081"
              connect:
                - name: "connect"
                  url: "http://kafka-connect:8083"
    depends_on:
      - kafka
      - schema-registry
      - kafka-connect

volumes:
  zookeeper_data:
  zookeeper_logs:
  kafka_data:
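
Note the two broker listeners: PLAINTEXT on kafka:29092 is what other containers use, while PLAINTEXT_HOST on localhost:9092 is advertised to clients on the Docker host. A quick hedged check that both answer once the stack is up:

terminal
# Internal listener, as other containers resolve it:
docker exec kafka kafka-broker-api-versions \
  --bootstrap-server kafka:29092 > /dev/null && echo "internal listener OK"
# Host-facing listener (checked from inside the container, where
# localhost:9092 reaches the same PLAINTEXT_HOST port):
docker exec kafka kafka-broker-api-versions \
  --bootstrap-server localhost:9092 > /dev/null && echo "host listener OK"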

.env Template

.env
# Kafka Streaming Platform
KAFKA_PORT=9092
SCHEMA_PORT=8081
CONNECT_PORT=8083
AKHQ_PORT=8080

Usage Notes

  1. AKHQ management UI at http://localhost:8080
  2. Kafka accessible at localhost:9092
  3. Schema Registry at localhost:8081
  4. Kafka Connect REST API at localhost:8083
  5. Create topics via AKHQ or the kafka-topics CLI
  6. Install connectors in kafka-connect for CDC, S3, etc. (see the example below)
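
For note 6, a hedged sketch of installing and registering a connector; the datagen connector and names below are illustrative, and plugin availability depends on the image version:

terminal
# Install a connector plugin from Confluent Hub, then restart the worker:
docker exec kafka-connect confluent-hub install --no-prompt \
  confluentinc/kafka-connect-datagen:latest
docker restart kafka-connect

# Register the connector through the Connect REST API:
curl -s -X POST http://localhost:8083/connectors \
  -H "Content-Type: application/json" \
  -d '{
    "name": "demo-datagen",
    "config": {
      "connector.class": "io.confluent.kafka.connect.datagen.DatagenConnector",
      "kafka.topic": "demo-pageviews",
      "quickstart": "pageviews",
      "tasks.max": "1"
    }
  }'

# List registered connectors:
curl -s http://localhost:8083/connectors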

Individual Services (5 services)

Copy individual services to mix and match with your existing compose files.

zookeeper
zookeeper:
  image: confluentinc/cp-zookeeper:latest
  container_name: zookeeper
  restart: unless-stopped
  environment:
    ZOOKEEPER_CLIENT_PORT: 2181
    ZOOKEEPER_TICK_TIME: 2000
  volumes:
    - zookeeper_data:/var/lib/zookeeper/data
    - zookeeper_logs:/var/lib/zookeeper/log
kafka
kafka:
  image: confluentinc/cp-kafka:latest
  container_name: kafka
  restart: unless-stopped
  ports:
    - ${KAFKA_PORT:-9092}:9092
  environment:
    KAFKA_BROKER_ID: 1
    KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:${KAFKA_PORT:-9092}
    KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
    KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
    KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
  volumes:
    - kafka_data:/var/lib/kafka/data
  depends_on:
    - zookeeper
schema-registry
schema-registry:
  image: confluentinc/cp-schema-registry:latest
  container_name: schema-registry
  restart: unless-stopped
  ports:
    - ${SCHEMA_PORT:-8081}:8081
  environment:
    SCHEMA_REGISTRY_HOST_NAME: schema-registry
    SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: kafka:29092
  depends_on:
    - kafka
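
Once this service is healthy, the Schema Registry REST API answers on port 8081; a brief hedged example, with an illustrative subject name:

terminal
# Register a minimal Avro schema under an illustrative subject:
curl -s -X POST http://localhost:8081/subjects/demo-events-value/versions \
  -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  -d '{"schema": "{\"type\":\"record\",\"name\":\"Demo\",\"fields\":[{\"name\":\"id\",\"type\":\"string\"}]}"}'

# List registered subjects:
curl -s http://localhost:8081/subjects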
kafka-connect
kafka-connect:
  image: confluentinc/cp-kafka-connect:latest
  container_name: kafka-connect
  restart: unless-stopped
  ports:
    - ${CONNECT_PORT:-8083}:8083
  environment:
    CONNECT_BOOTSTRAP_SERVERS: kafka:29092
    CONNECT_REST_ADVERTISED_HOST_NAME: kafka-connect
    CONNECT_GROUP_ID: compose-connect-group
    CONNECT_CONFIG_STORAGE_TOPIC: docker-connect-configs
    CONNECT_OFFSET_STORAGE_TOPIC: docker-connect-offsets
    CONNECT_STATUS_STORAGE_TOPIC: docker-connect-status
    CONNECT_KEY_CONVERTER: io.confluent.connect.avro.AvroConverter
    CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
    CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
    CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
    CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1
    CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1
    CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1
  depends_on:
    - kafka
    - schema-registry
akhq
akhq:
  image: tchiotludo/akhq:latest
  container_name: akhq
  restart: unless-stopped
  ports:
    - ${AKHQ_PORT:-8080}:8080
  environment:
    AKHQ_CONFIGURATION: |
      akhq:
        connections:
          docker-kafka-server:
            properties:
              bootstrap.servers: "kafka:29092"
            schema-registry:
              url: "http://schema-registry:8081"
            connect:
              - name: "connect"
                url: "http://kafka-connect:8083"
  depends_on:
    - kafka
    - schema-registry
    - kafka-connect

Quick Start

terminal
# 1. Create the compose file
cat > docker-compose.yml << 'EOF'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    container_name: zookeeper
    restart: unless-stopped
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    volumes:
      - zookeeper_data:/var/lib/zookeeper/data
      - zookeeper_logs:/var/lib/zookeeper/log

  kafka:
    image: confluentinc/cp-kafka:latest
    container_name: kafka
    restart: unless-stopped
    ports:
      - "${KAFKA_PORT:-9092}:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:${KAFKA_PORT:-9092}
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    volumes:
      - kafka_data:/var/lib/kafka/data
    depends_on:
      - zookeeper

  schema-registry:
    image: confluentinc/cp-schema-registry:latest
    container_name: schema-registry
    restart: unless-stopped
    ports:
      - "${SCHEMA_PORT:-8081}:8081"
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: kafka:29092
    depends_on:
      - kafka

  kafka-connect:
    image: confluentinc/cp-kafka-connect:latest
    container_name: kafka-connect
    restart: unless-stopped
    ports:
      - "${CONNECT_PORT:-8083}:8083"
    environment:
      CONNECT_BOOTSTRAP_SERVERS: kafka:29092
      CONNECT_REST_ADVERTISED_HOST_NAME: kafka-connect
      CONNECT_GROUP_ID: compose-connect-group
      CONNECT_CONFIG_STORAGE_TOPIC: docker-connect-configs
      CONNECT_OFFSET_STORAGE_TOPIC: docker-connect-offsets
      CONNECT_STATUS_STORAGE_TOPIC: docker-connect-status
      CONNECT_KEY_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
      CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1
    depends_on:
      - kafka
      - schema-registry

  akhq:
    image: tchiotludo/akhq:latest
    container_name: akhq
    restart: unless-stopped
    ports:
      - "${AKHQ_PORT:-8080}:8080"
    environment:
      AKHQ_CONFIGURATION: |
        akhq:
          connections:
            docker-kafka-server:
              properties:
                bootstrap.servers: "kafka:29092"
              schema-registry:
                url: "http://schema-registry:8081"
              connect:
                - name: "connect"
                  url: "http://kafka-connect:8083"
    depends_on:
      - kafka
      - schema-registry
      - kafka-connect

volumes:
  zookeeper_data:
  zookeeper_logs:
  kafka_data:
EOF

# 2. Create the .env file
cat > .env << 'EOF'
# Kafka Streaming Platform
KAFKA_PORT=9092
SCHEMA_PORT=8081
CONNECT_PORT=8083
AKHQ_PORT=8080
EOF

# 3. Start the services
docker compose up -d

# 4. View logs
docker compose logs -f
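
A few optional sanity checks after startup (Kafka Connect can take a minute or two to come up):

terminal
docker compose ps                          # all five containers should be running
curl -s http://localhost:8081/subjects     # Schema Registry returns a JSON array
curl -s http://localhost:8083/connectors   # Kafka Connect returns a JSON array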

One-Liner

Run this command to download and set up the recipe in one step:

terminal
curl -fsSL https://docker.recipes/api/recipes/kafka-streaming-platform/run | bash

Troubleshooting

  • Error 'Broker may not be available': Verify ZooKeeper is running and accessible before starting Kafka broker
  • Schema Registry connection refused: Ensure Kafka broker is fully started before Schema Registry attempts connection
  • AKHQ shows empty cluster: Check that AKHQ_CONFIGURATION environment variable is properly formatted YAML
  • Kafka Connect worker fails to start: Verify internal topics (configs, offsets, status) are created with correct replication factor
  • OutOfMemoryError in Kafka container: Increase Docker memory limits or add the KAFKA_HEAP_OPTS environment variable (see the commands below)
  • Producer/Consumer timeout errors: Adjust KAFKA_ADVERTISED_LISTENERS to match your network configuration for external access
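
The commands below, assuming the stack was started from this compose file, cover several of the checks above:

terminal
# Recent broker-side errors:
docker compose logs kafka | grep -iE "error|exception" | tail -n 20

# Confirm the Connect internal topics were created:
docker exec kafka kafka-topics --bootstrap-server kafka:29092 --list | grep docker-connect

# To constrain the broker heap, add a line like this (values are
# illustrative) under the kafka service's environment block:
#   KAFKA_HEAP_OPTS: "-Xms512m -Xmx1g"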



Components

kafka, zookeeper, schema-registry, kafka-connect, akhq

Tags

#kafka #streaming #data-pipeline #zookeeper #schema-registry #event-driven

Category

Message Queues & Brokers