Kafka Data Streaming Platform
Apache Kafka with Zookeeper, Schema Registry, Connect, and AKHQ management UI
Overview
Apache Kafka is a distributed event streaming platform originally developed at LinkedIn and open-sourced in 2011, designed to handle high-throughput, fault-tolerant streaming of real-time data feeds. Built as a distributed commit log, Kafka ingests millions of events per second while providing durable storage and replay capabilities that traditional message queues cannot match. Unlike conventional messaging systems, Kafka treats data as an immutable sequence of events, making it well suited to event sourcing, real-time analytics, and reactive architectures.

This stack combines Kafka with Apache ZooKeeper for distributed coordination, Confluent Schema Registry for data governance, Kafka Connect for data integration, and AKHQ as a web-based management interface. Together, these components form a complete event-driven architecture capable of connecting databases, applications, and microservices through standardized data pipelines.

The stack is particularly valuable for organizations implementing event-driven microservices, real-time analytics platforms, or change data capture (CDC) workflows where data consistency, replay capability, and horizontal scalability are critical requirements.
Key Features
- High-throughput event processing capable of handling millions of messages per second with low latency
- Durable message persistence with configurable retention policies and log compaction for event sourcing
- Consumer groups enabling parallel processing and automatic load balancing across multiple consumers
- Schema evolution and compatibility checking through Confluent Schema Registry with Avro support
- Pre-configured Kafka Connect workers for building data pipelines between Kafka and external systems
- AKHQ web interface providing topic management, consumer group monitoring, and message browsing
- Exactly-once processing semantics, via idempotent producers and transactions, ensuring data integrity in mission-critical applications
- Horizontal partitioning across brokers for linear scalability and fault tolerance
Common Use Cases
- Real-time data pipeline orchestration between microservices, databases, and analytics platforms
- Change Data Capture (CDC) implementation to stream database changes to downstream applications
- Event sourcing architectures where business events need to be stored and replayed chronologically
- Log aggregation from distributed applications and infrastructure components for centralized monitoring
- IoT data ingestion platforms handling sensor data streams from thousands of connected devices
- Financial trading systems requiring ultra-low latency order processing and market data distribution
- E-commerce platforms tracking user behavior, inventory changes, and order fulfillment events in real-time
Prerequisites
- Minimum 6GB available RAM (1GB for Kafka, 512MB for ZooKeeper, remaining for other components)
- Docker Engine 20.10+ and Docker Compose V2 for proper container orchestration support
- Available ports 8080 (AKHQ), 8081 (Schema Registry), 8083 (Connect), and 9092 (Kafka broker)
- Basic understanding of distributed systems concepts and event-driven architecture patterns
- Sufficient disk space for persistent volumes, as Kafka retains all messages until the retention period expires
- Network configuration allowing inter-container communication on default Docker bridge network
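Before starting, it can help to confirm the listed ports are actually free. A minimal sketch, assuming a Linux host with ss available (on macOS, lsof -i :PORT is an alternative):

terminal
for p in 8080 8081 8083 9092; do
  # Look for an existing TCP listener on the port
  if ss -ltn "( sport = :$p )" | grep -q LISTEN; then
    echo "Port $p is already in use"
  else
    echo "Port $p is free"
  fi
done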
For development & testing only. Review security settings, change default credentials, and test thoroughly before production use.
docker-compose.yml
docker-compose.yml
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    container_name: zookeeper
    restart: unless-stopped
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    volumes:
      - zookeeper_data:/var/lib/zookeeper/data
      - zookeeper_logs:/var/lib/zookeeper/log

  kafka:
    image: confluentinc/cp-kafka:latest
    container_name: kafka
    restart: unless-stopped
    ports:
      - "${KAFKA_PORT:-9092}:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:${KAFKA_PORT:-9092}
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    volumes:
      - kafka_data:/var/lib/kafka/data
    depends_on:
      - zookeeper

  schema-registry:
    image: confluentinc/cp-schema-registry:latest
    container_name: schema-registry
    restart: unless-stopped
    ports:
      - "${SCHEMA_PORT:-8081}:8081"
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: kafka:29092
    depends_on:
      - kafka

  kafka-connect:
    image: confluentinc/cp-kafka-connect:latest
    container_name: kafka-connect
    restart: unless-stopped
    ports:
      - "${CONNECT_PORT:-8083}:8083"
    environment:
      CONNECT_BOOTSTRAP_SERVERS: kafka:29092
      CONNECT_REST_ADVERTISED_HOST_NAME: kafka-connect
      CONNECT_GROUP_ID: compose-connect-group
      CONNECT_CONFIG_STORAGE_TOPIC: docker-connect-configs
      CONNECT_OFFSET_STORAGE_TOPIC: docker-connect-offsets
      CONNECT_STATUS_STORAGE_TOPIC: docker-connect-status
      CONNECT_KEY_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
      CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1
    depends_on:
      - kafka
      - schema-registry

  akhq:
    image: tchiotludo/akhq:latest
    container_name: akhq
    restart: unless-stopped
    ports:
      - "${AKHQ_PORT:-8080}:8080"
    environment:
      AKHQ_CONFIGURATION: |
        akhq:
          connections:
            docker-kafka-server:
              properties:
                bootstrap.servers: "kafka:29092"
              schema-registry:
                url: "http://schema-registry:8081"
              connect:
                - name: "connect"
                  url: "http://kafka-connect:8083"
    depends_on:
      - kafka
      - schema-registry
      - kafka-connect

volumes:
  zookeeper_data:
  zookeeper_logs:
  kafka_data:
.env Template
.env
# Kafka Streaming Platform
KAFKA_PORT=9092
SCHEMA_PORT=8081
CONNECT_PORT=8083
AKHQ_PORT=8080
Usage Notes
- AKHQ management UI at http://localhost:8080
- Kafka accessible at localhost:9092
- Schema Registry at localhost:8081
- Kafka Connect REST API at localhost:8083
- Create topics via AKHQ or the kafka-topics CLI (see the first sketch below)
- Install connectors in kafka-connect for CDC, S3, etc. (see the second sketch below)
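As a quick smoke test of topic creation and message flow, here is a minimal sketch using the CLI tools bundled in the cp-kafka image. It assumes the stack is already running; the topic name demo-events is a placeholder:

terminal
# Create a topic with 3 partitions (replication factor 1 on this single broker)
docker exec kafka kafka-topics --bootstrap-server kafka:29092 \
  --create --topic demo-events --partitions 3 --replication-factor 1

# Produce one message
echo 'hello' | docker exec -i kafka kafka-console-producer \
  --bootstrap-server kafka:29092 --topic demo-events

# Read it back from the beginning
docker exec kafka kafka-console-consumer --bootstrap-server kafka:29092 \
  --topic demo-events --from-beginning --max-messages 1

For connectors, a sketch of installing the Debezium PostgreSQL connector via confluent-hub (bundled in the cp-kafka-connect image) and registering it over the Connect REST API. The database host, credentials, and connector version here are placeholders for your environment:

terminal
# Install the connector plugin, then restart the worker so it is picked up
docker exec kafka-connect confluent-hub install --no-prompt \
  debezium/debezium-connector-postgresql:latest
docker restart kafka-connect

# Register a connector instance (all connection values are placeholders)
curl -X POST http://localhost:8083/connectors \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "postgres-cdc",
    "config": {
      "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
      "database.hostname": "your-postgres-host",
      "database.port": "5432",
      "database.user": "postgres",
      "database.password": "changeme",
      "database.dbname": "appdb",
      "topic.prefix": "app"
    }
  }'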
Individual Services (5 services)
Copy individual services to mix and match with your existing compose files.
zookeeper
zookeeper:
  image: confluentinc/cp-zookeeper:latest
  container_name: zookeeper
  restart: unless-stopped
  environment:
    ZOOKEEPER_CLIENT_PORT: 2181
    ZOOKEEPER_TICK_TIME: 2000
  volumes:
    - zookeeper_data:/var/lib/zookeeper/data
    - zookeeper_logs:/var/lib/zookeeper/log
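To verify that the broker has registered with ZooKeeper once both containers are up, one option is the zookeeper-shell tool shipped in the cp-kafka image (a sketch; broker ID 1 matches KAFKA_BROKER_ID below):

terminal
# Lists registered broker IDs; expect [1] once kafka is up
docker exec kafka zookeeper-shell zookeeper:2181 ls /brokers/ids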
kafka
kafka:
  image: confluentinc/cp-kafka:latest
  container_name: kafka
  restart: unless-stopped
  ports:
    - "${KAFKA_PORT:-9092}:9092"
  environment:
    KAFKA_BROKER_ID: 1
    KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:${KAFKA_PORT:-9092}
    KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
    KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
    KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
  volumes:
    - kafka_data:/var/lib/kafka/data
  depends_on:
    - zookeeper
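The two listeners above are the part most often misconfigured: containers on the compose network reach the broker at kafka:29092, while clients on the Docker host use localhost:9092. A quick sanity check, assuming the broker is running:

terminal
# Confirms the host-facing listener answers; prints broker metadata
docker exec kafka kafka-broker-api-versions --bootstrap-server localhost:9092 | head -n 3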
schema-registry
schema-registry:
  image: confluentinc/cp-schema-registry:latest
  container_name: schema-registry
  restart: unless-stopped
  ports:
    - "${SCHEMA_PORT:-8081}:8081"
  environment:
    SCHEMA_REGISTRY_HOST_NAME: schema-registry
    SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: kafka:29092
  depends_on:
    - kafka
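Once Schema Registry is up, its REST API can be exercised directly. A minimal sketch that lists subjects and registers a trivial Avro schema under a hypothetical subject, demo-events-value:

terminal
# List registered subjects (empty on a fresh stack)
curl -s http://localhost:8081/subjects

# Register a one-field Avro record schema
curl -s -X POST http://localhost:8081/subjects/demo-events-value/versions \
  -H 'Content-Type: application/vnd.schemaregistry.v1+json' \
  -d '{"schema": "{\"type\":\"record\",\"name\":\"Demo\",\"fields\":[{\"name\":\"msg\",\"type\":\"string\"}]}"}'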
kafka-connect
kafka-connect:
  image: confluentinc/cp-kafka-connect:latest
  container_name: kafka-connect
  restart: unless-stopped
  ports:
    - "${CONNECT_PORT:-8083}:8083"
  environment:
    CONNECT_BOOTSTRAP_SERVERS: kafka:29092
    CONNECT_REST_ADVERTISED_HOST_NAME: kafka-connect
    CONNECT_GROUP_ID: compose-connect-group
    CONNECT_CONFIG_STORAGE_TOPIC: docker-connect-configs
    CONNECT_OFFSET_STORAGE_TOPIC: docker-connect-offsets
    CONNECT_STATUS_STORAGE_TOPIC: docker-connect-status
    CONNECT_KEY_CONVERTER: io.confluent.connect.avro.AvroConverter
    CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
    CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
    CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
    CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1
    CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1
    CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1
  depends_on:
    - kafka
    - schema-registry
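The Connect worker can take a minute to create its internal topics on first start. Two quick checks against its REST API once it is up:

terminal
# Installed connector plugins (bundled connectors plus anything added via confluent-hub)
curl -s http://localhost:8083/connector-plugins

# Currently registered connector instances (empty on a fresh stack)
curl -s http://localhost:8083/connectors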
akhq
akhq:
  image: tchiotludo/akhq:latest
  container_name: akhq
  restart: unless-stopped
  ports:
    - "${AKHQ_PORT:-8080}:8080"
  environment:
    AKHQ_CONFIGURATION: |
      akhq:
        connections:
          docker-kafka-server:
            properties:
              bootstrap.servers: "kafka:29092"
            schema-registry:
              url: "http://schema-registry:8081"
            connect:
              - name: "connect"
                url: "http://kafka-connect:8083"
  depends_on:
    - kafka
    - schema-registry
    - kafka-connect
Quick Start
terminal
# 1. Create the compose file
cat > docker-compose.yml << 'EOF'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    container_name: zookeeper
    restart: unless-stopped
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    volumes:
      - zookeeper_data:/var/lib/zookeeper/data
      - zookeeper_logs:/var/lib/zookeeper/log

  kafka:
    image: confluentinc/cp-kafka:latest
    container_name: kafka
    restart: unless-stopped
    ports:
      - "${KAFKA_PORT:-9092}:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:${KAFKA_PORT:-9092}
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    volumes:
      - kafka_data:/var/lib/kafka/data
    depends_on:
      - zookeeper

  schema-registry:
    image: confluentinc/cp-schema-registry:latest
    container_name: schema-registry
    restart: unless-stopped
    ports:
      - "${SCHEMA_PORT:-8081}:8081"
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: kafka:29092
    depends_on:
      - kafka

  kafka-connect:
    image: confluentinc/cp-kafka-connect:latest
    container_name: kafka-connect
    restart: unless-stopped
    ports:
      - "${CONNECT_PORT:-8083}:8083"
    environment:
      CONNECT_BOOTSTRAP_SERVERS: kafka:29092
      CONNECT_REST_ADVERTISED_HOST_NAME: kafka-connect
      CONNECT_GROUP_ID: compose-connect-group
      CONNECT_CONFIG_STORAGE_TOPIC: docker-connect-configs
      CONNECT_OFFSET_STORAGE_TOPIC: docker-connect-offsets
      CONNECT_STATUS_STORAGE_TOPIC: docker-connect-status
      CONNECT_KEY_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
      CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1
      CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1
    depends_on:
      - kafka
      - schema-registry

  akhq:
    image: tchiotludo/akhq:latest
    container_name: akhq
    restart: unless-stopped
    ports:
      - "${AKHQ_PORT:-8080}:8080"
    environment:
      AKHQ_CONFIGURATION: |
        akhq:
          connections:
            docker-kafka-server:
              properties:
                bootstrap.servers: "kafka:29092"
              schema-registry:
                url: "http://schema-registry:8081"
              connect:
                - name: "connect"
                  url: "http://kafka-connect:8083"
    depends_on:
      - kafka
      - schema-registry
      - kafka-connect

volumes:
  zookeeper_data:
  zookeeper_logs:
  kafka_data:
EOF

# 2. Create the .env file
cat > .env << 'EOF'
# Kafka Streaming Platform
KAFKA_PORT=9092
SCHEMA_PORT=8081
CONNECT_PORT=8083
AKHQ_PORT=8080
EOF

# 3. Start the services
docker compose up -d

# 4. View logs
docker compose logs -f
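The services come up in sequence, so commands issued immediately after docker compose up -d may fail. A simple polling sketch to wait for Schema Registry and Connect before creating topics or connectors:

terminal
# Poll the two REST endpoints until they respond
for url in http://localhost:8081/subjects http://localhost:8083/connectors; do
  until curl -fsS "$url" > /dev/null 2>&1; do
    echo "waiting for $url ..."
    sleep 5
  done
done
echo "Schema Registry and Kafka Connect are up"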
One-Liner
Run this command to download and set up the recipe in one step:
terminal
curl -fsSL https://docker.recipes/api/recipes/kafka-streaming-platform/run | bash
Troubleshooting
- Error 'Broker may not be available': Verify ZooKeeper is running and accessible before starting Kafka broker
- Schema Registry connection refused: Ensure Kafka broker is fully started before Schema Registry attempts connection
- AKHQ shows empty cluster: Check that AKHQ_CONFIGURATION environment variable is properly formatted YAML
- Kafka Connect worker fails to start: Verify internal topics (configs, offsets, status) are created with correct replication factor
- OutOfMemoryError in Kafka container: Increase Docker memory limits or set the KAFKA_HEAP_OPTS environment variable (see the sketch below)
- Producer/Consumer timeout errors: Adjust KAFKA_ADVERTISED_LISTENERS to match your network configuration for external access (see the sketch below)
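For the last two items, a sketch of the relevant overrides in the kafka service; the heap sizes and the external hostname kafka.example.com are illustrative, so substitute values for your environment:

kafka:
  environment:
    # Illustrative heap bounds; size to your host
    KAFKA_HEAP_OPTS: "-Xms512m -Xmx1g"
    # Advertise an address that external clients can actually reach
    KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://kafka.example.com:9092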
Components
kafka, zookeeper, schema-registry, kafka-connect, akhq
Tags
#kafka #streaming #data-pipeline #zookeeper #schema-registry #event-driven
Category
Message Queues & Brokers