docker.recipes

Apache Kafka Full Stack

advanced

Complete Kafka setup with Zookeeper, Schema Registry, Connect, and KSQL.

Overview

Apache ZooKeeper serves as the distributed coordination backbone for Apache Kafka, providing essential services like configuration management, leader election, and cluster membership through its hierarchical namespace and strong consistency guarantees. Originally developed at Yahoo for managing distributed systems, ZooKeeper has become the de facto standard for coordinating Kafka brokers, tracking partition assignments, and managing consumer group offsets.

This Kafka ecosystem combines ZooKeeper's coordination services with Kafka's high-throughput streaming platform, Confluent Schema Registry for Avro schema evolution, and ksqlDB for real-time stream processing and analytics. The stack creates a complete event streaming architecture where ZooKeeper maintains cluster state, Kafka handles message durability and partitioning, Schema Registry ensures data compatibility across services, and ksqlDB enables SQL-based stream transformations.

Organizations building event-driven architectures, real-time analytics platforms, or microservices requiring reliable message ordering will find this stack invaluable for handling millions of events per second while maintaining exactly-once processing semantics. The combination particularly excels in scenarios requiring both high-throughput data ingestion and complex stream processing, making it ideal for financial trading systems, IoT data pipelines, and large-scale log aggregation where data consistency and replay capabilities are critical.

Key Features

  • Hierarchical namespace in ZooKeeper for organized configuration and service coordination
  • High-throughput message streaming with Kafka's distributed log architecture supporting millions of events per second
  • Avro schema evolution and compatibility checking through Confluent Schema Registry
  • SQL-based stream processing with ksqlDB for real-time analytics and transformations
  • Consumer group coordination enabling parallel processing across multiple application instances
  • Exactly-once processing semantics for mission-critical data pipelines
  • Log compaction and retention policies for efficient storage and event sourcing patterns
  • Distributed partitioning strategy for horizontal scaling and fault tolerance
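The ksqlDB stream-processing feature above can be sketched once the stack is running. The topic, stream, and column names below (`orders`, `order_id`, `amount`) are illustrative assumptions, not part of this recipe:

```shell
# Pipe a short ksqlDB script into the ksql CLI inside the ksqldb container.
# CREATE STREAM with PARTITIONS in the WITH clause also creates the backing
# Kafka topic if it does not exist yet.
docker exec -i ksqldb ksql http://localhost:8088 <<'SQL'
CREATE STREAM orders (order_id VARCHAR, amount DOUBLE)
  WITH (KAFKA_TOPIC='orders', VALUE_FORMAT='JSON', PARTITIONS=1);

-- Continuously aggregate order amounts per order_id into a table.
CREATE TABLE order_totals AS
  SELECT order_id, SUM(amount) AS total
  FROM orders
  GROUP BY order_id
  EMIT CHANGES;
SQL
```

Run this only after `docker compose up -d` has brought the ksqldb container to a healthy state; the CLI will fail to connect otherwise.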

Common Use Cases

  • Financial trading platforms requiring strict message ordering and replay capabilities
  • IoT data ingestion pipelines processing sensor data from thousands of connected devices
  • E-commerce platforms implementing event sourcing for order management and inventory tracking
  • Real-time fraud detection systems analyzing transaction streams with complex event patterns
  • Microservices architectures using event-driven communication for service decoupling
  • Log aggregation and monitoring systems collecting application logs from distributed services
  • Change data capture (CDC) implementations for database synchronization and data lake population

Prerequisites

  • Minimum 6GB RAM available (1GB for Kafka, 1GB for ZooKeeper, 2GB for ksqlDB, plus overhead)
  • Docker Engine 20.10+ and Docker Compose v2 for proper networking and volume management
  • Available ports 9092 (Kafka), 8081 (Schema Registry), and 8088 (ksqlDB) on the host system
  • Basic understanding of Apache Kafka concepts including topics, partitions, and consumer groups
  • Knowledge of Avro schema design principles for effective Schema Registry utilization
  • Familiarity with SQL syntax for ksqlDB stream processing queries and transformations

For development and testing only. Review security settings, change default credentials, and test thoroughly before production use. See Terms

docker-compose.yml

docker-compose.yml
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    container_name: zookeeper
    restart: unless-stopped
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    volumes:
      - zk_data:/var/lib/zookeeper/data
      - zk_logs:/var/lib/zookeeper/log
    networks:
      - kafka-network

  kafka:
    image: confluentinc/cp-kafka:latest
    container_name: kafka
    restart: unless-stopped
    depends_on:
      - zookeeper
    ports:
      - "${KAFKA_PORT:-9092}:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    volumes:
      - kafka_data:/var/lib/kafka/data
    networks:
      - kafka-network

  schema-registry:
    image: confluentinc/cp-schema-registry:latest
    container_name: schema-registry
    restart: unless-stopped
    depends_on:
      - kafka
    ports:
      - "${SCHEMA_REGISTRY_PORT:-8081}:8081"
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: kafka:29092
    networks:
      - kafka-network

  ksqldb:
    image: confluentinc/ksqldb-server:latest
    container_name: ksqldb
    restart: unless-stopped
    depends_on:
      - kafka
      - schema-registry
    ports:
      - "${KSQL_PORT:-8088}:8088"
    environment:
      KSQL_BOOTSTRAP_SERVERS: kafka:29092
      KSQL_LISTENERS: http://0.0.0.0:8088
      KSQL_KSQL_SCHEMA_REGISTRY_URL: http://schema-registry:8081
    networks:
      - kafka-network

volumes:
  zk_data:
  zk_logs:
  kafka_data:

networks:
  kafka-network:
    driver: bridge
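A note on the dual-listener setup in the compose file: containers on kafka-network reach the broker at kafka:29092, while tools on the host use localhost:9092. One way to verify both paths (kcat on the host is an optional assumption, not part of this recipe):

```shell
# From inside the Docker network: query broker API versions over the
# internal PLAINTEXT listener.
docker exec kafka kafka-broker-api-versions --bootstrap-server kafka:29092

# From the host, if kcat is installed: dump cluster metadata over the
# PLAINTEXT_HOST listener advertised as localhost:9092.
kcat -b localhost:9092 -L
```

If host clients hang or report unknown brokers, the advertised listener is usually the culprit; it must resolve to an address the client can actually reach.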

.env Template

.env
# Kafka Full Stack
KAFKA_PORT=9092
SCHEMA_REGISTRY_PORT=8081
KSQL_PORT=8088

Usage Notes

  1. Kafka broker at localhost:9092
  2. Schema Registry at http://localhost:8081
  3. ksqlDB at http://localhost:8088
  4. Use the kafka-topics CLI to manage topics
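The kafka-topics CLI ships inside the broker container, so no host-side Kafka install is needed. A quick smoke test after the stack is up (the topic name test-events is illustrative):

```shell
# Create a topic with 3 partitions on the single broker.
docker exec kafka kafka-topics --bootstrap-server kafka:29092 \
  --create --topic test-events --partitions 3 --replication-factor 1

# List topics to confirm it exists.
docker exec kafka kafka-topics --bootstrap-server kafka:29092 --list

# Produce one message, then read it back.
echo 'hello' | docker exec -i kafka kafka-console-producer \
  --bootstrap-server kafka:29092 --topic test-events
docker exec kafka kafka-console-consumer --bootstrap-server kafka:29092 \
  --topic test-events --from-beginning --max-messages 1
```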

Individual Services (4 services)

Copy individual services to mix and match with your existing compose files.

zookeeper
zookeeper:
  image: confluentinc/cp-zookeeper:latest
  container_name: zookeeper
  restart: unless-stopped
  environment:
    ZOOKEEPER_CLIENT_PORT: 2181
    ZOOKEEPER_TICK_TIME: 2000
  volumes:
    - zk_data:/var/lib/zookeeper/data
    - zk_logs:/var/lib/zookeeper/log
  networks:
    - kafka-network
kafka
kafka:
  image: confluentinc/cp-kafka:latest
  container_name: kafka
  restart: unless-stopped
  depends_on:
    - zookeeper
  ports:
    - ${KAFKA_PORT:-9092}:9092
  environment:
    KAFKA_BROKER_ID: 1
    KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
    KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
    KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
    KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
  volumes:
    - kafka_data:/var/lib/kafka/data
  networks:
    - kafka-network
schema-registry
schema-registry:
  image: confluentinc/cp-schema-registry:latest
  container_name: schema-registry
  restart: unless-stopped
  depends_on:
    - kafka
  ports:
    - ${SCHEMA_REGISTRY_PORT:-8081}:8081
  environment:
    SCHEMA_REGISTRY_HOST_NAME: schema-registry
    SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: kafka:29092
  networks:
    - kafka-network
ksqldb
ksqldb:
  image: confluentinc/ksqldb-server:latest
  container_name: ksqldb
  restart: unless-stopped
  depends_on:
    - kafka
    - schema-registry
  ports:
    - ${KSQL_PORT:-8088}:8088
  environment:
    KSQL_BOOTSTRAP_SERVERS: kafka:29092
    KSQL_LISTENERS: http://0.0.0.0:8088
    KSQL_KSQL_SCHEMA_REGISTRY_URL: http://schema-registry:8081
  networks:
    - kafka-network

Quick Start

terminal
# 1. Create the compose file
cat > docker-compose.yml << 'EOF'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    container_name: zookeeper
    restart: unless-stopped
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    volumes:
      - zk_data:/var/lib/zookeeper/data
      - zk_logs:/var/lib/zookeeper/log
    networks:
      - kafka-network

  kafka:
    image: confluentinc/cp-kafka:latest
    container_name: kafka
    restart: unless-stopped
    depends_on:
      - zookeeper
    ports:
      - "${KAFKA_PORT:-9092}:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    volumes:
      - kafka_data:/var/lib/kafka/data
    networks:
      - kafka-network

  schema-registry:
    image: confluentinc/cp-schema-registry:latest
    container_name: schema-registry
    restart: unless-stopped
    depends_on:
      - kafka
    ports:
      - "${SCHEMA_REGISTRY_PORT:-8081}:8081"
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: kafka:29092
    networks:
      - kafka-network

  ksqldb:
    image: confluentinc/ksqldb-server:latest
    container_name: ksqldb
    restart: unless-stopped
    depends_on:
      - kafka
      - schema-registry
    ports:
      - "${KSQL_PORT:-8088}:8088"
    environment:
      KSQL_BOOTSTRAP_SERVERS: kafka:29092
      KSQL_LISTENERS: http://0.0.0.0:8088
      KSQL_KSQL_SCHEMA_REGISTRY_URL: http://schema-registry:8081
    networks:
      - kafka-network

volumes:
  zk_data:
  zk_logs:
  kafka_data:

networks:
  kafka-network:
    driver: bridge
EOF

# 2. Create the .env file
cat > .env << 'EOF'
# Kafka Full Stack
KAFKA_PORT=9092
SCHEMA_REGISTRY_PORT=8081
KSQL_PORT=8088
EOF

# 3. Start the services
docker compose up -d

# 4. View logs
docker compose logs -f

One-Liner

Run this command to download and set up the recipe in one step:

terminal
curl -fsSL https://docker.recipes/api/recipes/apache-kafka-full-stack/run | bash

Troubleshooting

  • ZooKeeper connection errors ('java.net.ConnectException'): ensure the ZooKeeper container is fully started before Kafka attempts to connect, and check the container logs for port-binding issues
  • Kafka 'Not enough replicas' warnings: set KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR to 1 for single-broker development setups; increase it for production clusters
  • Schema Registry 'Schema not found' errors: verify the schema compatibility settings and ensure subjects are registered before producing or consuming with schemas
  • ksqlDB 'Topic does not exist' failures: create Kafka topics explicitly with the kafka-topics CLI before referencing them in ksqlDB streams or tables
  • OutOfMemory errors in Kafka: increase the container memory limit and tune the JVM heap via the KAFKA_HEAP_OPTS environment variable
  • Consumer lag: monitor partition distribution and add consumer instances within the group; check for hot partitions causing processing bottlenecks
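For the 'Schema not found' case, subjects can be registered directly through the Schema Registry REST API. The subject name orders-value and the Order record below are illustrative assumptions:

```shell
# Register an Avro schema under a subject (the schema field is a
# JSON-escaped Avro definition).
curl -s -X POST \
  -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  --data '{"schema": "{\"type\":\"record\",\"name\":\"Order\",\"fields\":[{\"name\":\"order_id\",\"type\":\"string\"}]}"}' \
  http://localhost:8081/subjects/orders-value/versions

# List all registered subjects to confirm.
curl -s http://localhost:8081/subjects
```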


