Apache Kafka Full Stack
Complete Kafka setup with ZooKeeper, Schema Registry, Connect, and ksqlDB.
Overview
Apache ZooKeeper serves as the distributed coordination backbone for Apache Kafka, providing essential services like configuration management, leader election, and cluster membership through its hierarchical namespace and strong consistency guarantees. Originally developed at Yahoo for managing distributed systems, ZooKeeper has become the de facto standard for coordinating Kafka brokers, tracking partition assignments, and managing consumer group offsets.

This Kafka ecosystem combines ZooKeeper's coordination services with Kafka's high-throughput streaming platform, Confluent Schema Registry for Avro schema evolution, and ksqlDB for real-time stream processing and analytics. The stack creates a complete event streaming architecture where ZooKeeper maintains cluster state, Kafka handles message durability and partitioning, Schema Registry ensures data compatibility across services, and ksqlDB enables SQL-based stream transformations.

Organizations building event-driven architectures, real-time analytics platforms, or microservices requiring reliable message ordering will find this stack invaluable for handling millions of events per second while maintaining exactly-once processing semantics. The combination particularly excels in scenarios requiring both high-throughput data ingestion and complex stream processing, making it ideal for financial trading systems, IoT data pipelines, and large-scale log aggregation where data consistency and replay capabilities are critical.
Key Features
- Hierarchical namespace in ZooKeeper for organized configuration and service coordination
- High-throughput message streaming with Kafka's distributed log architecture supporting millions of events per second
- Avro schema evolution and compatibility checking through Confluent Schema Registry
- SQL-based stream processing with ksqlDB for real-time analytics and transformations
- Consumer group coordination enabling parallel processing across multiple application instances
- Exactly-once processing semantics for mission-critical data pipelines
- Log compaction and retention policies for efficient storage and event sourcing patterns
- Distributed partitioning strategy for horizontal scaling and fault tolerance
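The SQL-based stream processing mentioned above can be sketched with a couple of ksqlDB statements. This is an illustrative example, assuming a hypothetical `orders` topic with Avro values registered in Schema Registry; the stream, table, and field names are invented for the sketch:

```sql
-- Declare a stream over the hypothetical 'orders' topic
-- (VALUE_FORMAT='AVRO' pulls the schema from Schema Registry)
CREATE STREAM orders (
  order_id VARCHAR KEY,
  customer VARCHAR,
  amount DOUBLE
) WITH (KAFKA_TOPIC = 'orders', VALUE_FORMAT = 'AVRO');

-- Continuously maintained aggregation: running spend per customer
CREATE TABLE customer_spend AS
  SELECT customer, SUM(amount) AS total
  FROM orders
  GROUP BY customer;
```

Statements like these are submitted through the ksqlDB CLI or its REST endpoint once the stack is up.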
Common Use Cases
- Financial trading platforms requiring microsecond message ordering and replay capabilities
- IoT data ingestion pipelines processing sensor data from thousands of connected devices
- E-commerce platforms implementing event sourcing for order management and inventory tracking
- Real-time fraud detection systems analyzing transaction streams with complex event patterns
- Microservices architectures using event-driven communication for service decoupling
- Log aggregation and monitoring systems collecting application logs from distributed services
- Change data capture (CDC) implementations for database synchronization and data lake population
Prerequisites
- Minimum 6GB RAM available (1GB for Kafka, 1GB for ZooKeeper, 2GB for ksqlDB, plus overhead)
- Docker Engine 20.10+ and Docker Compose v2 for proper networking and volume management
- Available ports 9092 (Kafka), 8081 (Schema Registry), and 8088 (ksqlDB) on the host system
- Basic understanding of Apache Kafka concepts including topics, partitions, and consumer groups
- Knowledge of Avro schema design principles for effective Schema Registry utilization
- Familiarity with SQL syntax for ksqlDB stream processing queries and transformations
For development & testing. Review security settings, change default credentials, and test thoroughly before production use.
docker-compose.yml
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    container_name: zookeeper
    restart: unless-stopped
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    volumes:
      - zk_data:/var/lib/zookeeper/data
      - zk_logs:/var/lib/zookeeper/log
    networks:
      - kafka-network

  kafka:
    image: confluentinc/cp-kafka:latest
    container_name: kafka
    restart: unless-stopped
    depends_on:
      - zookeeper
    ports:
      - "${KAFKA_PORT:-9092}:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    volumes:
      - kafka_data:/var/lib/kafka/data
    networks:
      - kafka-network

  schema-registry:
    image: confluentinc/cp-schema-registry:latest
    container_name: schema-registry
    restart: unless-stopped
    depends_on:
      - kafka
    ports:
      - "${SCHEMA_REGISTRY_PORT:-8081}:8081"
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: kafka:29092
    networks:
      - kafka-network

  ksqldb:
    image: confluentinc/ksqldb-server:latest
    container_name: ksqldb
    restart: unless-stopped
    depends_on:
      - kafka
      - schema-registry
    ports:
      - "${KSQL_PORT:-8088}:8088"
    environment:
      KSQL_BOOTSTRAP_SERVERS: kafka:29092
      KSQL_LISTENERS: http://0.0.0.0:8088
      KSQL_KSQL_SCHEMA_REGISTRY_URL: http://schema-registry:8081
    networks:
      - kafka-network

volumes:
  zk_data:
  zk_logs:
  kafka_data:

networks:
  kafka-network:
    driver: bridge

.env Template
.env
# Kafka Full Stack
KAFKA_PORT=9092
SCHEMA_REGISTRY_PORT=8081
KSQL_PORT=8088

Usage Notes
- Kafka broker at localhost:9092
- Schema Registry at http://localhost:8081
- ksqlDB at http://localhost:8088
- Use the kafka-topics CLI to manage topics
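As a sketch of the kafka-topics workflow, the commands below create a topic, list topics, and round-trip a test message through the console producer and consumer. The `orders` topic name is an example; the commands assume the stack is running via this compose file:

```shell
# Create a topic on the single dev broker (replication factor must be 1)
docker compose exec kafka kafka-topics --bootstrap-server localhost:9092 \
  --create --topic orders --partitions 3 --replication-factor 1

# List topics
docker compose exec kafka kafka-topics --bootstrap-server localhost:9092 --list

# Produce one test message, then read it back
echo 'hello' | docker compose exec -T kafka kafka-console-producer \
  --bootstrap-server localhost:9092 --topic orders
docker compose exec kafka kafka-console-consumer --bootstrap-server localhost:9092 \
  --topic orders --from-beginning --max-messages 1
```

Running these inside the `kafka` container avoids installing the Kafka CLI tools on the host.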
Individual Services (4 services)
Copy individual services to mix and match with your existing compose files.
zookeeper
zookeeper:
  image: confluentinc/cp-zookeeper:latest
  container_name: zookeeper
  restart: unless-stopped
  environment:
    ZOOKEEPER_CLIENT_PORT: 2181
    ZOOKEEPER_TICK_TIME: 2000
  volumes:
    - zk_data:/var/lib/zookeeper/data
    - zk_logs:/var/lib/zookeeper/log
  networks:
    - kafka-network
kafka
kafka:
  image: confluentinc/cp-kafka:latest
  container_name: kafka
  restart: unless-stopped
  depends_on:
    - zookeeper
  ports:
    - "${KAFKA_PORT:-9092}:9092"
  environment:
    KAFKA_BROKER_ID: 1
    KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
    KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
    KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
    KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
  volumes:
    - kafka_data:/var/lib/kafka/data
  networks:
    - kafka-network
schema-registry
schema-registry:
  image: confluentinc/cp-schema-registry:latest
  container_name: schema-registry
  restart: unless-stopped
  depends_on:
    - kafka
  ports:
    - "${SCHEMA_REGISTRY_PORT:-8081}:8081"
  environment:
    SCHEMA_REGISTRY_HOST_NAME: schema-registry
    SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: kafka:29092
  networks:
    - kafka-network
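To confirm Schema Registry is working, its REST API can be exercised with curl. The `orders-value` subject and the one-field `Order` record below are illustrative examples, not part of the recipe:

```shell
# List registered subjects (returns [] on a fresh install)
curl -s http://localhost:8081/subjects

# Register a minimal Avro schema under a hypothetical subject
curl -s -X POST http://localhost:8081/subjects/orders-value/versions \
  -H 'Content-Type: application/vnd.schemaregistry.v1+json' \
  -d '{"schema": "{\"type\":\"record\",\"name\":\"Order\",\"fields\":[{\"name\":\"amount\",\"type\":\"double\"}]}"}'
```

The `-value` suffix follows the default TopicNameStrategy, which maps a topic's value schema to the subject `<topic>-value`.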
ksqldb
ksqldb:
  image: confluentinc/ksqldb-server:latest
  container_name: ksqldb
  restart: unless-stopped
  depends_on:
    - kafka
    - schema-registry
  ports:
    - "${KSQL_PORT:-8088}:8088"
  environment:
    KSQL_BOOTSTRAP_SERVERS: kafka:29092
    KSQL_LISTENERS: http://0.0.0.0:8088
    KSQL_KSQL_SCHEMA_REGISTRY_URL: http://schema-registry:8081
  networks:
    - kafka-network
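A quick way to verify the ksqlDB server is up is through its REST API; `SHOW STREAMS;` is a safe statement to run even on an empty server:

```shell
# Server metadata / health
curl -s http://localhost:8088/info

# Submit a ksqlDB statement over REST
curl -s -X POST http://localhost:8088/ksql \
  -H 'Content-Type: application/vnd.ksql.v1+json' \
  -d '{"ksql": "SHOW STREAMS;", "streamsProperties": {}}'
```

For interactive work, the `ksql` CLI can be pointed at the same endpoint (`http://localhost:8088`).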
Quick Start
terminal
# 1. Create the compose file
cat > docker-compose.yml << 'EOF'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    container_name: zookeeper
    restart: unless-stopped
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    volumes:
      - zk_data:/var/lib/zookeeper/data
      - zk_logs:/var/lib/zookeeper/log
    networks:
      - kafka-network

  kafka:
    image: confluentinc/cp-kafka:latest
    container_name: kafka
    restart: unless-stopped
    depends_on:
      - zookeeper
    ports:
      - "${KAFKA_PORT:-9092}:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    volumes:
      - kafka_data:/var/lib/kafka/data
    networks:
      - kafka-network

  schema-registry:
    image: confluentinc/cp-schema-registry:latest
    container_name: schema-registry
    restart: unless-stopped
    depends_on:
      - kafka
    ports:
      - "${SCHEMA_REGISTRY_PORT:-8081}:8081"
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: kafka:29092
    networks:
      - kafka-network

  ksqldb:
    image: confluentinc/ksqldb-server:latest
    container_name: ksqldb
    restart: unless-stopped
    depends_on:
      - kafka
      - schema-registry
    ports:
      - "${KSQL_PORT:-8088}:8088"
    environment:
      KSQL_BOOTSTRAP_SERVERS: kafka:29092
      KSQL_LISTENERS: http://0.0.0.0:8088
      KSQL_KSQL_SCHEMA_REGISTRY_URL: http://schema-registry:8081
    networks:
      - kafka-network

volumes:
  zk_data:
  zk_logs:
  kafka_data:

networks:
  kafka-network:
    driver: bridge
EOF

# 2. Create the .env file
cat > .env << 'EOF'
# Kafka Full Stack
KAFKA_PORT=9092
SCHEMA_REGISTRY_PORT=8081
KSQL_PORT=8088
EOF

# 3. Start the services
docker compose up -d

# 4. View logs
docker compose logs -f

One-Liner
Run this command to download and set up the recipe in one step:
terminal
curl -fsSL https://docker.recipes/api/recipes/apache-kafka-full-stack/run | bash

Troubleshooting
- ZooKeeper connection errors ('java.net.ConnectException'): Ensure the ZooKeeper container is fully started before Kafka attempts to connect; check container logs for port-binding issues
- Kafka 'Not enough replicas' warnings: Set KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR to 1 for single-broker development setups; increase it for production clusters
- Schema Registry 'Schema not found' errors: Verify schema compatibility settings and ensure subjects are registered before producing or consuming with schemas
- ksqlDB 'Topic does not exist' failures: Create Kafka topics explicitly with the kafka-topics CLI before referencing them in ksqlDB streams or tables
- OutOfMemory errors in Kafka: Increase container memory limits and tune JVM heap settings via the KAFKA_HEAP_OPTS environment variable
- Consumer lag issues: Monitor partition distribution and add consumer instances within groups; check for hot partitions causing processing bottlenecks
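For the OutOfMemory case, KAFKA_HEAP_OPTS can be set alongside the other broker environment variables in a compose override. The heap sizes below are placeholder values; tune them to the container's memory limit:

```yaml
# docker-compose.override.yml sketch -- heap sizes are examples only
services:
  kafka:
    environment:
      KAFKA_HEAP_OPTS: "-Xms512m -Xmx1g"
```

Keep the maximum heap comfortably below the container memory limit so the JVM's off-heap usage does not trigger the OOM killer.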
Components
- zookeeper
- kafka
- schema-registry
- kafka-connect
- ksqldb
Tags
#kafka #streaming #event-driven #zookeeper #ksql
Category
Message Queues & Brokers