docker.recipes

Cassandra Cluster

advanced

Apache Cassandra distributed NoSQL database cluster setup.

Overview

Apache Cassandra is a highly scalable, distributed NoSQL database originally developed at Facebook and later open-sourced and donated to the Apache Software Foundation. Designed to handle large volumes of data across many commodity servers, Cassandra avoids single points of failure through its peer-to-peer architecture and offers tunable consistency levels, so each query can trade consistency against latency and availability.

This three-node configuration creates a cluster in which each node operates as an equal peer in the ring. cassandra-1 acts as the seed node that bootstraps ring formation; all nodes share the same cluster name and discover each other through Cassandra's gossip protocol. The setup provides horizontal scalability, automatic data distribution, and fault tolerance with a replication factor configured per keyspace. Because all three containers run on a single Docker host, it is suited to development and testing rather than production as-is.

This stack is a good fit for write-heavy applications, time-series data collection, and applications that need high availability or multi-datacenter distribution. Workloads such as IoT sensor data, financial transactions, social media activity, or recommendation engines benefit from Cassandra's high write throughput and low-latency reads, including across geographically distributed locations.
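
The replication factor mentioned above is chosen when a keyspace is created. As a minimal sketch, assuming a hypothetical keyspace called demo and the single datacenter this compose file produces, the helper below only prints a CQL statement. The function name keyspace_cql is an illustration and not part of the recipe.

```shell
# Hypothetical helper (not part of the recipe): prints a CREATE KEYSPACE
# statement for a given keyspace name and replication factor.
# SimpleStrategy is assumed because the stack runs a single datacenter.
keyspace_cql() {
  name="$1"
  rf="$2"
  printf "CREATE KEYSPACE IF NOT EXISTS %s WITH replication = {'class': 'SimpleStrategy', 'replication_factor': %s};\n" "$name" "$rf"
}
```

Once the cluster is up, the statement could be applied with: docker exec cassandra-1 cqlsh -e "$(keyspace_cql demo 3)". With three nodes and a replication factor of 3, every node holds a copy of each row.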

Key Features

  • Near-linear horizontal scalability: throughput grows as nodes are added
  • Masterless architecture with no single point of failure
  • Tunable consistency levels from eventual to strong consistency
  • Multi-datacenter replication with rack-aware topology
  • CQL (Cassandra Query Language) support for SQL-like operations
  • Time-series data optimization with TTL and compaction strategies
  • Wide column store model supporting complex data structures
  • Gossip protocol for automatic node discovery and health monitoring
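
The tunable consistency levels listed above come down to simple arithmetic on replica counts. The sketch below uses two hypothetical helper functions (quorum_size and is_strongly_consistent are names chosen here, not Cassandra tooling) to show how many replicas a QUORUM request needs and when a read/write combination overlaps.

```shell
# Replicas that must acknowledge a QUORUM request for a given
# replication factor: floor(RF / 2) + 1.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# Succeeds when read and write replica counts overlap (R + W > RF),
# the usual condition for a read to see the latest acknowledged write.
# Arguments: replication factor, read replicas, write replicas.
is_strongly_consistent() {
  rf="$1"
  r="$2"
  w="$3"
  [ $(( r + w )) -gt "$rf" ]
}
```

For a replication factor of 3, quorum_size 3 yields 2, so QUORUM reads combined with QUORUM writes (2 + 2 > 3) overlap, while ONE reads with ONE writes (1 + 1) do not and remain eventually consistent.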

Common Use Cases

  • IoT sensor data collection and time-series analytics platforms
  • Financial transaction processing systems requiring high availability
  • Social media platforms handling millions of user interactions
  • E-commerce recommendation engines with real-time personalization
  • Gaming leaderboards and player statistics tracking
  • Log aggregation and monitoring systems for distributed applications
  • Content delivery networks requiring global data distribution

Prerequisites

  • Minimum 8GB RAM on the host (at least 2GB per node)
  • Port 9042 available for CQL client connections
  • Understanding of Cassandra's eventual consistency model and CAP theorem
  • Basic knowledge of CQL syntax and data modeling concepts
  • Familiarity with nodetool commands for cluster management
  • At least 20GB free disk space for initial data storage per node

For development & testing. Review security settings, change default credentials, and test thoroughly before production use.

docker-compose.yml

docker-compose.yml
services:
  cassandra-1:
    image: cassandra:4.1
    container_name: cassandra-1
    restart: unless-stopped
    environment:
      CASSANDRA_CLUSTER_NAME: ${CLUSTER_NAME}
      CASSANDRA_SEEDS: cassandra-1
    volumes:
      - cassandra1_data:/var/lib/cassandra
    ports:
      - "9042:9042"
    networks:
      - cassandra-network

  cassandra-2:
    image: cassandra:4.1
    container_name: cassandra-2
    restart: unless-stopped
    environment:
      CASSANDRA_CLUSTER_NAME: ${CLUSTER_NAME}
      CASSANDRA_SEEDS: cassandra-1
    volumes:
      - cassandra2_data:/var/lib/cassandra
    depends_on:
      - cassandra-1
    networks:
      - cassandra-network

  cassandra-3:
    image: cassandra:4.1
    container_name: cassandra-3
    restart: unless-stopped
    environment:
      CASSANDRA_CLUSTER_NAME: ${CLUSTER_NAME}
      CASSANDRA_SEEDS: cassandra-1
    volumes:
      - cassandra3_data:/var/lib/cassandra
    depends_on:
      - cassandra-2
    networks:
      - cassandra-network

volumes:
  cassandra1_data:
  cassandra2_data:
  cassandra3_data:

networks:
  cassandra-network:
    driver: bridge

.env Template

.env
CLUSTER_NAME=MyCluster

Usage Notes

  1. Docs: https://cassandra.apache.org/doc/latest/
  2. Connect via cqlsh: docker exec -it cassandra-1 cqlsh
  3. Check cluster status: docker exec -it cassandra-1 nodetool status
  4. Wait 1-2 minutes between node starts for proper cluster formation
  5. Default CQL port 9042 - use for application connections
  6. Backup: nodetool snapshot, restore with sstableloader
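
Instead of waiting a fixed 1-2 minutes between node starts, the status check from the notes above can be polled. This is a minimal sketch, assuming the container name cassandra-1 from the compose file; count_up_normal and wait_for_cluster are hypothetical helper names, and the 10-second interval and 30-attempt default are arbitrary choices.

```shell
# Counts nodes reported as Up/Normal ("UN") in `nodetool status`
# output read from standard input. Prints 0 when there are none.
count_up_normal() {
  grep -c '^UN' || true
}

# Polls cassandra-1 until at least the expected number of nodes is UN.
# Arguments: expected node count, maximum attempts (default 30, 10 s apart).
# Returns 0 on success and 1 if the attempts run out.
wait_for_cluster() {
  expected="$1"
  attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    up=$(docker exec cassandra-1 nodetool status 2>/dev/null | count_up_normal)
    [ "$up" -ge "$expected" ] && return 0
    i=$(( i + 1 ))
    sleep 10
  done
  return 1
}
```

A staged start could then look like: docker compose up -d cassandra-1 && wait_for_cluster 1 && docker compose up -d cassandra-2 && wait_for_cluster 2 && docker compose up -d cassandra-3 && wait_for_cluster 3.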

Individual Services (3 services)

Copy individual services to mix and match with your existing compose files.

cassandra-1
cassandra-1:
  image: cassandra:4.1
  container_name: cassandra-1
  restart: unless-stopped
  environment:
    CASSANDRA_CLUSTER_NAME: ${CLUSTER_NAME}
    CASSANDRA_SEEDS: cassandra-1
  volumes:
    - cassandra1_data:/var/lib/cassandra
  ports:
    - "9042:9042"
  networks:
    - cassandra-network
cassandra-2
cassandra-2:
  image: cassandra:4.1
  container_name: cassandra-2
  restart: unless-stopped
  environment:
    CASSANDRA_CLUSTER_NAME: ${CLUSTER_NAME}
    CASSANDRA_SEEDS: cassandra-1
  volumes:
    - cassandra2_data:/var/lib/cassandra
  depends_on:
    - cassandra-1
  networks:
    - cassandra-network
cassandra-3
cassandra-3:
  image: cassandra:4.1
  container_name: cassandra-3
  restart: unless-stopped
  environment:
    CASSANDRA_CLUSTER_NAME: ${CLUSTER_NAME}
    CASSANDRA_SEEDS: cassandra-1
  volumes:
    - cassandra3_data:/var/lib/cassandra
  depends_on:
    - cassandra-2
  networks:
    - cassandra-network

Quick Start

terminal
# 1. Create the compose file
cat > docker-compose.yml << 'EOF'
services:
  cassandra-1:
    image: cassandra:4.1
    container_name: cassandra-1
    restart: unless-stopped
    environment:
      CASSANDRA_CLUSTER_NAME: ${CLUSTER_NAME}
      CASSANDRA_SEEDS: cassandra-1
    volumes:
      - cassandra1_data:/var/lib/cassandra
    ports:
      - "9042:9042"
    networks:
      - cassandra-network

  cassandra-2:
    image: cassandra:4.1
    container_name: cassandra-2
    restart: unless-stopped
    environment:
      CASSANDRA_CLUSTER_NAME: ${CLUSTER_NAME}
      CASSANDRA_SEEDS: cassandra-1
    volumes:
      - cassandra2_data:/var/lib/cassandra
    depends_on:
      - cassandra-1
    networks:
      - cassandra-network

  cassandra-3:
    image: cassandra:4.1
    container_name: cassandra-3
    restart: unless-stopped
    environment:
      CASSANDRA_CLUSTER_NAME: ${CLUSTER_NAME}
      CASSANDRA_SEEDS: cassandra-1
    volumes:
      - cassandra3_data:/var/lib/cassandra
    depends_on:
      - cassandra-2
    networks:
      - cassandra-network

volumes:
  cassandra1_data:
  cassandra2_data:
  cassandra3_data:

networks:
  cassandra-network:
    driver: bridge
EOF

# 2. Create the .env file
cat > .env << 'EOF'
CLUSTER_NAME=MyCluster
EOF

# 3. Start the services
docker compose up -d

# 4. View logs
docker compose logs -f

One-Liner

Run this command to download and set up the recipe in one step:

terminal
curl -fsSL https://docker.recipes/api/recipes/cassandra-cluster/run | bash

Troubleshooting

  • Node shows as DOWN in nodetool status: Check if containers started in sequence with proper delays between node initialization
  • Connection refused on port 9042: Wait 2-3 minutes after cluster startup for native transport to become available
  • Cluster forms with wrong datacenter names: Set CASSANDRA_DC and CASSANDRA_RACK environment variables explicitly, together with CASSANDRA_ENDPOINT_SNITCH=GossipingPropertyFileSnitch so the values take effect
  • High memory usage or OOM errors: Increase container memory limits and tune JVM heap settings via the MAX_HEAP_SIZE and HEAP_NEWSIZE environment variables (set both together)
  • Nodes not discovering each other: Verify CASSANDRA_SEEDS points to correct seed node hostnames within Docker network
  • Write/read timeouts during operations: Adjust consistency levels in CQL queries or increase request timeout values
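
When a node shows as DOWN, it helps to pull the affected addresses out of the status table before digging into logs. The sketch below assumes the usual nodetool status layout, where each node row begins with a two-letter state code followed by the address; down_nodes is a hypothetical helper name.

```shell
# Prints the address of every node whose state is not Up/Normal ("UN").
# Reads `nodetool status` output from standard input. Node rows start
# with a two-letter code: U or D, then N, L, J or M.
down_nodes() {
  awk '/^[UD][NLJM] / && $1 != "UN" { print $2 }'
}
```

It could be used as: docker exec cassandra-1 nodetool status | down_nodes. An empty result means every listed node is up and in the normal state; otherwise, docker compose logs for the matching service is the next place to look.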
