Docker is a convenient way to try out new technologies quickly, without the hassle of installing them.
In this post, we will write a simple Docker Compose file that starts a three-node Cassandra cluster.
Docker-Compose file
The first step is, of course, to have Docker installed on your system.
A newer version of this file, using version 3 of the Compose file format, is available here.
Create a text file, name it docker-compose.yml and copy/paste the following text via your preferred text tool.
version: '2'
services:
  ###############################
  cassandra0:
    image: cassandra
    container_name: cassandra0
    ports:
      - 9042:9042
      - 7199:7199
  ###############################
  cassandra1:
    image: cassandra
    container_name: cassandra1
    ports:
      - 9142:9042
    links:
      - cassandra0:seed
    environment:
      - CASSANDRA_SEEDS=seed
  ###############################
  cassandra2:
    image: cassandra
    container_name: cassandra2
    ports:
      - 9242:9042
    links:
      - cassandra0:seed
    environment:
      - CASSANDRA_SEEDS=seed
This Compose file defines three containers. The cassandra0 container acts as the seed node. The other two link to it under the alias "seed" and pass that alias to Cassandra through the CASSANDRA_SEEDS environment variable, so both nodes contact cassandra0 to join the cluster.
Start the containers via the following command:
docker-compose up -d
Checking the installation
Check the status of the first node via the following command:
docker exec cassandra0 nodetool status
You should get the same results as the ones shown in the following screenshot:
Repeating the same operation and using the other container names (cassandra1, cassandra2) should give the same results.
The important part of the output is the pair of letters at the start of each node line. It should read UN: U for Up and N for Normal.
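The same check can be scripted. The sketch below is one possible way to do it and is not part of the original setup: it runs the nodetool command from Python and keeps the two-letter code of each node line. The helper names (parse_node_states, all_nodes_up, check_cluster) are hypothetical, and the parsing assumes that each node line starts with the two-letter code followed by the node address.

```python
import subprocess

# First letter: U(p) or D(own). Second letter: N(ormal), L(eaving), J(oining), M(oving).
STATUS_LETTERS = "UD"
STATE_LETTERS = "NLJM"


def parse_node_states(nodetool_output):
    """Return {address: two-letter code} for every node line in the output."""
    states = {}
    for line in nodetool_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        code = fields[0]
        if len(code) == 2 and code[0] in STATUS_LETTERS and code[1] in STATE_LETTERS:
            states[fields[1]] = code
    return states


def all_nodes_up(nodetool_output, expected_nodes=3):
    """True when exactly expected_nodes nodes are listed and all of them are UN."""
    states = parse_node_states(nodetool_output)
    return len(states) == expected_nodes and all(s == "UN" for s in states.values())


def check_cluster(container="cassandra0", expected_nodes=3):
    """Run nodetool status inside the given container and evaluate the result."""
    output = subprocess.check_output(
        ["docker", "exec", container, "nodetool", "status"]
    ).decode()
    return all_nodes_up(output, expected_nodes)
```

Calling check_cluster() once the containers are running should return True when all three nodes report UN. Nodes may need some time to join after startup, so an early False is not necessarily a failure.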
Creating a keyspace via Python
Use the following Python program to create a keyspace and a table inside it.
import logging

log = logging.getLogger()
log.setLevel('INFO')
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("%(asctime)s [%(levelname)s] %(name)s: %(message)s"))
log.addHandler(handler)

from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

KEYSPACE = "mykeyspace"

def createKeySpace():
    # Port 9142 is the host port mapped to the cassandra1 container.
    cluster = Cluster(contact_points=['127.0.0.1'], port=9142)
    session = cluster.connect()

    log.info("Creating keyspace...")
    try:
        session.execute("""
            CREATE KEYSPACE %s
            WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': '2' }
            """ % KEYSPACE)

        log.info("setting keyspace...")
        session.set_keyspace(KEYSPACE)

        log.info("creating table...")
        session.execute("""
            CREATE TABLE mytable (
                mykey text,
                col1 text,
                col2 text,
                PRIMARY KEY (mykey, col1)
            )
            """)
    except Exception as e:
        log.error("Unable to create keyspace")
        log.error(e)

createKeySpace()
You can use the cqlsh tool included in each container to check your keyspace, using the following command:
docker exec -it cassandra0 cqlsh
At the cqlsh prompt, the command "describe mykeyspace" shows the details of the newly created keyspace.
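The keyspace can also be checked from Python through the schema metadata that the driver loads when it connects. The following is a minimal sketch under that assumption: the helper names list_tables and check_keyspace are hypothetical, and the port and keyspace defaults are the ones used earlier in this post.

```python
def list_tables(cluster, keyspace):
    """Return the sorted table names of a keyspace, or None if it does not exist."""
    ks_meta = cluster.metadata.keyspaces.get(keyspace)
    if ks_meta is None:
        return None
    return sorted(ks_meta.tables)


def check_keyspace(port=9142, keyspace="mykeyspace"):
    """Connect to one node and list the tables of the given keyspace."""
    # Imported here so that list_tables can be used without the driver installed.
    from cassandra.cluster import Cluster

    cluster = Cluster(contact_points=['127.0.0.1'], port=port)
    try:
        cluster.connect()  # connecting populates cluster.metadata
        return list_tables(cluster, keyspace)
    finally:
        cluster.shutdown()
```

After running the program above, check_keyspace() would be expected to return a list containing "mytable", and None if the keyspace was never created.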
Let’s insert some data
We will again use Python to insert a few records into our table. Add the following function to the program.
def insertData(number):
    cluster = Cluster(contact_points=['127.0.0.1'], port=9142)
    session = cluster.connect()

    log.info("setting keyspace...")
    session.set_keyspace(KEYSPACE)

    # Prepare the statement once and bind new values for every row.
    prepared = session.prepare("""
        INSERT INTO mytable (mykey, col1, col2)
        VALUES (?, ?, ?)
        """)

    for i in range(number):
        if i % 100 == 0:
            log.info("inserting row %d" % i)
        session.execute(prepared.bind(("rec_key_%d" % i, 'aaa', 'bbb')))

insertData(1000)
The freshly inserted data can be read back with a function similar to the one below:
def readRows():
    cluster = Cluster(contact_points=['127.0.0.1'], port=9142)
    session = cluster.connect()

    log.info("setting keyspace...")
    session.set_keyspace(KEYSPACE)

    rows = session.execute("SELECT * FROM mytable")
    log.info("key\tcol1\tcol2")
    log.info("---------\t----\t----")

    count = 0
    for row in rows:
        # Log only every hundredth row to keep the output short.
        if count % 100 == 0:
            log.info('\t'.join(row))
        count = count + 1

    log.info("Total")
    log.info("-----")
    log.info("rows %d" % count)
Stopping and restarting one of the three containers makes it easy to check that the cluster remains available when one of its nodes is shut down. To talk to a particular node, change the port in the Python program from 9142 to 9042 or 9242.
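Before pointing the program at another node, it can help to see which of the published ports currently accept connections. The sketch below uses only the Python 3 standard library. The names NODE_PORTS, port_is_open and reachable_nodes are hypothetical, and the port numbers are the ones published in the Compose file above. An open port only shows that the container is up and listening, not that Cassandra has finished joining the cluster.

```python
import socket

# Host ports published by the Compose file for each container.
NODE_PORTS = {"cassandra0": 9042, "cassandra1": 9142, "cassandra2": 9242}


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def reachable_nodes(host="127.0.0.1", ports=NODE_PORTS):
    """Return {container name: True/False} for each published port."""
    return {name: port_is_open(host, port) for name, port in ports.items()}
```

For example, after "docker stop cassandra1", reachable_nodes() would be expected to report False for cassandra1, in which case the program should be switched to port 9042 or 9242.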
November 8, 2017 at 10:42 am
Hi, I tried the steps above to set up a Cassandra cluster, but I see the error below on the seed node:
INFO [main] 2017-11-08 10:37:27,297 CassandraDaemon.java:527 – Not starting RPC server as requested. Use JMX (StorageService->startRPCServer()) or nodetool (enablethrift) to start it
INFO [OptionalTasks:1] 2017-11-08 10:37:29,259 CassandraRoleManager.java:355 – Created default superuser role ‘cassandra’
Also, in the logs from the other node I see the error below:
WARN [main] 2017-11-08 10:37:46,504 SimpleSeedProvider.java:60 – Seed provider couldn’t lookup host seed
Exception (org.apache.cassandra.exceptions.ConfigurationException) encountered during startup: The seed provider lists no seeds.
The seed provider lists no seeds.
ERROR [main] 2017-11-08 10:37:46,522 CassandraDaemon.java:706 – Exception encountered during startup: The seed provider lists no seeds.
Could you please let me know where I am going wrong?
Thanks in advance 🙂
November 8, 2017 at 12:45 pm
I suppose that something changed in the latest version of the cassandra image that makes this docker-compose file stop working. You can try the docker-compose file from the next blog post on Cassandra here: https://mannekentech.com/2017/01/28/playing-with-kairos-db That file hard-codes the Cassandra version.
November 9, 2017 at 11:12 am
Thanks for the quick reply. However, when I tried the blog post above, only one node out of three comes up. The other two nodes go down with the error below:
Exception (org.apache.cassandra.exceptions.ConfigurationException) encountered during startup: The seed provider lists no seeds.
The seed provider lists no seeds.
ERROR [main] 2017-11-09 11:07:32,066 CassandraDaemon.java:706 – Exception encountered during startup: The seed provider lists no seeds.
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
57f466b42703 cassandra “/docker-entrypoin…” 7 minutes ago Up 7 minutes 7000-7001/tcp, 7199/tcp, 9160/tcp, 0.0.0.0:9142->9042/tcp cassandra1
Below is my compose.yml:
version: '2'
services:
  ###############################
  cassandra0:
    image: cassandra
    container_name: cassandra0
    ports:
      - 9042:9042
      - 9160:9160
      - 7199:7199
      - 8778:8778
    environment:
      - CASSANDRA_START_RPC=true
  ###############################
  cassandra1:
    image: cassandra
    container_name: cassandra1
    command: /bin/bash -c "echo 'Waiting for seed node' && sleep 30 && /docker-entrypoint.sh cassandra -f"
    ports:
      - 9142:9042
    links:
      - cassandra0:seed
    environment:
      - CASSANDRA_SEEDS=seed
  ###############################
  cassandra2:
    image: cassandra
    container_name: cassandra2
    command: /bin/bash -c "echo 'Waiting for seed node' && sleep 80 && /docker-entrypoint.sh cassandra -f"
    ports:
      - 9242:9042
    links:
      - cassandra0:seed
    environment:
      - CASSANDRA_SEEDS=seed
Could you please provide any pointers?
November 11, 2017 at 5:07 pm
I rewrote a working docker-compose file with three nodes, Kairos and Grafana. You will find the code here: https://mannekentech.com/2017/11/11/playing-with-docker-and-cassandra/
November 14, 2017 at 7:33 am
Thank you, Arnaud. It's working now. I would like to know why Kairos is used here. Is it related to the Cassandra cluster in any way?
November 14, 2017 at 7:37 am
You can remove KairosDB if you are not using it. It is simply a nice time series database built on top of Cassandra.