Apache Cassandra is an open-source, distributed, high-performance, highly scalable, and fault-tolerant post-relational database. It can serve both as a real-time data store for online/transactional applications and as a read-intensive database for business intelligence systems.
Relational DB Vs Cassandra
Relational database systems handle moderate incoming data velocity and fetch data from one or a few locations. They manage primarily structured data and support complex/nested transactions, but they have single points of failure and rely on failover.
Cassandra handles high incoming data velocity and fetches data from many locations. It manages all data types and supports simple transactions with no single point of failure, so it provides constant uptime. In addition, it provides read/write scalability.
In this article, I describe how I installed Apache Cassandra and ran a single-node cluster on my Ubuntu 16.04 server.
- A Java platform to run Cassandra
- A dedicated user to run the application
Cassandra runs on the Java platform, so make sure a recent Java version is installed on your server. You can update the APT package index and then install Java. Cassandra 3 or later requires Java 8 or newer.
root@ubuntu:~# apt-get update
root@ubuntu:~# apt-get install default-jdk
Setting up default-jdk (2:1.8-56ubuntu2) ...
Setting up gconf-service-backend (3.2.6-3ubuntu6) ...
Setting up gconf2 (3.2.6-3ubuntu6) ...
Setting up libgnomevfs2-common (1:2.24.4-6.1ubuntu1) ...
Setting up libgnomevfs2-0:amd64 (1:2.24.4-6.1ubuntu1) ...
Setting up libgnome2-common (2.32.1-5ubuntu1) ...
Setting up libgnome-2-0:amd64 (2.32.1-5ubuntu1) ...
Processing triggers for libc-bin (2.23-0ubuntu3) ...
Processing triggers for ureadahead (0.100.0-19) ...
Processing triggers for systemd (229-4ubuntu4) ...
Processing triggers for ca-certificates (20160104ubuntu1) ...
Updating certificates in /etc/ssl/certs...
0 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d...
You can then confirm which Java version was installed.
root@ubuntu:~# java -version
openjdk version "1.8.0_91"
OpenJDK Runtime Environment (build 1.8.0_91-8u91-b14-0ubuntu4~16.04.1-b14)
OpenJDK 64-Bit Server VM (build 25.91-b14, mixed mode)
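If you want to script this check rather than read the output by eye, here is a minimal sketch. The helper name java_major is my own, and the threshold of 8 simply mirrors the requirement stated above; check the release notes for which newer Java releases your Cassandra version supports.

```shell
# Extract the major Java version from a string such as "1.8.0_91" or "11.0.2".
# java_major is a hypothetical helper name, not part of Java or Cassandra.
java_major() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;                        # legacy scheme: 1.8.0_91 -> 8
        *)   echo "$1" | cut -d. -f1 | sed 's/[^0-9].*$//' ;;  # modern scheme: 11.0.2 -> 11
    esac
}

# `java -version` prints to stderr; the first line carries the quoted version string.
if command -v java >/dev/null 2>&1; then
    version=$(java -version 2>&1 | head -n 1 | cut -d'"' -f2)
    if [ "$(java_major "$version")" -ge 8 ]; then
        echo "Java $version meets the Java 8 minimum"
    else
        echo "Java $version is older than Java 8"
    fi
else
    echo "java not found on PATH"
fi
```

With the version shown above ("1.8.0_91"), java_major yields 8, so the check passes.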
Creating a user to run Cassandra
It is always recommended to run this application as a dedicated user rather than as root, so I created a cassandra user for it.
root@ubuntu:~# groupadd cassandra
root@ubuntu:~# useradd -d /home/cassandra -s /bin/bash -m -g cassandra cassandra
root@ubuntu:~# grep cassandra /etc/passwd
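The commands above fail if the group or user already exists, which matters when you rerun the setup. A rough sketch of a repeatable version follows; the function names user_exists and ensure_cassandra_user are my own, and the snippet only reports the current state unless you call ensure_cassandra_user yourself as root.

```shell
# Succeed if the named account exists.
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

# Create the cassandra group and user only when the user is missing (needs root).
# ensure_cassandra_user is a hypothetical helper wrapping the commands shown above.
ensure_cassandra_user() {
    if user_exists cassandra; then
        echo "user cassandra already exists"
        return 0
    fi
    groupadd -f cassandra   # -f: succeed even if the group already exists
    useradd -d /home/cassandra -s /bin/bash -m -g cassandra cassandra
}

# Report the current state without changing anything.
if user_exists cassandra; then
    echo "cassandra user: present"
else
    echo "cassandra user: missing (run ensure_cassandra_user as root)"
fi
```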
Download and Install Cassandra
Now we can download the latest Apache Cassandra release from the Apache Cassandra website and copy it to a preferred directory. I downloaded the tar file to my /tmp folder and extracted its contents into the cassandra user's home directory.
root@ubuntu:/tmp# wget http://mirror.cc.columbia.edu/pub/software/apache/cassandra/3.6/apache-cassandra-3.6-bin.tar.gz
--2016-06-12 08:36:47-- http://mirror.cc.columbia.edu/pub/software/apache/cassandra/3.6/apache-cassandra-3.6-bin.tar.gz
Resolving mirror.cc.columbia.edu (mirror.cc.columbia.edu)... 22.214.171.124
Connecting to mirror.cc.columbia.edu (mirror.cc.columbia.edu)|126.96.36.199|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 35552323 (34M) [application/x-gzip]
Saving to: ‘apache-cassandra-3.6-bin.tar.gz’
apache-cassandra-3.6-bin.tar.gz 100%[===================================================================>] 33.91M 6.43MB/s in 12s
2016-06-12 08:37:01 (2.93 MB/s) - ‘apache-cassandra-3.6-bin.tar.gz’ saved [35552323/35552323]
root@ubuntu:/tmp# tar -xvf apache-cassandra-3.6-bin.tar.gz -C /home/cassandra --strip-components=1
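Since the archive comes from a mirror over plain HTTP, it is worth checking it against a digest you trust before extracting. This is only a sketch: verify_sha256 and the EXPECTED_SHA256 variable are names I made up, and it assumes you have obtained a SHA-256 digest for the file from a trusted source (which digest types are published alongside a given release varies).

```shell
# Succeed only if the file's SHA-256 digest equals the expected hex string.
# verify_sha256 is a hypothetical helper name.
verify_sha256() {
    actual=$(sha256sum "$1" | cut -d' ' -f1)
    [ "$actual" = "$2" ]
}

archive=/tmp/apache-cassandra-3.6-bin.tar.gz
expected=${EXPECTED_SHA256:-}   # hypothetical variable: set it to a digest you trust

if [ -f "$archive" ] && [ -n "$expected" ]; then
    if verify_sha256 "$archive" "$expected"; then
        echo "checksum OK"
    else
        echo "checksum mismatch, do not extract $archive"
    fi
else
    echo "skipping check: archive or EXPECTED_SHA256 not available"
fi
```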
Correcting the ownerships and setting variables
Next, correct the ownership of the files and set the environment variables needed to run the application smoothly.
root@ubuntu:/home/cassandra# export CASSANDRA_HOME=/home/cassandra
root@ubuntu:/home/cassandra# export PATH=$PATH:$CASSANDRA_HOME/bin
root@ubuntu:/home/cassandra# chown -R cassandra:cassandra .
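Note that export only affects the current shell session, so the variables are gone after you log out or switch users. One way to make them permanent is to append them to a profile file, sketched below. The helper names append_once and persist_cassandra_env are my own, and the choice of /home/cassandra/.profile as the target is an assumption.

```shell
# Append a line to a file only if that exact line is not already present.
# append_once is a hypothetical helper name.
append_once() {
    touch "$2"
    grep -qxF -- "$1" "$2" || printf '%s\n' "$1" >> "$2"
}

# Write the two exports from above into the given profile file.
# Single quotes keep $PATH and $CASSANDRA_HOME literal, so they expand at login time.
persist_cassandra_env() {
    append_once 'export CASSANDRA_HOME=/home/cassandra' "$1"
    append_once 'export PATH=$PATH:$CASSANDRA_HOME/bin' "$1"
}

# Usage (as the cassandra user); safe to run more than once:
#   persist_cassandra_env /home/cassandra/.profile
```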
Now you can switch to the cassandra user and run this application as below:
cassandra@ubuntu:~$ sh bin/cassandra
INFO 09:10:39 Cassandra version: 3.6
INFO 09:10:39 Thrift API version: 20.1.0
INFO 09:10:39 CQL supported versions: 3.4.2 (default: 3.4.2)
INFO 09:10:39 Initializing index summary manager with a memory pool size of 24 MB and a resize interval of 60 minutes
INFO 09:10:39 Starting Messaging Service on localhost/127.0.0.1:7000 (lo)
INFO 09:10:39 Loading persisted ring state
INFO 09:10:39 Starting up server gossip
INFO 09:10:39 Updating topology for localhost/127.0.0.1
INFO 09:10:39 Updating topology for localhost/127.0.0.1
INFO 09:10:39 Node localhost/127.0.0.1 state jump to NORMAL
This output means your Cassandra server is now up and running. We can check and confirm the status of our cluster with this command.
root@ubuntu:/home/cassandra# nodetool status
-- Address Load Tokens Owns (effective) Host ID Rack
UN 127.0.0.1 142.65 KiB 256 100.0% fc76be14-acde-47d4-a4a2-5d015804bb3c rack1
The status and state notation UN means the node is up (U) and in the normal (N) state.
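For monitoring or provisioning scripts, the same check can be automated by reading the two-letter code at the start of each node line. The sketch below is one possible approach; all_nodes_up_normal is a name I chose, and it assumes the output layout shown above.

```shell
# Read `nodetool status` output on stdin; succeed only if at least one node
# line is present and every node line reports UN.
# all_nodes_up_normal is a hypothetical helper name.
all_nodes_up_normal() {
    awk '
        # Node lines start with a two-letter code:
        # U/D (up/down) followed by N/L/J/M (normal/leaving/joining/moving).
        /^[UD][NLJM] / {
            total++
            if ($1 == "UN") up++
        }
        END { exit !(total > 0 && up == total) }
    '
}

if command -v nodetool >/dev/null 2>&1; then
    if nodetool status | all_nodes_up_normal; then
        echo "all nodes are up and normal"
    else
        echo "at least one node is not UN, or the node is unreachable"
    fi
else
    echo "nodetool not found on PATH"
fi
```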
We are done installing a single-node Cassandra cluster. Now we can see how to connect to it.
Connecting to our Cluster
We can run the "cqlsh" shell script to connect to our cluster node.
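As a quick smoke test, cqlsh can also run statements from a file instead of interactively. Here is a small sketch; the keyspace demo and the table users are made-up names for illustration, and it assumes the node listens on 127.0.0.1 at the default native port 9042.

```shell
# Write a few CQL statements to a scratch file.
# The keyspace "demo" and table "users" are hypothetical names.
cql_file=$(mktemp)
cat > "$cql_file" <<'EOF'
CREATE KEYSPACE IF NOT EXISTS demo
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
CREATE TABLE IF NOT EXISTS demo.users (id int PRIMARY KEY, name text);
INSERT INTO demo.users (id, name) VALUES (1, 'alice');
SELECT id, name FROM demo.users;
EOF

# Run the file against the local node if cqlsh is on the PATH.
if command -v cqlsh >/dev/null 2>&1; then
    cqlsh 127.0.0.1 9042 -f "$cql_file" || echo "cqlsh could not run the statements"
else
    echo "cqlsh not found; CQL written to $cql_file"
fi
```

A replication factor of 1 with SimpleStrategy fits a single-node cluster like this one; a multi-node setup would need different replication settings.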
From the cqlsh prompt you can run the various CQL commands used in Cassandra. You can get more information on how to use them in the CQL documentation.
Howdy! We're done setting up a single-node Cassandra cluster on our Ubuntu 16.04 server. I hope you enjoyed reading this. I welcome your valuable comments and suggestions.