Tag Archives: NoSQL

Cassandra cluster

It’s live and it’s quick!

I helped to implement a Cassandra cluster inside an Amazon VPC earlier this year. It worked fine if a bit slow. We tried increasing the number of nodes (Scale out) and we tried larger nodes (Scale up) along with tuning Cassandra and the application. In the end we could make the application go a bit faster but the connection to the Cassandra cluster seemed to be the limiting factor. The decision was made for speed reasons to get the Cassandra nodes into our rack so that we had a fast link between the application server and the Cassandra Cluster.

We had loads of fun testing configuration options to get a well balanced specification of server while fitting inside the power and financial budget.

We had a specialist from Acunu for a quick training session with us to ensure we have a good grasp of the pretty specialist requirements of Cassandra; its tuning; Maintenance and the underlying technology. This helped us to come to understand the magic triad that needs to be balanced to run a node effectively.

The Triad of per node balance.

Memory – Minimum of 8GB of RAM, 4GB for Java heap and 4 GB of Cache. The Maximum (not a hard limit as you will see) is 16GB of RAM, 8GB for the Java heap and 8GB for the Cache. The Maximum is 16GB because if the heap is too large then the full Java Garbage Collection which pauses the all process will take too long and the node will fall out of the cluster. In our experience a Dual socket Quad core system (8 Hardware threads) can pause for upwards of 10 seconds in a production machine doing a full garbage collection of 8GB of RAM.

Processor – Quad core is a minimum. It mainly effects the garbage collection and SSTables compaction times but these are what will cause your cluster to do funny things like drop a node for a few minutes.

Disk – Disks are a must not SSD’s. Under the hood Cassandra does all of its writing to disk in a serial fashion which is optimized for traditional spinning media. The commit log should have its own disk as it is effectively being written to continuously and this is why an SSD will not last long. The data disk needs high throughput as well as fast access this will allow the Memtables to be flushed to disk as quickly as possible reducing the impact on other actions.

In the end we bought systems with the following specs

  •  1 * 6 core (12 including hyper threadding) Xeon processor running at 2.0 Ghz
  • 16GB of Ram (Quad channel DDR3)
  • 1 OS disk 7,200 RPM
  • 1 Commit log 10,000 RPM disk
  • 3 Data disks in a hardware stripe 10,000 RPM disks

Each node runs at less than 300watts peak and Cassandra shows an avg write latency of 2.2ms!

The performance increase of moving the cluster closer to the application has been huge. One report that would take a minute upwards in AWS (Returning a couple of hundred megabytes of data) now runs in under 10 seconds when the Cassandra system is cold and under 2 when the Cassandra cluster has had time to warm up the Cache.

The one thing that is most important to remember with Cassandra is that Cassandra is a write bias data store. This means that Writes are very fast, generally faster than retrieval so you have to think differently when writing an application that uses Cassandra. It is better to write a new value that overwrites an old one than it is to load an old value then update it with the new value. Get the data model right and the use case for Cassandra will guide you to a new world of highly available fast data storage. If your workload is read biased then Cassandra may not be for you.