Improving Dynamic Scaling Performance of Cassandra

Saneyasu YAMAGUCHI  Yuki MORIMITSU  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E100-D   No.4   pp.682-692
Publication Date: 2017/04/01
Online ISSN: 1745-1361
Type of Manuscript: Special Section PAPER (Special Section on Data Engineering and Information Management)
Category: 
Keyword: 
KVS,  key-value store,  Cassandra,  page cache,  

Full Text: PDF(2.1MB)
>>Buy this Article


Summary: 
Load size for a service on the Internet changes remarkably every hour. Thus, it is expected for service system scales to change dynamically according to load size. KVS (key-value store) is a scalable DBMS (database management system) widely used in largescale Internet services. In this paper, we focus on Cassandra, a popular open-source KVS implementation, and discuss methods for improving dynamic scaling performance. First, we evaluate node joining time, which is the time to complete adding a node to a running KVS system, and show that its bottleneck process is disk I/O. Second, we analyze disk accesses in the nodes and indicate that some heavily accessed files cause a large number of disk accesses. Third, we propose two methods for improving elasticity, which means decreasing node adding and removing time, of Cassandra. One method reduces disk accesses significantly by keeping the heavily accessed file in the page cache. The other method optimizes I/O scheduler behavior. Lastly, we evaluate elasticity of our methods. Our experimental results demonstrate that the methods can improve the scaling-up and scaling-down performance of Cassandra.