G-HBase: A High Performance Geographical Database Based on HBase

Hong Van LE, Atsuhiro TAKASU

IEICE TRANSACTIONS on Information and Systems   Vol.E101-D   No.4   pp.1053-1065
Publication Date: 2018/04/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2017DAP0017
Type of Manuscript: Special Section PAPER (Special Section on Data Engineering and Information Management)
HBase, high performance, spatial index, GeoHash, BGR Partitioning


With the recent explosion of geographic data generated by smartphones, sensors, and satellites, a data store that can handle massive data volumes and support computationally intensive spatial queries is becoming essential. Although key-value stores handle large-scale data efficiently, they lack effective support for geographic data. To solve this problem, we present G-HBase, a high-performance geographical database based on HBase, a standard key-value store. To index geographic data, we first use Geohash as the rowkey in HBase. We then present a novel partitioning method, binary Geohash rectangle (BGR) partitioning, to support spatial queries. Extensive experiments on real datasets show that G-HBase improves the performance of k-nearest-neighbor and range queries compared with SpatialHadoop, a state-of-the-art framework with native support for spatial data. We also observed that the performance of spatial join in G-HBase is on par with that of SpatialHadoop and better than that of the SJMR algorithm in HBase.
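
To illustrate the indexing idea, the following is a minimal sketch of standard Geohash encoding, which turns a latitude/longitude pair into a string that could serve as an HBase rowkey. It is not taken from the paper: the function name `geohash_encode` and the default precision are assumptions made here for illustration, and the exact rowkey layout used by G-HBase is not specified in the abstract.

```python
# Standard Geohash base-32 alphabet (the letters a, i, l, o are omitted).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=12):
    """Encode a point as a Geohash string of `precision` characters.

    Hypothetical helper for illustration only; not the paper's code.
    """
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0           # accumulates 5 bits per output character
    bit_count = 0
    use_lon = True     # bits alternate, starting with longitude

    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0

    return "".join(chars)


if __name__ == "__main__":
    # Points in the same cell share a prefix, so their rowkeys sort together.
    print(geohash_encode(57.64911, 10.40744, precision=11))
```

Because a shorter Geohash is always a prefix of the longer Geohash for the same point, nearby points tend to receive lexicographically close keys. That property is what makes the encoding a plausible fit for a store such as HBase, which keeps rows sorted by rowkey, although points on opposite sides of a cell boundary can still end up with distant keys.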