Building Hierarchical Spatial Histograms for Exploratory Analysis in Array DBMS

Jing ZHAO  Yoshiharu ISHIKAWA  Lei CHEN  Chuan XIAO  Kento SUGIURA  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E102-D   No.4   pp.788-799
Publication Date: 2019/04/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2018DAP0020
Type of Manuscript: Special Section PAPER (Special Section on Data Engineering and Information Management)
Keyword: spatial histograms, exploratory analysis, array DBMSs

Summary: 
As big data attracts attention in a variety of fields, research on data exploration for analyzing large-scale scientific data has gained popularity. Supporting exploratory analysis of scientific data requires effective summarization and visualization of the target data, as well as seamless cooperation with modern data management systems. In this paper, we focus on the exploratory analysis of scientific array data and, building on the notion of histograms from database research, define a spatial V-Optimal histogram to summarize such data. We propose histogram construction approaches based on a general hierarchical partitioning and on a more specific variant, the l-grid partitioning, for effective and efficient data visualization in scientific data analysis. In addition, we implement the proposed algorithms on a state-of-the-art array DBMS, which is well suited to processing and managing scientific data. Experiments using massive evacuation simulation data for tsunami disasters, real taxi data, and synthetic data verify the effectiveness and efficiency of our methods.
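
As a rough illustration of the idea behind a V-Optimal histogram over array data, the sketch below measures a bucket's error as the sum of squared deviations from the bucket mean, then refines a 2-D array hierarchically. It is not the construction algorithm of the paper: the greedy quadrant split, the bucket layout, and all names (`bucket_sse`, `split_quadrants`, `greedy_histogram`) are assumptions chosen only to make the error criterion concrete.

```python
from typing import List, Tuple

Grid = List[List[float]]
# Hypothetical bucket layout: (row0, col0, row1, col1), half-open ranges.
Bucket = Tuple[int, int, int, int]


def bucket_sse(grid: Grid, b: Bucket) -> float:
    """Sum of squared errors of the cells in bucket b around the bucket mean."""
    r0, c0, r1, c1 = b
    vals = [grid[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals)


def split_quadrants(b: Bucket) -> List[Bucket]:
    """Split a bucket at its midpoints, dropping empty pieces."""
    r0, c0, r1, c1 = b
    rm, cm = (r0 + r1) // 2, (c0 + c1) // 2
    quads = [(r0, c0, rm, cm), (r0, cm, rm, c1),
             (rm, c0, r1, cm), (rm, cm, r1, c1)]
    return [q for q in quads if q[2] > q[0] and q[3] > q[1]]


def greedy_histogram(grid: Grid, max_buckets: int) -> List[Bucket]:
    """Greedy heuristic (an assumption, not the paper's method):
    repeatedly split the bucket with the largest SSE while the
    bucket budget allows it."""
    buckets: List[Bucket] = [(0, 0, len(grid), len(grid[0]))]
    while True:
        splittable = [b for b in buckets
                      if b[2] - b[0] > 1 or b[3] - b[1] > 1]
        if not splittable:
            break
        worst = max(splittable, key=lambda b: bucket_sse(grid, b))
        if bucket_sse(grid, worst) == 0:
            break  # every splittable bucket is already uniform
        children = split_quadrants(worst)
        if len(buckets) - 1 + len(children) > max_buckets:
            break  # splitting would exceed the budget
        buckets.remove(worst)
        buckets.extend(children)
    return buckets


def total_sse(grid: Grid, buckets: List[Bucket]) -> float:
    """Overall histogram error: the quantity a V-Optimal histogram minimizes."""
    return sum(bucket_sse(grid, b) for b in buckets)
```

For example, on a 4x4 array whose top-left 2x2 block holds the value 10 and every other cell holds 0, a single bucket has a nonzero error, while `greedy_histogram(grid, 4)` yields four uniform quadrants with zero total error. A greedy split does not in general reach the optimal partition for a given bucket budget, which is why dedicated construction algorithms are of interest.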