
For FullText PDF, please login, if you are a member of IEICE,
or go to Pay Per View on menu list, if you are a nonmember of IEICE.

Efficient Regular Path Query Evaluation by Splitting with UnitSubquery Cost Matrix
VanQuyet NGUYEN Kyungbaek KIM
Publication
IEICE TRANSACTIONS on Information and Systems
Vol.E100D
No.10
pp.26482652 Publication Date: 2017/10/01
Online ISSN: 17451361
DOI: 10.1587/transinf.2017EDL8060
Type of Manuscript: LETTER Category: Data Engineering, Web Information Systems Keyword: regular path queries, large graphs, graph querying,
Full Text: PDF(223.6KB) >>Buy this Article
Summary:
A widelyused query on a graph is a regular path query (RPQ) whose answer is a set of tuples of nodes connected by paths corresponding to a given regular expression. Traditionally, evaluating an RPQ on a large graph takes substantial memory spaces and long response time. Recently, several studies have focused on improving response time for evaluating an RPQ by splitting an original RPQ into smaller subqueries, evaluating them in parallel and combining partial answers. In these works, how to choose split labels in an RPQ is one of key points of the performance of RPQ evaluation, and rare labels of a graph can be used as split labels. However there is still a room for improvement, because a rare label cannot guarantee the minimum evaluation cost all the time. In this paper, we propose a novel approach of selecting split labels by estimating evaluation cost of each split subquery with a unitsubquery cost matrix (USCM), which can be obtained from a graph in prior to evaluate an RPQ. USCM presents the evaluation cost of a unitsubquery which is the smallest possible subquery, and we can estimate the evaluation cost of an RPQ by decomposing into a set of unitsubqueries. Experimental results show that our proposed approach outperforms rare label based approaches.

