Grouping Methods for Pattern Matching over Probabilistic Data Streams

Kento SUGIURA  Yoshiharu ISHIKAWA  Yuya SASAKI  

IEICE TRANSACTIONS on Information and Systems   Vol.E100-D   No.4   pp.718-729
Publication Date: 2017/04/01
Online ISSN: 1745-1361
Type of Manuscript: Special Section PAPER (Special Section on Data Engineering and Information Management)
Keywords: probabilistic data streams, complex event processing, pattern matching, grouping

As sensor and machine learning technologies have advanced, detecting patterns in probabilistic data streams has become increasingly important. In this paper, we focus on complex event processing based on pattern matching. When pattern matching is applied to probabilistic data streams, numerous matches may be detected in the same time interval because of the uncertainty of the data. Although existing methods distinguish between such matches, they may derive inappropriate results when some of the matches correspond to the real-world event that occurred during the time interval. Thus, we propose two grouping methods for matches. Our methods output groups that indicate the occurrence of complex events during given time intervals. We first define groups based on temporal overlap and propose two grouping algorithms that introduce the notions of complete overlap and single overlap. We then propose an efficient approach for calculating the occurrence probabilities of groups by using deterministic finite automata generated from the query patterns. Finally, we empirically evaluate the effectiveness of our methods by applying them to real and synthetic datasets.
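
The idea of computing an occurrence probability with a deterministic finite automaton can be illustrated with a minimal sketch. It is not the algorithm of the paper, which the abstract does not spell out. It assumes that each time step of the stream carries an independent probability distribution over symbols, and that the query pattern has already been compiled into a DFA. The function name `acceptance_probability`, the example pattern `a b+ c`, and the stream values are all hypothetical. The sketch propagates a probability distribution over DFA states instead of enumerating every possible world of the stream.

```python
from collections import defaultdict


def acceptance_probability(transitions, start, accepting, stream):
    """Probability that the DFA accepts the whole stream window.

    Assumption (not from the paper): symbols at different time steps
    are independent, each with its own distribution.
    """
    dist = {start: 1.0}  # probability mass currently sitting on each DFA state
    for symbol_probs in stream:
        nxt = defaultdict(float)
        for state, p_state in dist.items():
            for symbol, p_symbol in symbol_probs.items():
                target = transitions.get((state, symbol))
                if target is not None:  # a missing transition means rejection
                    nxt[target] += p_state * p_symbol
        dist = dict(nxt)
    return sum(p for state, p in dist.items() if state in accepting)


# Hypothetical DFA for the pattern "a b+ c": state 0 is the start, state 3 accepts.
TRANSITIONS = {
    (0, "a"): 1,
    (1, "b"): 2,
    (2, "b"): 2,
    (2, "c"): 3,
}

# Hypothetical probabilistic stream: one symbol distribution per time step.
STREAM = [
    {"a": 0.5, "b": 0.5},
    {"b": 1.0},
    {"b": 0.5, "c": 0.5},
    {"c": 1.0},
]

probability = acceptance_probability(TRANSITIONS, 0, {3}, STREAM)
```

Because the state distribution merges all possible worlds that reach the same DFA state, the cost grows with the number of states and time steps rather than with the number of possible symbol sequences.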