Discovery of Regular and Irregular Spatio-Temporal Patterns from Location-Based SNS by Diffusion-Type Estimation

Yoshitatsu MATSUDA  Kazunori YAMAGUCHI  Ken-ichiro NISHIOKA  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E98-D   No.9   pp.1675-1682
Publication Date: 2015/09/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2015EDP7095
Type of Manuscript: PAPER
Category: Artificial Intelligence, Data Mining
Keyword: data mining, SNS, spatio-temporal pattern, diffusion-type formula, Bayesian estimation, principal component analysis

Summary: 
In this paper, a new approach is proposed for extracting spatio-temporal patterns from a location-based social networking system (SNS) such as Foursquare. The approach consists of the following steps. First, the spatio-temporal behavior of SNS users is approximated as a probability distribution by using a diffusion-type formula. Because SNS datasets generally consist of sparse user check-ins at particular time points and locations, it is difficult to investigate spatio-temporal patterns over a wide range of time and space scales. The proposed method can estimate such wide-range patterns by smoothing the sparse data with the diffusion-type formula. It is crucial in this method to estimate the scale parameter robustly, which is done by placing a prior generative model on user check-ins. This robust estimation enables the method to extract appropriate patterns even in small local areas. Next, the covariance matrix among the time points is calculated from the estimated distribution. Then, the principal eigenfunctions are approximately extracted as the spatio-temporal patterns by principal component analysis (PCA). The distribution is a mixture of various patterns: some are regular, with a periodic cycle, and others are irregular, corresponding to transient events. Although such complicated mixtures are generally difficult to separate, experiments on an actual Foursquare dataset showed that the proposed method can extract many plausible and interesting spatio-temporal patterns.
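
The three steps named in the summary (diffusion-type smoothing, covariance among time points, PCA) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a 1-D spatial grid, a Gaussian kernel standing in for the diffusion-type formula, and a fixed, hand-chosen `scale`, whereas the paper estimates the scale parameter by Bayesian estimation. The function names `diffusion_smooth` and `principal_patterns`, and the way the covariance is formed (time points as variables, grid cells as samples), are assumptions made for illustration only.

```python
import numpy as np


def diffusion_smooth(checkins, grid, scale):
    """Smooth sparse check-ins into a density over (time, space).

    checkins: sequence with one entry per time bin, each a list of
              check-in positions (hypothetical 1-D coordinates).
    grid:     1-D array of spatial grid points.
    scale:    kernel width; fixed here, estimated in the paper.
    """
    grid = np.asarray(grid, dtype=float)
    density = np.zeros((len(checkins), grid.size))
    for t, points in enumerate(checkins):
        points = np.asarray(points, dtype=float)
        if points.size == 0:
            continue  # no check-ins in this time bin
        diff = grid[None, :] - points[:, None]
        # Gaussian kernel: the fundamental solution of the heat equation.
        density[t] = np.exp(-diff**2 / (2.0 * scale**2)).sum(axis=0)
    total = density.sum()
    if total > 0:
        density /= total  # normalize to a joint distribution
    return density


def principal_patterns(density, n_components):
    """PCA on the covariance matrix among time points.

    Returns (eigenvalues, temporal patterns, spatial patterns), with
    eigenvalues sorted in descending order.
    """
    # Treat each time point as a variable and each grid cell as a sample.
    centered = density - density.mean(axis=1, keepdims=True)
    cov = centered @ centered.T / density.shape[1]  # (n_times, n_times)
    eigvals, eigvecs = np.linalg.eigh(cov)          # ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    temporal = eigvecs[:, order]                    # (n_times, k)
    spatial = temporal.T @ centered                 # (k, n_grid)
    return eigvals[order], temporal, spatial


# Synthetic example: check-ins alternate between two places every time bin.
grid = np.linspace(0.0, 10.0, 51)
checkins = [[2.0] if t % 2 == 0 else [8.0] for t in range(6)]
density = diffusion_smooth(checkins, grid, scale=0.5)
eigvals, temporal, spatial = principal_patterns(density, n_components=2)
```

In this toy setting, a regular pattern would appear as a temporal component that oscillates with the alternation period, while a one-off burst of check-ins would load on a component concentrated at a single time bin.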