Large Displacement Dynamic Scene Segmentation through Multiscale Saliency Flow

Yinhui ZHANG  Zifen HE  

IEICE TRANSACTIONS on Information and Systems   Vol.E99-D   No.7   pp.1871-1876
Publication Date: 2016/07/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2015EDP7482
Type of Manuscript: PAPER
Category: Pattern Recognition
Keyword: video segmentation, temporal propagation, saliency map, multiscale saliency flow

Most unsupervised video segmentation algorithms have difficulty extracting objects from dynamic real-world scenes with large displacements, because the foreground hypothesis is often initialized with no explicit mutual constraint on top-down spatio-temporal coherency, even though such a constraint may be imposed on the segmentation objective. To handle such situations, we propose a multiscale saliency flow (MSF) model that jointly learns foreground and background features from multiscale salient evidence, allowing temporally coherent top-down information in one frame to be propagated throughout the remaining frames. In particular, the top-down evidence is detected by combining saliency signatures over a range of higher scales of approximation coefficients in the wavelet domain. Saliency flow is then estimated by Gaussian kernel correlation of the non-maximum-suppressed multiscale evidence, which is characterized by HOG descriptors in a high-dimensional feature space. We build the proposed MSF model in accordance with the primary object hypothesis, jointly integrating temporal consistency constraints on saliency maps estimated at multiple scales into the objective. We demonstrate the effectiveness of the proposed multiscale saliency flow for segmenting dynamic real-world scenes with large displacements caused by uniform sampling of video sequences.
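
The flow-estimation step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each salient evidence is already reduced to a (position, descriptor) pair, uses short toy vectors in place of real HOG descriptors, and matches evidence between two frames by the highest Gaussian kernel response. The names `gaussian_kernel` and `estimate_flow` and the bandwidth `sigma` are hypothetical, chosen for illustration only.

```python
import math


def gaussian_kernel(u, v, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length descriptor vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def estimate_flow(evidence_a, evidence_b, sigma=1.0):
    """Match each evidence in frame A to its best-correlated evidence in frame B.

    Each evidence is ((x, y), descriptor). Returns a list of
    (position_in_A, displacement, kernel_score) tuples.
    Assumption: a greedy nearest match stands in for the paper's flow estimate.
    """
    flows = []
    for (xa, ya), desc_a in evidence_a:
        best_score, best_pos = -1.0, None
        for (xb, yb), desc_b in evidence_b:
            score = gaussian_kernel(desc_a, desc_b, sigma)
            if score > best_score:
                best_score, best_pos = score, (xb, yb)
        if best_pos is not None:
            displacement = (best_pos[0] - xa, best_pos[1] - ya)
            flows.append(((xa, ya), displacement, best_score))
    return flows


# Toy evidence: 4-bin vectors standing in for HOG descriptors (illustrative values).
frame_a = [((10, 12), [0.9, 0.1, 0.0, 0.2]), ((40, 8), [0.1, 0.8, 0.3, 0.0])]
frame_b = [((70, 15), [0.1, 0.8, 0.3, 0.1]), ((55, 30), [0.9, 0.1, 0.1, 0.2])]

flows = estimate_flow(frame_a, frame_b, sigma=0.5)
for position, displacement, score in flows:
    print(position, displacement, round(score, 4))
```

Because matching depends on descriptor similarity rather than spatial proximity, the displacement between matched evidence can be arbitrarily large, which is the property that makes this kind of correspondence attractive for uniformly subsampled video.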