A Video Salient Region Detection Framework Using Spatiotemporal Consistency Optimization

Yunfei ZHENG, Xiongwei ZHANG, Lei BAO, Tieyong CAO, Yonggang HU, Meng SUN

Publication
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences, Vol. E100-A, No. 2, pp. 688-701
Publication Date: 2017/02/01
Online ISSN: 1745-1337
Type of Manuscript: PAPER
Category: Image
Keyword: spatiotemporal consistency optimization, spatial saliency, temporal saliency, video salient region detection



Summary: 
Labeling salient regions accurately in video with a cluttered background and complex motion remains a challenging task. Most existing video salient region detection models extract only stimulus-driven saliency features, so they are easily affected by cluttered backgrounds and complex motion, which can lead to incomplete or incorrect detection results. In this paper, we propose a video salient region detection framework that fuses stimulus-driven saliency features with a spatiotemporal consistency cue to improve detection performance under these complex conditions. On one hand, stimulus-driven spatial and temporal saliency features are extracted to derive the initial spatial and temporal salient region maps. On the other hand, to exploit the spatiotemporal consistency cue, an effective spatiotemporal consistency optimization model is presented; this model is used to optimize the initial spatial and temporal salient region maps, yielding the superpixel-level spatiotemporal salient region map. Finally, the pixel-level spatiotemporal salient region map is derived by solving a self-defined energy model. Experimental results on challenging video datasets demonstrate that the proposed framework outperforms state-of-the-art methods.
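
The summary outlines a multi-stage pipeline: initial spatial and temporal saliency maps, fusion via spatiotemporal consistency optimization, and pixel-level refinement. The sketch below is only a minimal conceptual illustration of such a pipeline, not the authors' method: the functions spatial_saliency, temporal_saliency, and consistency_optimize, along with the simple contrast, frame-difference, and weighted-fusion choices, are hypothetical stand-ins for the paper's feature extraction, superpixel-level optimization, and energy-based refinement stages.

```python
# Conceptual sketch of a per-frame salient region detection pipeline.
# All functions and parameters here are illustrative assumptions, not the
# paper's actual models.
import numpy as np

def spatial_saliency(frame):
    # Stand-in for a stimulus-driven spatial cue: global intensity contrast.
    gray = frame.mean(axis=2)
    return np.abs(gray - gray.mean()) / 255.0

def temporal_saliency(prev_frame, frame):
    # Stand-in for a stimulus-driven temporal cue: frame differencing.
    diff = np.abs(frame.astype(float) - prev_frame.astype(float)).mean(axis=2)
    return diff / (diff.max() + 1e-8)

def consistency_optimize(spatial_map, temporal_map, alpha=0.5):
    # Stand-in for spatiotemporal consistency optimization: here simply a
    # convex combination of the two cues followed by normalization.
    fused = alpha * spatial_map + (1.0 - alpha) * temporal_map
    return (fused - fused.min()) / (fused.max() - fused.min() + 1e-8)

def detect_salient_regions(frames):
    """Return one saliency map per frame pair (frames: list of HxWx3 uint8 arrays)."""
    maps = []
    for t in range(1, len(frames)):
        s = spatial_saliency(frames[t])
        m = temporal_saliency(frames[t - 1], frames[t])
        maps.append(consistency_optimize(s, m))
    return maps

if __name__ == "__main__":
    # Toy usage with random frames in place of a real video.
    video = [np.random.randint(0, 256, (120, 160, 3), dtype=np.uint8) for _ in range(5)]
    saliency = detect_salient_regions(video)
    print(len(saliency), saliency[0].shape)
```

In the actual framework, the fusion step operates at the superpixel level and the final map is refined to pixel resolution by minimizing an energy model; the weighted average above merely indicates where those components would sit in the data flow.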