TY - GEN
T1 - Learning Dynamic GMM for Attention Distribution on Single-Face Videos
AU - Ren, Yun
AU - Wang, Zulin
AU - Xu, Mai
AU - Dong, Haoyu
AU - Li, Shengxi
N1 - Publisher Copyright:
© 2017 IEEE.
PY - 2017/8/22
Y1 - 2017/8/22
N2 - The past decade has witnessed the popularity of video conferencing, such as FaceTime and Skype. In video conferencing, almost every frame contains a human face. Hence, it is necessary to predict attention on face videos via saliency detection, as saliency can serve as guidance for region-of-interest (ROI) in content-based applications. To this end, this paper proposes a novel approach for saliency detection in single-face videos. From the data-driven perspective, we first establish an eye-tracking database that contains fixations on 70 single-face videos viewed by 40 subjects. Through analysis of our database, we find that most attention is attracted by the face in videos, and that the attention distribution within a face varies with face size and mouth movement. Inspired by previous work that applies the Gaussian mixture model (GMM) to face saliency detection in still images, we propose to model visual attention on the face region in videos by a dynamic GMM (DGMM), whose variation relies on face size, mouth movement, and facial landmarks. Then, we develop a long short-term memory (LSTM) neural network to estimate the DGMM for saliency detection in single-face videos, termed LSTM-DGMM. Finally, the experimental results show that our approach outperforms other state-of-the-art approaches in saliency detection of single-face videos.
AB - The past decade has witnessed the popularity of video conferencing, such as FaceTime and Skype. In video conferencing, almost every frame contains a human face. Hence, it is necessary to predict attention on face videos via saliency detection, as saliency can serve as guidance for region-of-interest (ROI) in content-based applications. To this end, this paper proposes a novel approach for saliency detection in single-face videos. From the data-driven perspective, we first establish an eye-tracking database that contains fixations on 70 single-face videos viewed by 40 subjects. Through analysis of our database, we find that most attention is attracted by the face in videos, and that the attention distribution within a face varies with face size and mouth movement. Inspired by previous work that applies the Gaussian mixture model (GMM) to face saliency detection in still images, we propose to model visual attention on the face region in videos by a dynamic GMM (DGMM), whose variation relies on face size, mouth movement, and facial landmarks. Then, we develop a long short-term memory (LSTM) neural network to estimate the DGMM for saliency detection in single-face videos, termed LSTM-DGMM. Finally, the experimental results show that our approach outperforms other state-of-the-art approaches in saliency detection of single-face videos.
UR - https://www.scopus.com/pages/publications/85030224678
U2 - 10.1109/CVPRW.2017.208
DO - 10.1109/CVPRW.2017.208
M3 - Conference contribution
AN - SCOPUS:85030224678
T3 - IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
SP - 1632
EP - 1641
BT - Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2017
PB - IEEE Computer Society
T2 - 30th IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2017
Y2 - 21 July 2017 through 26 July 2017
ER -