TY - GEN
T1 - Saliency Prediction of Traffic Surveillance Videos
T2 - 16th International Conference on Wireless Communications and Signal Processing, WCSP 2024
AU - Duan, Weichen
AU - Qiao, Minglang
AU - Jiang, Lai
AU - Xu, Mai
AU - Deng, Xin
AU - Wen, Shijie
AU - Li, Fei
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
AB - Traffic surveillance videos are crucial for the development of intelligent transportation systems. The huge number of these videos has introduced significant challenges for storage and transmission, etc. Therefore, efficient and accurate video saliency prediction (VSP) can benefit a wide range of video processing techniques for traffic scenes, such as video compression, smart navigation, and traffic event detection. However, currently there are no VSP approaches or eye-tracking datasets dedicated to traffic surveillance videos. In this paper, we establish a large-scale eye-tracking dataset, dubbed traffic surveillance videos 1K (TSV1K). TSV1K contains 1000 high-quality traffic surveillance videos, with eye-tracking annotations from 30 subjects. Based on our dataset, we conduct thorough analysis on the correlations between human attention and traffic scenes, e.g., vehicle distribution and scene complexity. Accordingly, we propose a multi-task traffic saliency prediction network (MTTS-Net), which leverages the task of traffic salient object detection (TSOD) to promote the performance of the VSP task. In order to better learn these tasks, a two-stage training strategy is developed to progressively train the MTTS-Net. Experimental results demonstrate that our proposed approach outperforms the state-of-the-art approaches in both tasks of TSOD and VSP on traffic surveillance videos. Our dataset and code are available on https://github.com/giteec/TSV1K.
KW - Multi-task learning
KW - Saliency prediction
KW - Traffic surveillance video
UR - https://www.scopus.com/pages/publications/85217546581
U2 - 10.1109/WCSP62071.2024.10826669
DO - 10.1109/WCSP62071.2024.10826669
M3 - Conference contribution
AN - SCOPUS:85217546581
T3 - 16th International Conference on Wireless Communications and Signal Processing, WCSP 2024
SP - 1355
EP - 1360
BT - 16th International Conference on Wireless Communications and Signal Processing, WCSP 2024
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 24 October 2024 through 26 October 2024
ER -