
Progressive Self-supervised Spatio-temporal Feature Learning Based on Video Sequence Saliency

  • Jinlong Kang
  • , Tao Xu
  • , Boting Qu
  • , Xiang Wang
  • , Xiaoli Lian
  • , Jing Guo
  • , Yuan Gao*
  • *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

We observe that videos have different levels of frame/clip sequence saliency, and that exploiting this saliency effectively benefits spatio-temporal feature learning. We therefore propose a new concept, video sequence saliency (VSS), which measures how difficult it is for a model to identify the correct frame/clip order of a video. Building on this, we develop a novel method named progressive self-supervised spatio-temporal feature learning based on VSS (PSSFL-VSS). For the pretext task of clip order prediction, videos are fed into the networks in descending order of VSS value, rather than randomly as in traditional methods. In addition, we update each video's VSS value based on its clip order prediction results. The effectiveness of our pre-trained models is verified on the downstream tasks of clip/video retrieval and action recognition, and experimental results show that our method achieves clear improvements over state-of-the-art methods.
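The scheduling idea in the abstract can be illustrated with a minimal sketch: keep a VSS score per video, present videos in descending VSS order each epoch, and nudge a video's score after each clip order prediction. All names here (`schedule_by_vss`, `update_vss`, the score values and update rule) are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of curriculum-style scheduling by video sequence
# saliency (VSS); the update rule and step size are illustrative only.

def schedule_by_vss(vss_scores):
    """Return video ids in descending VSS order (highest saliency first)."""
    return sorted(vss_scores, key=vss_scores.get, reverse=True)

def update_vss(vss_scores, video_id, order_correct, step=0.1):
    """Nudge a video's VSS up or down based on clip order prediction success."""
    delta = step if order_correct else -step
    vss_scores[video_id] += delta
    return vss_scores

# Toy usage: three videos with assumed initial VSS values.
vss = {"v1": 0.8, "v2": 0.3, "v3": 0.5}
order = schedule_by_vss(vss)                      # ["v1", "v3", "v2"]
vss = update_vss(vss, "v2", order_correct=True)   # v2 rises to 0.4
```

The key design point mirrored here is that the input order is a function of per-video difficulty scores that evolve with the model's own predictions, rather than a fixed random shuffle.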

Original language: English
Title of host publication: Eighth International Conference on Video and Image Processing, ICVIP 2024
Editors: Xuefeng Liang
Publisher: SPIE
ISBN (electronic): 9781510689237
DOI
Publication status: Published - 2025
Event: 8th International Conference on Video and Image Processing, ICVIP 2024 - Kuala Lumpur, Malaysia
Duration: 13 Dec 2024 → 15 Dec 2024

Publication series

Name: Proceedings of SPIE - The International Society for Optical Engineering
Volume: 13558
ISSN (print): 0277-786X
ISSN (electronic): 1996-756X

Conference

Conference: 8th International Conference on Video and Image Processing, ICVIP 2024
Country/Territory: Malaysia
City: Kuala Lumpur
Period: 13/12/24 → 15/12/24
