
RecapNet: Action Proposal Generation Mimicking Human Cognitive Process

  • Tian Wang*
  • Yang Chen
  • Zhiwei Lin
  • Aichun Zhu
  • Yong Li
  • Hichem Snoussi
  • Hui Wang

*Corresponding author for this work

Research output: Contribution to journal (article, peer-reviewed)

Abstract

Generating action proposals in untrimmed videos is a challenging task, since video sequences usually contain a lot of irrelevant content and the duration of an action instance is arbitrary. The quality of action proposals is key to action detection performance. Previous methods mainly rely on sliding windows or anchor boxes to cover all ground-truth actions, which is infeasible and computationally inefficient. To address this, this article proposes RecapNet, a novel framework for generating action proposals by mimicking the human cognitive process of understanding video content. Specifically, RecapNet includes a residual causal convolution module that builds a short memory of past events, on top of which a joint probability actionness density ranking mechanism is designed to retrieve the action proposals. RecapNet can handle videos of arbitrary length and, more importantly, a video sequence needs to be processed in only a single pass to generate all action proposals. Experiments show that the proposed RecapNet outperforms the state of the art under all metrics on the benchmark THUMOS14 and ActivityNet-1.3 datasets. The code is publicly available at https://github.com/tianwangbuaa/RecapNet.
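As a rough illustration of the "residual causal convolution" idea named in the abstract, the sketch below shows a causal 1-D convolution with a skip connection in plain Python. It is not the authors' architecture: the single-channel input, the fixed kernel, the ReLU nonlinearity, and the function names `causal_conv1d` and `residual_causal_block` are all assumptions made for illustration. The point is only that each output step depends on the current and earlier inputs, never on future ones, which is what allows a sequence to be processed in one forward pass.

```python
from typing import List


def causal_conv1d(x: List[float], kernel: List[float]) -> List[float]:
    """Causal 1-D convolution: y[t] = sum_j kernel[j] * x[t - j].

    Inputs before the start of the sequence are treated as zero,
    so y[t] never depends on x[t + 1], x[t + 2], ...
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            if t - j >= 0:
                acc += w * x[t - j]
        out.append(acc)
    return out


def residual_causal_block(x: List[float], kernel: List[float]) -> List[float]:
    """Hypothetical residual block: x + ReLU(causal_conv1d(x)).

    The ReLU is an assumption; the abstract does not specify the nonlinearity.
    """
    conv = causal_conv1d(x, kernel)
    return [xi + max(0.0, ci) for xi, ci in zip(x, conv)]


if __name__ == "__main__":
    features = [1.0, 2.0, 3.0]  # toy per-frame feature values
    print(residual_causal_block(features, [0.5, 0.5]))
```

Changing a later element of `features` leaves all earlier outputs unchanged, which is the causality property the module relies on to summarize past events.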

Original language: English
Pages (from-to): 6017-6028
Number of pages: 12
Journal: IEEE Transactions on Cybernetics
Volume: 51
Issue number: 12
State: Published - 1 Dec 2021

Keywords

  • Action detection
  • action proposal
  • residual causal convolution
