PCG-TAL: Progressive Cross-Granularity Cooperation for Temporal Action Localization

  • Rui Su
  • Dong Xu*
  • Lu Sheng
  • Wanli Ouyang
  • *Corresponding author for this work
  • The University of Sydney

Research output: Contribution to journal, Article, peer-review

Abstract

There are two major lines of work in the field of temporal action localization: anchor-based and frame-based approaches. However, each line of work is inherently limited to a certain detection granularity and cannot simultaneously achieve high recall rates and accurate action boundaries. In this work, we propose a progressive cross-granularity cooperation (PCG-TAL) framework to effectively take advantage of the complementarity between the anchor-based and frame-based paradigms, as well as between two-view clues (i.e., appearance and motion). Specifically, our new Anchor-Frame Cooperation (AFC) module can effectively integrate both two-granularity and two-stream knowledge at the feature and proposal levels, both within each AFC module and across adjacent AFC modules. The RGB-stream AFC module and the flow-stream AFC module are stacked sequentially to form a progressive localization framework. The whole framework can be learned in an end-to-end fashion, while the temporal action localization performance is gradually boosted in a progressive manner. Our newly proposed framework outperforms the state-of-the-art methods on three benchmark datasets (THUMOS14, ActivityNet v1.3, and UCF-101-24), which clearly demonstrates the effectiveness of our framework.
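The stacking order described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function name `progressive_pooling`, the `(start_sec, end_sec)` segment representation, the choice to start with the RGB stream, and the plain set union used as a stand-in for proposal-level cooperation are all assumptions made for illustration only. Feature-level cooperation and end-to-end learning are not modeled.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical proposal representation: (start_sec, end_sec).
Segment = Tuple[float, float]


def progressive_pooling(
    anchor_props: Dict[str, List[Segment]],
    frame_props: Dict[str, List[Segment]],
    num_stages: int,
) -> Tuple[List[str], Set[Segment]]:
    """Alternate RGB and flow stages, pooling proposals across stages.

    Assumption: stage 0 uses the RGB stream and stages alternate after that.
    Assumption: "cooperation" is reduced to a set union of the anchor-based
    and frame-based proposals of the current stream with everything pooled
    by earlier stages.
    """
    order: List[str] = []
    pooled: Set[Segment] = set()
    for stage in range(num_stages):
        stream = "rgb" if stage % 2 == 0 else "flow"
        order.append(stream)
        # Merge both granularities of the current stream into the running pool.
        pooled |= set(anchor_props[stream]) | set(frame_props[stream])
    return order, pooled


# Made-up toy proposals for a single video, purely for demonstration.
anchor = {"rgb": [(1.0, 4.0)], "flow": [(1.2, 4.1)]}
frame = {"rgb": [(0.9, 3.8)], "flow": [(1.0, 4.0)]}
order, pooled = progressive_pooling(anchor, frame, num_stages=3)
```

In this toy run the flow-stream frame-based proposal duplicates an RGB-stream anchor-based one, so the pool holds three distinct segments after three stages.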

Original language: English
Article number: 9298475
Pages (from-to): 2103-2113
Number of pages: 11
Journal: IEEE Transactions on Image Processing
Volume: 30
DOIs
State: Published - 2021

Keywords

  • cross-granularity cooperation
  • cross-stream cooperation
  • Temporal action localization
