Abstract
Temporal action localization has two major lines of work: anchor-based and frame-based approaches. However, each is inherently limited to a certain detection granularity and cannot simultaneously achieve high recall rates and accurate action boundaries. In this work, we propose a progressive cross-granularity cooperation (PCG-TAL) framework that exploits the complementarity between the anchor-based and frame-based paradigms, as well as between two-view clues (i.e., appearance and motion). Specifically, our new Anchor-Frame Cooperation (AFC) module integrates both two-granularity and two-stream knowledge at the feature and proposal levels, both within each AFC module and across adjacent AFC modules. The RGB-stream AFC module and the flow-stream AFC module are stacked sequentially to form a progressive localization framework. The whole framework can be learned in an end-to-end fashion, while the temporal action localization performance is boosted progressively. Our framework outperforms state-of-the-art methods on three benchmark datasets, THUMOS14, ActivityNet v1.3 and UCF-101-24, which demonstrates its effectiveness.
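To make the granularity trade-off concrete, the following is a minimal toy sketch, not the paper's AFC module. It assumes a coarse anchor segment (good recall, loose boundaries) and a list of per-frame actionness scores (fine boundaries), and snaps the anchor to the frames that score above a threshold. The names `temporal_iou` and `refine_anchor`, the `threshold` and `margin` parameters, and the score values are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # [start, end) in frame indices


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection-over-union of two [start, end) frame intervals."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def refine_anchor(anchor: Segment, frame_scores: List[float],
                  threshold: float = 0.5, margin: int = 2) -> Segment:
    """Snap a coarse anchor segment to frame-level actionness (toy heuristic).

    Looks at frames within `margin` of the anchor and returns the tightest
    span covering those whose score is >= threshold. Falls back to the
    original anchor when no frame passes the threshold.
    """
    lo = max(0, anchor[0] - margin)
    hi = min(len(frame_scores), anchor[1] + margin)
    active = [t for t in range(lo, hi) if frame_scores[t] >= threshold]
    if not active:
        return anchor
    return (active[0], active[-1] + 1)


# Hypothetical per-frame scores; the action is assumed to span frames 3..7.
scores = [0.1, 0.1, 0.2, 0.9, 0.8, 0.9, 0.7, 0.9, 0.2, 0.1, 0.1, 0.0]
ground_truth = (3, 8)
anchor = (2, 10)  # coarse anchor-based proposal

refined = refine_anchor(anchor, scores)
print(temporal_iou(anchor, ground_truth))   # overlap before refinement
print(temporal_iou(refined, ground_truth))  # overlap after refinement
```

In this toy setting the anchor supplies the candidate region and the frame-level scores tighten its boundaries. That is only the general intuition behind combining the two granularities; the learned feature-level and proposal-level cooperation described in the abstract is considerably richer.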
| Original language | English |
|---|---|
| Article number | 9298475 |
| Pages (from-to) | 2103-2113 |
| Number of pages | 11 |
| Journal | IEEE Transactions on Image Processing |
| Volume | 30 |
| State | Published - 2021 |
Keywords
- cross-granularity cooperation
- cross-stream cooperation
- temporal action localization
Fingerprint
Dive into the research topics of 'PCG-TAL: Progressive Cross-Granularity Cooperation for Temporal Action Localization'. Together they form a unique fingerprint.