TY - JOUR
T1 - A Malleable Boundary Network for temporal action detection
AU - Wang, Tian
AU - Hou, Boyao
AU - Li, Zexian
AU - Li, Zhe
AU - Huang, Lei
AU - Zhang, Baochang
AU - Snoussi, Hichem
N1 - Publisher Copyright: © 2022 Elsevier Ltd
PY - 2022/10
Y1 - 2022/10
AB - Temporal action detection in untrimmed videos is a challenging task that aims to predict the boundaries and categories of action instances, with potential applications in transportation. In this study, we propose a two-stage framework, the Malleable Boundary Network (MB-Net), which adaptively regresses proposals based on finer scores. MB-Net consists of a Potential Boundary Generator in the first stage and an Adaptive Proposal Detector in the second stage. First, the Potential Boundary Generator fuses multiple sets of flexible score sequences to obtain tentative proposals from frame-level features in an anchor-free manner. Then, the Adaptive Proposal Detector employs parallel modules to filter, classify, and regress the proposals adaptively. In addition, we propose an easy-to-implement feature augmentation method, Structured Temporal Segment Pooling, which makes full use of information from across the whole proposal. Experiments show that MB-Net achieves state-of-the-art performance on the popular benchmarks THUMOS-14 and ActivityNet-1.3, with improvements of 1.9% and 1.2%, respectively.
KW - MB-Net
KW - Temporal action detection
KW - Temporal action proposal generation
KW - Untrimmed video
UR - https://www.scopus.com/pages/publications/85136478651
U2 - 10.1016/j.compeleceng.2022.108250
DO - 10.1016/j.compeleceng.2022.108250
M3 - Article
AN - SCOPUS:85136478651
SN - 0045-7906
VL - 103
JO - Computers and Electrical Engineering
JF - Computers and Electrical Engineering
M1 - 108250
ER -