TY - JOUR
T1 - Learning to Segment Video Object with Accurate Boundaries
AU - Cheng, Jingchun
AU - Yuan, Yuhui
AU - Li, Yali
AU - Wang, Jingdong
AU - Wang, Shengjin
N1 - Publisher Copyright: © 1999-2012 IEEE.
PY - 2021
Y1 - 2021
AB - Video object segmentation has attracted considerable research interest in recent years. Top-performing video object segmentation methods mainly rely on fully convolutional neural networks that are trained specifically to predict high-performance masks, which leaves boundary details imprecise. This paper tackles the problem of predicting segmentation masks in videos that are both mask-accurate and boundary-precise. To solve this problem, we propose a simple and efficient network structure: the Mask-boundAry-Consistent Network (MAC-Net). The MAC-Net is an end-to-end fully convolutional network in which mask and boundaries are jointly optimized during training, enabling it to predict masks with accurate boundaries. An inner-net boundary-computing module is incorporated in the MAC-Net to produce boundaries that are naturally consistent with the mask. We analyze the influence of parameter settings and network constructions of the MAC-Net, and compare it with state-of-the-art algorithms on three widely adopted datasets. Experimental results show that the MAC-Net achieves state-of-the-art performance, demonstrating the effectiveness of its mask-boundary-consistent network structure. We also suggest that the boundary module in MAC-Net is highly compatible and can be easily adapted to other segmentation-related techniques.
KW - Mask-boundAry-Consistent network
KW - convolutional neural networks
KW - joint learning
KW - video object segmentation
UR - https://www.scopus.com/pages/publications/85090448357
U2 - 10.1109/TMM.2020.3020698
DO - 10.1109/TMM.2020.3020698
M3 - Article
AN - SCOPUS:85090448357
SN - 1520-9210
VL - 23
SP - 3112
EP - 3123
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
ER -