TY - GEN
T1 - Learnt Mutual Feature Compression for Machine Vision
AU - Liu, Tie
AU - Xu, Mai
AU - Li, Shengxi
AU - Chen, Chaoran
AU - Yang, Li
AU - Lv, Zhuoyi
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Image coding for machines (ICM) plays an increasingly important role in facilitating intelligent vision tasks. However, existing ICM methods compress the features at each scale separately, neglecting the redundancy across multi-scale features. To address this issue, this paper proposes an end-to-end mutual compression framework for ICM that significantly improves compression efficiency by removing cross-scale redundancy. Specifically, the proposed framework consists of a mutual feature compression network (MFCNet) and a basic feature compression network (BFCNet). The MFCNet predicts large-scale features from the basic small-scale features, saving the large bitrate that would otherwise be spent on compressing the large-scale features. The BFCNet compresses the small-scale features at high quality by removing spatial and channel-wise redundancy, which ensures superior performance at an extremely low bitrate. Experimental results show that the method achieves BD-rate savings of 90.10% and 74.97% against the VVC feature anchor and the VVC image anchor, respectively, both of which were recently accepted by the Moving Picture Experts Group (MPEG).
KW - Learnt feature compression
KW - machine vision
KW - mutual redundancy
UR - https://www.scopus.com/pages/publications/85177565634
U2 - 10.1109/ICASSP49357.2023.10094830
DO - 10.1109/ICASSP49357.2023.10094830
M3 - Conference contribution
AN - SCOPUS:85177565634
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
BT - ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023
Y2 - 4 June 2023 through 10 June 2023
ER -