TY - JOUR
T1 - Exploring the Simplification Limit of Deep Network Features With Subway Positioning Task
AU - Song, Jiajie
AU - Song, Ningfang
AU - Cheng, Jingchun
AU - Liu, Xiaoxin
AU - Pan, Xiong
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2025
Y1 - 2025
AB - This paper addresses vision-based subway positioning, a significant yet challenging task because of the low-light, sparse-texture conditions in tunnels. Traditional features struggle to establish temporal correspondence under these conditions. Deep network features are effective, but their computational and storage demands make them unsuitable for on-board systems. We propose a simple-structured feature extractor, trained via a student-teacher distillation framework, that inherits the pattern mining and abstraction capabilities of deep networks. Our goal is to simplify deep network features for fixed-route applications such as subway positioning and to develop an efficient on-board feature extractor for practical use. Specifically, we design a single-layer convolution operator as the student network. Through discriminability-augmented distillation, we compress the feature extraction capabilities of the state-of-the-art SiLK into this compact model, balancing descriptive power against computational efficiency. Our method achieves a model size of 2 KB and a processing speed of 1453 FPS while maintaining homography estimation accuracy comparable to that of deep network features. Extensive experiments on the vision-based subway positioning dataset show that our method offers superior speed and deployability without losing accuracy.
KW - Feature extraction
KW - convolution operator
KW - knowledge distillation
KW - subway environments
KW - visual odometer
UR - https://www.scopus.com/pages/publications/105002489995
U2 - 10.1109/LRA.2025.3555790
DO - 10.1109/LRA.2025.3555790
M3 - Article
AN - SCOPUS:105002489995
SN - 2377-3766
VL - 10
SP - 4922
EP - 4929
JO - IEEE Robotics and Automation Letters
JF - IEEE Robotics and Automation Letters
IS - 5
ER -