TY - GEN
T1 - HFDNet
T2 - 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2025
AU - Xie, Hongbo
AU - Zhao, Qi
AU - Liu, Binghao
AU - Wang, Chunlei
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Most underwater operations are conducted in deep water, where illumination is usually insufficient. Under such conditions, the local texture features of different objects appear highly similar in images, making inter-class boundaries difficult to distinguish; semantic segmentation models developed for terrestrial images therefore typically perform poorly in underwater scenes. Exploiting the general observation that high-frequency regions are more likely to correspond to semantic segmentation boundaries, we introduce the High-Frequency Divergence Attention Network (HFDNet), a transformer-based semantic segmentation model. HFDNet analyzes the feature map in the frequency domain to extract its frequency distribution, and then computes the relative spectral magnitude of each component by comparing its amplitude against the average amplitude within its local neighborhood in the frequency domain. The resulting local frequency map is incorporated into the attention matrix as a weighting factor, diverging attention toward the surrounding areas and strengthening attention to high-frequency regions. This operation enhances the model's focus on object boundary regions and on the local-neighborhood categories of each component. Our model thus alleviates the difficulty of determining object boundaries caused by insufficient light in underwater image segmentation, and improves the segmentation of objects with similar local features under low-light conditions. We conduct comprehensive experiments on three underwater segmentation datasets: Caveseg, SUIM, and UWS. The results show that HFDNet achieves state-of-the-art (SOTA) performance on the test sets. The source code is available at https://github.com/cv516Buaa/HongboXie/tree/main/HFDNet.
AB - Most underwater operations are conducted in deep water, where illumination is usually insufficient. Under such conditions, the local texture features of different objects appear highly similar in images, making inter-class boundaries difficult to distinguish; semantic segmentation models developed for terrestrial images therefore typically perform poorly in underwater scenes. Exploiting the general observation that high-frequency regions are more likely to correspond to semantic segmentation boundaries, we introduce the High-Frequency Divergence Attention Network (HFDNet), a transformer-based semantic segmentation model. HFDNet analyzes the feature map in the frequency domain to extract its frequency distribution, and then computes the relative spectral magnitude of each component by comparing its amplitude against the average amplitude within its local neighborhood in the frequency domain. The resulting local frequency map is incorporated into the attention matrix as a weighting factor, diverging attention toward the surrounding areas and strengthening attention to high-frequency regions. This operation enhances the model's focus on object boundary regions and on the local-neighborhood categories of each component. Our model thus alleviates the difficulty of determining object boundaries caused by insufficient light in underwater image segmentation, and improves the segmentation of objects with similar local features under low-light conditions. We conduct comprehensive experiments on three underwater segmentation datasets: Caveseg, SUIM, and UWS. The results show that HFDNet achieves state-of-the-art (SOTA) performance on the test sets. The source code is available at https://github.com/cv516Buaa/HongboXie/tree/main/HFDNet.
UR - https://www.scopus.com/pages/publications/105029957117
U2 - 10.1109/IROS60139.2025.11247710
DO - 10.1109/IROS60139.2025.11247710
M3 - Conference contribution
AN - SCOPUS:105029957117
T3 - IEEE International Conference on Intelligent Robots and Systems
SP - 11524
EP - 11530
BT - IROS 2025 - 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems, Conference Proceedings
A2 - Laugier, Christian
A2 - Renzaglia, Alessandro
A2 - Atanasov, Nikolay
A2 - Birchfield, Stan
A2 - Cielniak, Grzegorz
A2 - De Mattos, Leonardo
A2 - Fiorini, Laura
A2 - Giguere, Philippe
A2 - Hashimoto, Kenji
A2 - Ibanez-Guzman, Javier
A2 - Kamegawa, Tetsushi
A2 - Lee, Jinoh
A2 - Loianno, Giuseppe
A2 - Luck, Kevin
A2 - Maruyama, Hisataka
A2 - Martinet, Philippe
A2 - Moradi, Hadi
A2 - Nunes, Urbano
A2 - Pettre, Julien
A2 - Pretto, Alberto
A2 - Ranzani, Tommaso
A2 - Ronnau, Arne
A2 - Rossi, Silvia
A2 - Rouse, Elliott
A2 - Ruggiero, Fabio
A2 - Simonin, Olivier
A2 - Wang, Danwei
A2 - Yang, Ming
A2 - Yoshida, Eiichi
A2 - Zhao, Huijing
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 19 October 2025 through 25 October 2025
ER -