TY - JOUR
T1 - LESFuse
T2 - A lightweight dual-domain collaborative framework for high-fidelity visible-infrared image fusion
AU - Xu, Haohao
AU - Ran, Guangsheng
AU - Cai, Yueri
AU - Wang, Yi
AU - Bi, Shusheng
N1 - Publisher Copyright: © 2026 Elsevier B.V.
PY - 2026/5
Y1 - 2026/5
N2 - Visible-infrared image fusion is crucial for robust perception in challenging environments, yet the inherent modality gap often leads to structural distortion and detail loss. To address this, we propose LESFuse, a novel lightweight fusion paradigm that establishes a dual-domain collaborative mechanism. Our approach introduces an intensity-structure interaction model to enforce spatial consistency and a learning-based frequency decomposition strategy to disentangle and enhance multi-scale features. Extensive experiments on three public datasets (MSRS, RoadScene, and TNO) against seven state-of-the-art methods demonstrate that LESFuse achieves superior performance. Quantitatively, our method attains the highest scores across multiple metrics, achieving a Mutual Information (MI) score of 3.71 on the MSRS dataset, significantly outperforming the second-best method. In downstream object detection tasks, LESFuse yields the highest mean Average Precision (mAP@0.5) of 83.16%. Furthermore, the framework maintains exceptionally low computational costs, requiring only 0.03M parameters and 1.85 GFLOPs, with an average inference time of 0.11 s on a CPU. These results confirm LESFuse's effectiveness and real-time capability for deployment on resource-constrained platforms.
KW - Frequency domain
KW - Image fusion
KW - Infrared and visible image
KW - Lightweight network
KW - Real-time image fusion
UR - https://www.scopus.com/pages/publications/105030202207
U2 - 10.1016/j.asoc.2026.114805
DO - 10.1016/j.asoc.2026.114805
M3 - Article
AN - SCOPUS:105030202207
SN - 1568-4946
VL - 193
JO - Applied Soft Computing
JF - Applied Soft Computing
M1 - 114805
ER -