TY - JOUR
T1 - Dif-Fusion
T2 - Toward High Color Fidelity in Infrared and Visible Image Fusion With Diffusion Models
AU - Yue, Jun
AU - Fang, Leyuan
AU - Xia, Shaobo
AU - Deng, Yue
AU - Ma, Jiayi
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Color plays an important role in human visual perception, reflecting the spectrum of objects. However, existing infrared and visible image fusion methods rarely explore how to handle multi-spectral/channel data directly and achieve high color fidelity. This paper addresses the above issue by proposing a novel method with diffusion models, termed Dif-Fusion, to generate the distribution of the multi-channel input data, which increases the ability of multi-source information aggregation and the fidelity of colors. Specifically, instead of converting multi-channel images into single-channel data as in existing fusion methods, we create the multi-channel data distribution with a denoising network in a latent space with forward and reverse diffusion processes. Then, we use the denoising network to extract the multi-channel diffusion features with both visible and infrared information. Finally, we feed the multi-channel diffusion features to the multi-channel fusion module to directly generate the three-channel fused image. To retain the texture and intensity information, we propose a multi-channel gradient loss and an intensity loss. Along with the current evaluation metrics for measuring texture and intensity fidelity, we introduce Delta E as a new evaluation metric to quantify color fidelity. Extensive experiments indicate that our method is more effective than other state-of-the-art image fusion methods, especially in color fidelity. The source code is available at https://github.com/GeoVectorMatrix/Dif-Fusion.
KW - Image fusion
KW - color fidelity
KW - deep generative model
KW - diffusion models
KW - latent representation
KW - multimodal information
UR - https://www.scopus.com/pages/publications/85174827415
U2 - 10.1109/TIP.2023.3322046
DO - 10.1109/TIP.2023.3322046
M3 - Article
C2 - 37843992
AN - SCOPUS:85174827415
SN - 1057-7149
VL - 32
SP - 5705
EP - 5720
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
ER -