Abstract:
The fusion of multi-modal images into a single image that preserves both the unique features of each modality and the features shared across modalities is a challenging task, particularly for infrared (IR)-visible image fusion. In addition, the presence of polarization and IR radiation information in images obtained from IR polarization sensors further complicates the multi-modal image-fusion process. This study proposes a fusion network designed to overcome the challenges associated with integrating low-resolution IR, IR polarization, and high-resolution visible (VIS) images. By introducing cross-attention modules and a multi-stage fusion approach, the network effectively extracts and fuses features from different modalities, fully expressing the diversity of the images. The network learns an end-to-end mapping from source to fused images using a loss function, eliminating the need for ground-truth fused images. Experimental results on public datasets and remote-sensing field-test data demonstrate that the proposed method achieves commendable results in qualitative and quantitative evaluations, with gradient-based fusion performance (QAB/F), mutual information (MI), and QCB values exceeding the second-best values by 0.20, 0.94, and 0.04, respectively. This study provides a comprehensive representation of target-scene information, resulting in enhanced image quality and improved object-identification capabilities. In addition, outdoor and VIS image datasets are produced, providing a data foundation and reference for future research in related fields. © 2024 Elsevier B.V.
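The abstract describes cross-attention modules that let features from one modality attend to another before fusion. The following is a minimal sketch of such a cross-attention fusion step, not the authors' code: it assumes PyTorch, and the class name `CrossAttentionFusion`, the channel size, and the head count are illustrative assumptions.

```python
# Hypothetical illustration of a cross-attention fusion step between two
# modality feature maps; not the network proposed in the paper.
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Fuses two modality feature maps by letting one attend to the other."""
    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim=channels,
                                          num_heads=num_heads,
                                          batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, feat_a: torch.Tensor, feat_b: torch.Tensor) -> torch.Tensor:
        # feat_a, feat_b: (B, C, H, W) feature maps from two modalities.
        b, c, h, w = feat_a.shape
        # Flatten spatial dimensions so each pixel becomes a token: (B, H*W, C).
        q = feat_a.flatten(2).transpose(1, 2)
        kv = feat_b.flatten(2).transpose(1, 2)
        # Queries from modality A attend to keys/values from modality B.
        fused, _ = self.attn(q, kv, kv)
        fused = self.norm(fused + q)  # residual connection + normalization
        return fused.transpose(1, 2).reshape(b, c, h, w)

if __name__ == "__main__":
    ir_feat = torch.randn(1, 64, 32, 32)   # e.g. upsampled low-resolution IR features
    vis_feat = torch.randn(1, 64, 32, 32)  # e.g. visible-image features
    fused = CrossAttentionFusion(64)(ir_feat, vis_feat)
    print(fused.shape)  # torch.Size([1, 64, 32, 32])
```

In a multi-stage design, a module like this would typically be applied at several feature scales, with the fused features then decoded into the output image.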
Source:
Infrared Physics and Technology
ISSN: 1350-4495
Year: 2024
Volume: 141
JCR@2023: 3.100
Cited Count:
SCOPUS Cited Count: 1
ESI Highly Cited Papers on the List: 0