Abstract:
In this paper, we propose HSENet, a hierarchical semantic-enriched network capable of generating high-quality fused images with robust global semantic consistency and excellent local detail representation. The core innovation of HSENet lies in its hierarchical enrichment of semantic information through semantic gathering, distribution, and injection. Specifically, the network begins by balancing global information exchange via multi-scale feature aggregation and redistribution while dynamically bridging the fusion and segmentation tasks. Following this, a progressive semantic dense injection strategy is introduced, employing dense connections to first inject global semantics into highly consistent infrared features and then propagate the semantic-infrared hybrid features to the visible features. This approach effectively enhances semantic representation while minimizing high-frequency information loss. Furthermore, HSENet includes two types of feature fusion modules: one leverages cross-modal attention for more comprehensive feature fusion, and the other utilizes semantic features as a third input to further enhance the semantic representation for image fusion. These modules achieve robust and flexible feature fusion in complex scenarios by dynamically balancing global semantic consistency and fine-grained local detail representation. Our approach excels in visual perception tasks while fully preserving the texture features of the source modalities. Comparative experiments on image fusion and semantic segmentation demonstrate the superiority of HSENet in visual quality and semantic preservation. The code is available at https://github.com/Lxyklmyt/HSENet.
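The cross-modal attention mentioned in the abstract can be illustrated with a minimal numpy sketch: each modality's features query the other modality's features, and the two attended outputs are combined. This is only an illustrative sketch of the general technique, not the authors' HSENet implementation; the function names, the (N, d) token layout, and the additive fusion are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(ir, vis):
    """Illustrative cross-modal attention fusion (not the paper's exact module).

    ir, vis: (N, d) feature matrices (N spatial tokens, d channels).
    Infrared tokens attend over visible tokens and vice versa;
    the two attended feature maps are summed as a simple fusion.
    """
    d = ir.shape[-1]
    # infrared queries, visible keys/values
    ir_attends_vis = softmax(ir @ vis.T / np.sqrt(d)) @ vis
    # visible queries, infrared keys/values
    vis_attends_ir = softmax(vis @ ir.T / np.sqrt(d)) @ ir
    # additive fusion of the two cross-attended streams (an assumption here)
    return ir_attends_vis + vis_attends_ir

rng = np.random.default_rng(0)
fused = cross_modal_attention(rng.standard_normal((16, 32)),
                              rng.standard_normal((16, 32)))
```

A third semantic-feature input, as in the paper's second fusion module, could be handled analogously by letting both modalities also attend over semantic tokens.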
Source: PATTERN RECOGNITION
ISSN: 0031-3203
Year: 2026
Volume: 170
Impact Factor: 7.500 (JCR@2023)