Author:

Ke, Xiao [1] (Scholars: 柯逍) | Chen, Baitao [2] | Cai, Yuhang [3] | Liu, Hao [4] | Guo, Wenzhong [5] (Scholars: 郭文忠) | Chen, Weibin [6]

Indexed by:

EI Scopus SCIE

Abstract:

Data from different modalities differ greatly in distribution and feature representation, so retrieving data across modalities flexibly and accurately remains a challenging problem. Mainstream common subspace methods focus only on the heterogeneity gap and use a single unified method to jointly learn the common representation of all modalities, which makes a unified multi-modal fit difficult. In this work, we introduce the concept of multi-modal information density discrepancy and propose a modality-specific adaptive scaling method that incorporates prior knowledge and adaptively learns the most suitable network for each modality. Second, to address efficient semantic fusion and interference features, we propose a multi-level modal feature attention mechanism, which fuses text semantics efficiently through attention and explicitly captures and shields interference features at multiple scales. In addition, to address the bottleneck in cross-modal retrieval caused by the insufficient quality of the multimodal common subspace and the defects of the Transformer structure, we propose a cross-level interaction injection mechanism that fuses multi-level patch interactions without affecting the pre-trained model, yielding higher-quality latent representation spaces and multimodal common subspaces. Comprehensive experiments on four widely used cross-modal retrieval datasets show that the proposed MASAN achieves state-of-the-art results and significantly outperforms existing methods.
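For readers unfamiliar with common subspace retrieval, the general idea can be illustrated with a minimal sketch. It is not the MASAN method from the abstract: it assumes that text and image embeddings have already been projected into a shared space by some model, and it only shows the final ranking step by cosine similarity. The function names (`l2_normalize`, `cosine_similarity`, `retrieve`) and the toy vectors are hypothetical and chosen for illustration.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length; leave an all-zero vector unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))

def retrieve(query_embedding, gallery_embeddings, top_k=3):
    """Return indices of the top_k gallery items most similar to the query."""
    scores = [cosine_similarity(query_embedding, g) for g in gallery_embeddings]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:top_k]

# Toy data (made up): one text query and three image embeddings that are
# assumed to already live in the same common subspace.
text_query = [1.0, 0.0, 1.0]
image_gallery = [
    [0.9, 0.1, 1.1],   # points in nearly the same direction as the query
    [-1.0, 0.5, 0.0],  # points away from the query
    [0.0, 1.0, 0.0],   # orthogonal to the query
]
ranking = retrieve(text_query, image_gallery, top_k=3)
```

In this toy setup the gallery item nearest in direction to the query is ranked first. The quality of such a ranking depends entirely on how well the shared space was learned, which is the part the abstract's contributions target.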

Keyword:

Attention mechanism; Common representation learning; Cross-modal retrieval

Affiliations:

  • [ 1 ] [Ke, Xiao]Fuzhou Univ, Coll Comp & Data Sci, Fujian Key Lab Network Comp & Intelligent Informat, Fuzhou 350116, Peoples R China
  • [ 2 ] [Chen, Baitao]Fuzhou Univ, Coll Comp & Data Sci, Fujian Key Lab Network Comp & Intelligent Informat, Fuzhou 350116, Peoples R China
  • [ 3 ] [Cai, Yuhang]Fuzhou Univ, Coll Comp & Data Sci, Fujian Key Lab Network Comp & Intelligent Informat, Fuzhou 350116, Peoples R China
  • [ 4 ] [Liu, Hao]Fuzhou Univ, Coll Comp & Data Sci, Fujian Key Lab Network Comp & Intelligent Informat, Fuzhou 350116, Peoples R China
  • [ 5 ] [Guo, Wenzhong]Fuzhou Univ, Coll Comp & Data Sci, Fujian Key Lab Network Comp & Intelligent Informat, Fuzhou 350116, Peoples R China
  • [ 6 ] [Chen, Weibin]Fuzhou Univ, Coll Comp & Data Sci, Fujian Key Lab Network Comp & Intelligent Informat, Fuzhou 350116, Peoples R China
  • [ 7 ] [Ke, Xiao]Fuzhou Univ, Key Lab Spatial Data Min & Informat Sharing, Minist Educ, Fuzhou 350116, Peoples R China
  • [ 8 ] [Chen, Baitao]Fuzhou Univ, Key Lab Spatial Data Min & Informat Sharing, Minist Educ, Fuzhou 350116, Peoples R China
  • [ 9 ] [Cai, Yuhang]Fuzhou Univ, Key Lab Spatial Data Min & Informat Sharing, Minist Educ, Fuzhou 350116, Peoples R China
  • [ 10 ] [Liu, Hao]Fuzhou Univ, Key Lab Spatial Data Min & Informat Sharing, Minist Educ, Fuzhou 350116, Peoples R China
  • [ 11 ] [Guo, Wenzhong]Fuzhou Univ, Key Lab Spatial Data Min & Informat Sharing, Minist Educ, Fuzhou 350116, Peoples R China
  • [ 12 ] [Chen, Weibin]Fuzhou Univ, Key Lab Spatial Data Min & Informat Sharing, Minist Educ, Fuzhou 350116, Peoples R China

Reprint Address:

  • [Guo, Wenzhong]Fuzhou Univ, Coll Comp & Data Sci, Fujian Key Lab Network Comp & Intelligent Informat, Fuzhou 350116, Peoples R China; [Guo, Wenzhong]Fuzhou Univ, Key Lab Spatial Data Min & Informat Sharing, Minist Educ, Fuzhou 350116, Peoples R China

Source:

NEUROCOMPUTING

ISSN: 0925-2312

Year: 2024

Volume: 612

5.500 (JCR@2023)

ESI Highly Cited Papers on the List: 0
