[Journal Article]

Self-Supervision Interactive Alignment for Remote Sensing Image-Audio Retrieval

Authors:

Huang, Jinghao [1] | Chen, Yaxiong [2] | Xiong, Shengwu [3]

Indexed by:

EI Scopus SCIE

Abstract:

Cross-modal remote sensing image-audio (RSIA) retrieval aims to use audio or remote sensing images (RSIs) as queries to retrieve relevant RSIs or the corresponding audio. Although many approaches leverage labeled samples to achieve good performance, labeled samples are costly to obtain, because labeling cross-modal remote sensing (RS) data usually requires substantial human labor. Unsupervised cross-modal learning is therefore very important in real-world applications. In this article, we propose a novel unsupervised cross-modal RSIA retrieval approach, named self-supervision interactive alignment (SSIA), which takes advantage of large amounts of unlabeled samples to learn salient information, cross-modal alignment, and the similarity between RSIs and audio. Since self-supervised learning lacks label supervision, we use the similarity between the input RSI information and audio information as the supervision signal. In addition, to perform cross-modal alignment, a novel interactive alignment (IA) module is designed to explore fine-grained correspondences between RSIs and audio. Moreover, we design an audio-guided image de-redundant module that reduces redundancy in the visual information and thereby captures the salient information of RSIs. Extensive experiments on four widely used RSIA datasets show that SSIA achieves better RSIA retrieval performance than the compared approaches.

Keyword:

Cross-modal remote sensing (RS) retrieval; interactive alignment (IA); Remote sensing; self-supervised learning; Semantics; similarity preservation; Task analysis; Technological innovation; Transformers; Unsupervised learning; Visualization

Affiliations:

  • [ 1 ] [Huang, Jinghao]Wuhan Univ Technol, Sanya Sci & Educ Innovat Pk, Sanya 572000, Peoples R China
  • [ 2 ] [Chen, Yaxiong]Wuhan Univ Technol, Sanya Sci & Educ Innovat Pk, Sanya 572000, Peoples R China
  • [ 3 ] [Xiong, Shengwu]Wuhan Univ Technol, Sanya Sci & Educ Innovat Pk, Sanya 572000, Peoples R China
  • [ 4 ] [Huang, Jinghao]Wuhan Univ Technol, Sch Comp Sci & Artificial Intelligence, Wuhan 430070, Peoples R China
  • [ 5 ] [Chen, Yaxiong]Wuhan Univ Technol, Sch Comp Sci & Artificial Intelligence, Wuhan 430070, Peoples R China
  • [ 6 ] [Xiong, Shengwu]Wuhan Univ Technol, Sch Comp Sci & Artificial Intelligence, Wuhan 430070, Peoples R China
  • [ 7 ] [Chen, Yaxiong]Shanghai Artificial Intelligence Lab, Shanghai 200232, Peoples R China
  • [ 8 ] [Xiong, Shengwu]Shanghai Artificial Intelligence Lab, Shanghai 200232, Peoples R China
  • [ 9 ] [Chen, Yaxiong]Wuhan Univ Technol, Chongqing Res Inst, Chongqing 401122, Peoples R China
  • [ 10 ] [Xiong, Shengwu]Wuhan Univ Technol, Chongqing Res Inst, Chongqing 401122, Peoples R China
  • [ 11 ] [Lu, Xiaoqiang]Fuzhou Univ, Coll Phys & Informat Engn, Fuzhou 350108, Peoples R China

Reprint Address:

  • [Xiong, Shengwu]Wuhan Univ Technol, Sch Comp Sci & Artificial Intelligence, Wuhan 430070, Peoples R China

Source:

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING

ISSN: 0196-2892

Year: 2023

Volume: 61

7.5 (JCR@2023)

7.500 (JCR@2023)

ESI Discipline: GEOSCIENCES

ESI HC Threshold: 26

JCR Journal Grade: 1

CAS Journal Grade: 1

Cited Count:

WoS CC Cited Count: 2

SCOPUS Cited Count: 4
