Self-Supervision Interactive Alignment for Remote Sensing Image-Audio Retrieval

Author：

Huang, Jinghao ^[1] | Chen, Yaxiong ^[2] | Xiong, Shengwu ^[3] | Lu, Xiaoqiang ^[4]

Indexed by：

EI Scopus SCIE

Abstract：

Cross-modal remote sensing image-audio (RSIA) retrieval aims to use audio or remote sensing images (RSIs) as queries to retrieve relevant RSIs or corresponding audios. Although many approaches leverage labeled samples to achieve good performance, the performance cost of labeled samples is high, because cross-modal remote sensing (RS) labeled samples usually require huge labor resources. Therefore, unsupervised cross-modal learning is very important in real-world applications. In this article, we propose a novel unsupervised cross-modal RSIA retrieval approach, named self-supervision interactive alignment (SSIA), which can take advantage of large amounts of unlabeled samples to learn the salient information, cross-modal alignment, and the similarity between RSIs and audios. Since self-supervised learning lacks the supervision of label information, we leverage the similarity between the input RSI information and audio information as the supervision information. Besides, to perform cross-modal alignment, a novel interactive alignment (IA) module is designed to explore fine correspondence relation for RSIs and audios. Moreover, we design an audio-guided image de-redundant module to reduce the redundant information of visual information, which can capture salient information of RSIs. Extensive experiments on four widely used RSIA datasets testify that the SSIA performance gains better RSIA retrieval performance than other compared approaches.

Keyword：

Cross-modal remote sensing (RS) retrieval | interactive alignment (IA) | Remote sensing | self-supervised learning | Semantics | similarity preservation | Task analysis | Technological innovation | Transformers | Unsupervised learning | Visualization

Affiliation：

[ 1 ] [Huang, Jinghao]Wuhan Univ Technol, Sanya Sci & Educ Innovat Pk, Sanya 572000, Peoples R China
[ 2 ] [Chen, Yaxiong]Wuhan Univ Technol, Sanya Sci & Educ Innovat Pk, Sanya 572000, Peoples R China
[ 3 ] [Xiong, Shengwu]Wuhan Univ Technol, Sanya Sci & Educ Innovat Pk, Sanya 572000, Peoples R China
[ 4 ] [Huang, Jinghao]Wuhan Univ Technol, Sch Comp Sci & Artificial Intelligence, Wuhan 430070, Peoples R China
[ 5 ] [Chen, Yaxiong]Wuhan Univ Technol, Sch Comp Sci & Artificial Intelligence, Wuhan 430070, Peoples R China
[ 6 ] [Xiong, Shengwu]Wuhan Univ Technol, Sch Comp Sci & Artificial Intelligence, Wuhan 430070, Peoples R China
[ 7 ] [Chen, Yaxiong]Shanghai Artificial Intelligence Lab, Shanghai 200232, Peoples R China
[ 8 ] [Xiong, Shengwu]Shanghai Artificial Intelligence Lab, Shanghai 200232, Peoples R China
[ 9 ] [Chen, Yaxiong]Wuhan Univ Technol, Chongqing Res Inst, Chongqing 401122, Peoples R China
[ 10 ] [Xiong, Shengwu]Wuhan Univ Technol, Chongqing Res Inst, Chongqing 401122, Peoples R China
[ 11 ] [Lu, Xiaoqiang]Fuzhou Univ, Coll Phys & Informat Engn, Fuzhou 350108, Peoples R China

Reprint Address：

[Xiong, Shengwu]Wuhan Univ Technol, Sch Comp Sci & Artificial Intelligence, Wuhan 430070, Peoples R China

Email：

xiongsw@whut.edu.cn

Version：

Self-Supervision Interactive Alignment for Remote Sensing Image-Audio Retrieval
2023，IEEE Transactions on Geoscience and Remote Sensing

Related Articles：

Fine Aligned Discriminative Hashing for Remote Sensing Image-Audio Retrieval
2023，IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING
Scale-Aware Adaptive Refinement and Cross-Interaction for Remote Sensing Audio-Visual Cross-Modal Retrieval
2024，IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING
Unsupervised Domain Adaptation on Point Clouds via High-Order Geometric Structure Modeling
2024，IEEE Transactions on Artificial Intelligence
Context-Aware Local-Global Semantic Alignment for Remote Sensing Image-Text Retrieval
2025，IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING

Source ：

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING

ISSN： 0196-2892

Year： 2023

Volume： 61

7.5 (JCR@2023)

7.500 (JCR@2023)

ESI Discipline： GEOSCIENCES;

ESI HC Threshold：26

JCR Journal Grade：1

CAS Journal Grade：1

Cited Count：

WoS CC Cited Count： 2

SCOPUS Cited Count： 4

ESI Highly Cited Papers on the List： 0

WanFang Cited Count：

Chinese Cited Count：

30 Days PV： 0

Affiliated Colleges：

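College of Physics and Information Engineering, School of Microelectronics: data not explicitly attributed within this college/department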
