Cross-Modal Remote Sensing Image-Audio Retrieval with Adaptive Learning for Aligning Correlation

Authors：

Huang, Jinghao [1] | Chen, Yaxiong [2] | Xiong, Shengwu [3] | Lu, Xiaoqiang [4] (Scholars：卢孝强)

Abstract：

An important challenge that existing work has yet to address is the relatively small differences in audio representations compared with the rich content provided by remote sensing (RS) images, making it easy to overlook certain details in the images. This imbalance in information between modalities poses a challenge in maintaining consistent representations. In response to this challenge, we propose a novel cross-modal RS image-audio (RSIA) retrieval method called adaptive learning for aligning correlation (ALAC). ALAC integrates region-level learning into image annotation through a region-enhanced learning attention (RELA) module. By collaboratively suppressing features at different region levels, ALAC is able to provide a more comprehensive visual feature representation. In addition, a novel adaptive knowledge transfer (AKT) strategy has been proposed, which guides the learning process of the frontend network using aligned feature vectors. This approach allows the model to adaptively acquire alignment information during the learning process, thereby facilitating better alignment between the two modalities. Finally, to better use mutual information between different modalities, we introduce a plug-and-play result rerank module. This module optimizes the similarity matrix using retrieval mutual information between modalities as weights, significantly improving retrieval accuracy. Experimental results on four RSIA datasets demonstrate that ALAC outperforms other methods in retrieval performance. Compared with state-of-the-art methods, improvements of 1.49%, 2.25%, 4.24%, and 1.33% were, respectively, achieved by ALAC. The codes are accessible at https://github.com/huangjh98/ALAC. © 1980-2012 IEEE.

Keywords：

Feature extraction; Image enhancement; Image retrieval; Knowledge management; Learning systems; Remote sensing

Community：

[ 1 ] [Huang, Jinghao]Shanghai Artificial Intelligence Laboratory, Shanghai; 200232, China
[ 2 ] [Huang, Jinghao]Wuhan University of Technology, School of Computer Science and Artificial Intelligence, Wuhan; 430070, China
[ 3 ] [Huang, Jinghao]Wuhan University of Technology, Chongqing Research Institute, Chongqing; 401122, China
[ 4 ] [Chen, Yaxiong]Shanghai Artificial Intelligence Laboratory, Shanghai; 200232, China
[ 5 ] [Chen, Yaxiong]Wuhan University of Technology, School of Computer Science and Artificial Intelligence, Wuhan; 430070, China
[ 6 ] [Chen, Yaxiong]Wuhan University of Technology, Chongqing Research Institute, Chongqing; 401122, China
[ 7 ] [Xiong, Shengwu]Shanghai Artificial Intelligence Laboratory, Shanghai; 200232, China
[ 8 ] [Xiong, Shengwu]Wuhan University of Technology, School of Computer Science and Artificial Intelligence, Wuhan; 430070, China
[ 9 ] [Xiong, Shengwu]Wuhan University of Technology, Chongqing Research Institute, Chongqing; 401122, China
[ 10 ] [Lu, Xiaoqiang]Fuzhou University, College of Physics and Information Engineering, Fuzhou; 350108, China


Related Keywords：

Integrating Height Features for Multi-scale Urban Building Type Classification from High-Resolution Remote Sensing Images
2021, Journal of Geo-Information Science
FEFN: Feature Enhancement Feedforward Network for Lightweight Object Detection in Remote Sensing Images
2024, Remote Sensing
Multi-modal feature fusion based on variational autoencoder for visual question answering
2019, 2nd Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2019
Prototype-Based Pseudo-Label Refinement for Semi-Supervised Hyperspectral Image Classification
2024, IEEE Geoscience and Remote Sensing Letters
Extracting Surface Defect Contours of Bridge Underwater Pile-pier Structures based on Lightweight Network and Transfer Learning
2024, China Journal of Highway and Transport

Source ：

IEEE Transactions on Geoscience and Remote Sensing

ISSN： 0196-2892

Year： 2024

Volume： 62

Impact Factor： 7.500 (JCR@2023)
