Visual Contextual Semantic Reasoning for Cross-Modal Drone Image-Text Retrieval

Authors：

Huang, Jinghao ^[1] | Chen, Yaxiong ^[2] | Xiong, Shengwu ^[3] | Lu, Xiaoqiang ^[4] (Scholars：卢孝强)

Indexed by：

EI Scopus SCIE

Abstract：

The cross-modal drone image-text (DIT) retrieval task uses either text or drone images as queries to retrieve relevant drone images or corresponding text. The primary challenge stems from the diverse and intricate nature of drone images, which makes effective alignment between image and text difficult. In response, we propose an approach called visual contextual semantic reasoning (VCSR), aimed at precisely aligning information across modalities. VCSR employs textual cues to guide rich semantic reasoning within the visual context, reducing redundancy in visual information. Furthermore, the method captures drone image information relevant to the text, revealing subtle correspondences between drone image regions and textual content. To enhance visual semantic learning, a context region learning (CRL) term and a consistency semantic alignment (CSA) term are introduced for stronger guidance; they further intensify the cross-modal interaction between textual and visual data, resulting in more robust feature representations. Extensive experiments on two self-constructed DIT datasets demonstrate that VCSR outperforms alternative methods in DIT retrieval performance. The code is available at https://github.com/huangjh98/VCSR.

Keywords：

Cross-modal; drone retrieval; semantic alignment; visual contextual semantic reasoning (VCSR)

Affiliations：

[ 1 ] [Huang, Jinghao] Wuhan Univ Technol, Sch Comp Sci & Artificial Intelligence, Wuhan 430070, Peoples R China
[ 2 ] [Chen, Yaxiong] Wuhan Univ Technol, Sch Comp Sci & Artificial Intelligence, Wuhan 430070, Peoples R China
[ 3 ] [Xiong, Shengwu] Wuhan Univ Technol, Sch Comp Sci & Artificial Intelligence, Wuhan 430070, Peoples R China
[ 4 ] [Huang, Jinghao] Wuhan Univ Technol, Sanya Sci & Educ Innovat Pk, Sanya 572000, Peoples R China
[ 5 ] [Chen, Yaxiong] Wuhan Univ Technol, Sanya Sci & Educ Innovat Pk, Sanya 572000, Peoples R China
[ 6 ] [Xiong, Shengwu] Wuhan Univ Technol, Sanya Sci & Educ Innovat Pk, Sanya 572000, Peoples R China
[ 7 ] [Huang, Jinghao] Wuhan Univ Technol, Chongqing Res Inst, Chongqing 401122, Peoples R China
[ 8 ] [Chen, Yaxiong] Shanghai Artificial Intelligence Lab, Shanghai 200232, Peoples R China
[ 9 ] [Xiong, Shengwu] Shanghai Artificial Intelligence Lab, Shanghai 200232, Peoples R China
[ 10 ] [Chen, Yaxiong] Wuhan Huaxia Inst Technol, Sch Informat Engn, Wuhan 430223, Peoples R China
[ 11 ] [Xiong, Shengwu] Wuhan Huaxia Inst Technol, Sch Informat Engn, Wuhan 430223, Peoples R China
[ 12 ] [Chen, Yaxiong] Qiongtai Normal Univ, Sch Informat Sci & Technol, Haikou 571127, Peoples R China
[ 13 ] [Xiong, Shengwu] Qiongtai Normal Univ, Sch Informat Sci & Technol, Haikou 571127, Peoples R China
[ 14 ] [Lu, Xiaoqiang] Fuzhou Univ, Coll Phys & Informat Engn, Fuzhou, Peoples R China

Reprint Address：

[Chen, Yaxiong] Wuhan Univ Technol, Sanya Sci & Educ Innovat Pk, Sanya 572000, Peoples R China

Email：

593544199@qq.com

Version：

Visual Contextual Semantic Reasoning for Cross-Modal Drone Image-Text Retrieval
2024，IEEE Transactions on Geoscience and Remote Sensing

Related Articles：

A Spatial and Semantic Alignment Fusion Network for SeaLand Port Segmentation
2025，IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING
Context-Aware Local-Global Semantic Alignment for Remote Sensing Image-Text Retrieval
2025，IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING
Relevance-Guided Adaptive Learning for Remote Sensing Image–Text Retrieval
2025，IEEE Transactions on Geoscience and Remote Sensing
Prototype rectification for zero-shot learning
2024，PATTERN RECOGNITION

Source ：

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING

ISSN： 0196-2892

Year： 2024

Volume： 62

7.500

JCR@2023

CAS Journal Grade：1

Cited Count：

WoS CC Cited Count：

SCOPUS Cited Count： 3

ESI Highly Cited Papers on the List： 0

WanFang Cited Count：

Chinese Cited Count：

30 Days PV： 4

Affiliated Colleges：

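College of Physics and Information Engineering / School of Microelectronics: data not explicitly attributed within this college/department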
