End-to-End Scene Text Spotting Under Dual Domain Awareness Based on Multi-Party Synergetic Explicit Information

Authors:

Chen, Ping-Ping [1] | Lin, Hu [2] | Chen, Hong-Hui [3] | Xie, Zhao-Peng [4]

Abstract:

In end-to-end text recognition in complex natural scenes, text is hard to distinguish from the background, so the location information produced by detection and the semantic information produced by recognition do not match, and the correlation between detection and recognition cannot be exploited effectively. To address this problem, this paper proposes a text spotting method based on multi-party synergetic information with dual-domain awareness (MSIDA). By enhancing text region features and edge textures, it exploits the synergy between text detection and recognition features to improve end-to-end text recognition performance. First, a dual-domain awareness (DDA) module that integrates the spatial and directional information of text is designed to enhance the visual features of text instances. Second, a multi-party explicit information synergy (MEIS) component is proposed to extract explicit information from the encoded features and to generate candidate text instances by matching and assigning the position, classification and character information used for detection and recognition. Finally, the cooperative features guide learnable query sequences through decoders to obtain the text detection and recognition results. Compared with the recent decoder with explicit points solo (DeepSolo) method, MSIDA improves accuracy by 0.8%, 0.8% and 0.4% on the Total-Text, ICDAR 2015 and CTW1500 datasets, respectively. The code and datasets are available at https://github.com/msida2024/MSIDA.git. © 2025 Chinese Institute of Electronics. All rights reserved.

Keywords:

Character recognition; Classification (of information); Computer vision; Correlation detectors; Decoding; Feature extraction; Semantics; Text processing; Textures

Affiliations:

[1] Chen, Ping-Ping: College of Physics and Information Engineering, Fuzhou University, Fuzhou, Fujian 350108, China
[2] Lin, Hu: College of Physics and Information Engineering, Fuzhou University, Fuzhou, Fujian 350108, China
[3] Chen, Hong-Hui: College of Physics and Information Engineering, Fuzhou University, Fuzhou, Fujian 350108, China
[4] Xie, Zhao-Peng: College of Physics and Information Engineering, Fuzhou University, Fuzhou, Fujian 350108, China

Source:

Acta Electronica Sinica

ISSN: 0372-2112

Year: 2025

Issue: 3

Volume: 53

Pages: 974-985
