Abstract:
In end-to-end text recognition for complex natural scenes, text is difficult to distinguish from the background, so the position information produced by detection and the semantic information produced by recognition do not match, and the correlation between detection and recognition cannot be exploited effectively. To address this problem, this paper proposes a text spotting method based on multi-party synergetic information with dual-domain awareness (MSIDA). By enhancing text region features and edge textures, MSIDA exploits the synergy between text detection and recognition features to improve end-to-end text recognition performance. First, a dual-domain awareness (DDA) module that integrates text spatial and directional information is designed to enhance the visual features of text instances. Second, a multi-party explicit information synergy (MEIS) module is proposed to extract explicit information from the encoded features and to generate candidate text instances by matching and assigning the position, classification, and character information used for detection and recognition. Finally, the synergetic features guide learnable query sequences through decoders to produce the text detection and recognition results. Compared with the recent decoder with explicit points solo (DeepSolo) method, MSIDA improves accuracy by 0.8%, 0.8%, and 0.4% on the Total-Text, ICDAR 2015, and CTW1500 datasets, respectively. The code and datasets are available at https://github.com/msida2024/MSIDA.git. © 2025 Chinese Institute of Electronics. All rights reserved.
Source:
Acta Electronica Sinica
ISSN: 0372-2112
Year: 2025
Issue: 3
Volume: 53
Page: 974-985