Abstract:
[Objective] To reduce the semantic deviation and loss caused by language differences and text feature selection during text classification, while preserving more textual information. [Methods] First, we used a pre-trained SBERT model for sentence representation. Second, we computed sentence similarity between texts with a Sentence Vectors Rotator's Similarity method and applied sentence weighting within texts to form text vectors. Finally, we combined machine learning and neural network classification methods to achieve cross-lingual text classification. [Results] We conducted experiments on multiple cross-lingual text datasets in Chinese, English, Russian, French, and Spanish, as well as on the multilingual public Reuters dataset. The results demonstrated that the proposed method significantly improved accuracy over existing methods, with recall, precision, and F1 scores also showing gains. [Limitations] The study does not consider the impact of a sentence's position within the text on its weight. [Conclusions] The proposed model reduces semantic deviation and loss, thereby improving the performance of cross-lingual text classification. © 2025 Chinese Academy of Sciences. All rights reserved.
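The pipeline described in the abstract (sentence embeddings, a sentence-similarity measure, and within-text sentence weighting to form one text vector) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the random arrays stand in for SBERT sentence embeddings, cosine similarity stands in for the Sentence Vectors Rotator's Similarity method, and the weights are hypothetical.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two vectors (stand-in for the
    paper's Sentence Vectors Rotator's Similarity)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def weighted_text_vector(sentence_vecs, weights) -> np.ndarray:
    """Combine per-sentence embeddings into one text vector via a
    normalized weighted average."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize weights so they sum to 1
    return (np.asarray(sentence_vecs) * w[:, None]).sum(axis=0)

# Stand-in embeddings; in the paper these come from a pre-trained SBERT model.
rng = np.random.default_rng(0)
doc_a = rng.normal(size=(3, 384))  # 3 sentences, 384-dim vectors
doc_b = rng.normal(size=(4, 384))  # 4 sentences

vec_a = weighted_text_vector(doc_a, [0.5, 0.3, 0.2])       # hypothetical weights
vec_b = weighted_text_vector(doc_b, [0.4, 0.3, 0.2, 0.1])

print(cosine_similarity(vec_a, vec_b))
```

The resulting text vectors could then be fed to any downstream classifier; because SBERT's multilingual checkpoints map different languages into a shared embedding space, the same pipeline applies across languages.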
Source:
Data Analysis and Knowledge Discovery
ISSN: 2096-3467
Year: 2025
Issue: 2
Volume: 9
Page: 39-47