A web text extraction method based on regular expressions and text density

author：

Li, Fayun ^[1] (Scholars：李法运)

Indexed by：

EI Scopus

Abstract：

Drawing on the advantages of some current web text extraction algorithms, this paper puts forward a new method that combines regular expressions with the density of page text. The method first uses regular expressions to clear the HTML tags, based on the characteristics of the web page source code, and then extracts the main text of the page using the distribution density of the text. The algorithm is simple and efficient, and tests show that the method achieves higher extraction accuracy. © 2011 IEEE.
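The two stages named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the record gives no patterns or thresholds, so the regular expressions, the per-line length measure of "density", and the names `strip_tags`, `extract_main_text`, `min_len`, and `max_gap` are all assumptions made for illustration.

```python
import re
from html import unescape

# Assumed patterns: the paper's actual expressions are not given in this record.
_COMMENTS = re.compile(r"<!--.*?-->", re.S)
_DROP_BLOCKS = re.compile(r"<(script|style)\b[^>]*>.*?</\1\s*>", re.I | re.S)
_TAGS = re.compile(r"<[^>]+>")


def strip_tags(html):
    """Stage 1: remove comments, script/style blocks and tags with regexes.

    Returns the remaining text as a list of stripped lines.
    """
    html = _COMMENTS.sub("", html)
    html = _DROP_BLOCKS.sub("", html)
    text = _TAGS.sub("", html)
    return [unescape(line).strip() for line in text.splitlines()]


def extract_main_text(html, min_len=30, max_gap=2):
    """Stage 2: keep the densest run of text lines.

    A line counts as "dense" when it has at least min_len characters
    (hypothetical threshold). Dense lines separated by at most max_gap
    shorter lines are grouped into one block, and the block with the most
    characters in its dense lines is returned.
    """
    lines = strip_tags(html)
    best = (0, 0, -1)  # (total chars, start index, end index)
    start = last = None
    total = 0
    for i, line in enumerate(lines):
        if len(line) < min_len:
            continue
        if last is None or i - last > max_gap + 1:
            start, total = i, 0  # gap too large: begin a new block
        last = i
        total += len(line)
        if total > best[0]:
            best = (total, start, last)
    _, s, e = best
    return "\n".join(l for l in lines[s:e + 1] if l)
```

Calling `extract_main_text(page_source)` on a page whose navigation and footer lines are short and whose body paragraphs are long returns only the paragraphs. Both thresholds would need tuning per site, and pages that place their whole body on one source line would need a different density measure.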

Keyword：

Extraction; Information management; Pattern matching; Websites

Affiliation：

[1] [Li, Fayun] Public Management School, Fuzhou University, Fuzhou, China

Reprint Author's Address：

李法运

Email：

fayunli2002@yahoo.com.cn

Version：

A web text extraction method based on regular expressions and text density
2011，Proceedings - 2011 4th International Conference on Information Management, Innovation Management and Industrial Engineering, ICIII 2011

Source ：

Year： 2011

Volume： 1

Page： 287-290

Language： English

ESI Highly Cited Papers on the List： 0

Affiliated Colleges：

School of Economics and Management (data not explicitly attributed to a unit within this school/department)
