Research on XSLT-Based PDF Information Extraction Technology

Author：

宋艳娟 ^[1] | 李金铭 ^[2] | 陈振标 ^[3]

Indexed by：

CQVIP

Abstract：

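Using XML as the information representation model and XSLT as the language for extraction rules, the authors designed and implemented an information extraction system for PDF documents of scientific papers. The source PDF is first converted into an intermediate XML document; XSLT rules that exploit text features, position features, and display features are then applied to this intermediate XML to extract the information. Test results show that the system extracts information well and is highly extensible.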
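The rule-based step described in the abstract can be illustrated with a minimal sketch. It is not the authors' system: the intermediate XML schema (`<line>` elements with `page`, `y`, `size`, and `bold` attributes), the sample content, and the function name `extract_metadata` are all assumptions made for illustration, and the rules are written as plain Python over `xml.etree.ElementTree` rather than as XSLT templates, because the Python standard library has no XSLT processor. Each rule corresponds to one of the three feature types the abstract names.

```python
import xml.etree.ElementTree as ET

# Hypothetical intermediate XML: one <line> per text line, carrying position
# (page, y) and display (size, bold) attributes. The schema and the content
# are assumptions for illustration only.
INTERMEDIATE_XML = """
<document>
  <line page="1" y="80" size="18" bold="true">A Sample Paper Title</line>
  <line page="1" y="120" size="11" bold="false">Author A, Author B</line>
  <line page="1" y="160" size="9" bold="false">Abstract: A short sample abstract.</line>
  <line page="1" y="180" size="9" bold="false">Keywords: PDF; XML; XSLT</line>
  <line page="1" y="220" size="10" bold="true">1 Introduction</line>
</document>
"""


def extract_metadata(xml_text):
    """Apply simple text, position, and display rules to the intermediate XML."""
    root = ET.fromstring(xml_text)
    first_page = [ln for ln in root.iter("line") if ln.get("page") == "1"]
    fields = {}
    if not first_page:
        return fields

    # Display feature: take the line with the largest font on page 1 as the title.
    title = max(first_page, key=lambda ln: float(ln.get("size")))
    fields["title"] = (title.text or "").strip()

    # Position feature: take the nearest line below the title as the author line.
    below = [ln for ln in first_page if float(ln.get("y")) > float(title.get("y"))]
    if below:
        nearest = min(below, key=lambda ln: float(ln.get("y")))
        fields["authors"] = (nearest.text or "").strip()

    # Text feature: lines that start with a known label.
    labels = (("abstract", "Abstract:"), ("keywords", "Keywords:"))
    for ln in first_page:
        text = (ln.text or "").strip()
        for key, prefix in labels:
            if text.startswith(prefix):
                fields[key] = text[len(prefix):].strip()
    return fields


if __name__ == "__main__":
    for name, value in extract_metadata(INTERMEDIATE_XML).items():
        print(f"{name}: {value}")
```

In an XSLT formulation, each of these rules would presumably become a template whose match pattern or XPath predicate tests the same attributes, which is one way to read the extensibility claim: supporting a new layout means adding rules, not changing the converter.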

Keyword：

PDF; XML; XSLT; information extraction

Source ：

计算机与数字工程 (Computer & Digital Engineering)

ISSN： 1672-9722

Year： 2008

Issue： 5

Volume： 36

Page： 156-159
