Authors:

Lu, F. [1] | Bai, Q. [2]

Indexed by:

Scopus

Abstract:

This paper studies a special case of semi-supervised text categorization: building a text classifier from only a set P of labeled positive documents from one class (the positive class) and a large set U of unlabeled documents drawn from both the positive class and other diverse classes (the negative class). This setting is called positive and unlabeled learning (PU-Learning). Although effective PU-Learning methods exist, they do not perform well when there are very few labeled positive documents. We propose a refined PU-Learning method based on the known technique that combines the Rocchio and K-means algorithms. Because the set P may be very small (≤5%), we not only extract more reliable negative documents from U but also enlarge P by extracting the most reliable positive documents from U. Our experimental results show that the refined method performs better when the set P is very small. ©2010 IEEE.
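
The two extraction steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it covers only a Rocchio-style prototype step and omits the K-means refinement, and the function names, the `n_extra_pos` parameter, and the weights `alpha=16`, `beta=4` are all assumptions chosen for illustration. Documents are assumed to be rows of a term-weight matrix (for example TF-IDF).

```python
import numpy as np

def l2_normalize(X):
    """Scale each row to unit length, leaving all-zero rows unchanged."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.where(norms == 0, 1.0, norms)

def rocchio_prototypes(P, U, alpha=16.0, beta=4.0):
    """Build positive and negative prototypes, treating all of U as negative.

    alpha and beta are assumed weights, not values taken from the abstract.
    """
    P, U = l2_normalize(P), l2_normalize(U)
    c_pos = alpha * P.mean(axis=0) - beta * U.mean(axis=0)
    c_neg = alpha * U.mean(axis=0) - beta * P.mean(axis=0)
    return c_pos, c_neg

def cosine(X, c):
    """Cosine similarity between each row of X and the vector c."""
    n = np.linalg.norm(c)
    return l2_normalize(X) @ (c / n if n else c)

def split_unlabeled(P, U, n_extra_pos=1):
    """Return indices into U of reliable negatives and of extra positives.

    n_extra_pos is a hypothetical cap on how many documents may enlarge P.
    """
    c_pos, c_neg = rocchio_prototypes(P, U)
    sim_pos, sim_neg = cosine(U, c_pos), cosine(U, c_neg)
    # Reliable negatives: closer to the negative prototype than the positive one.
    reliable_neg = np.where(sim_neg > sim_pos)[0]
    # Extra positives: the documents with the largest positive margin.
    margin = sim_pos - sim_neg
    candidates = np.argsort(-margin)[:n_extra_pos]
    extra_pos = candidates[margin[candidates] > 0]
    return reliable_neg, extra_pos
```

Calling `split_unlabeled(P, U)` on two such matrices returns index arrays into U. In a fuller pipeline the reliable negatives and the enlarged P would then train the final classifier.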

Keyword:

Cluster; Semi-supervised learning; Text categorization

Affiliations:

  • [ 1 ] [Lu, F.] College of Mathematics and Computer Science, Fuzhou University, Fuzhou, China
  • [ 2 ] [Bai, Q.] College of Mathematics and Computer Science, Fuzhou University, Fuzhou, China

Reprint Address:

  • [Bai, Q.] College of Mathematics and Computer Science, Fuzhou University, Fuzhou, China

Source:

Proceedings - 2010 3rd International Conference on Biomedical Engineering and Informatics, BMEI 2010

Year: 2010

Volume: 7

Page: 3075-3079

Language: English

SCOPUS Cited Count: 8

ESI Highly Cited Papers on the List: 0
