Abstract:
The Hierarchical Dirichlet Process (HDP) topic model can automatically learn the optimal structure of a topic set from data. However, the resulting topics may not meet semantic requirements, and in some labeled topic models the parameters are difficult to set. Therefore, based on known semantic labels and the certainty degree of those labels, this paper proposes a semi-supervised labeled HDP topic model (SLHDP) and an accuracy evaluation index for random clustering. Known semantic labels are given higher weight. Exploiting the property of the Dirichlet process that a finite space can be divided infinitely, the model is constructed via the Chinese restaurant process. Experimental results on several Chinese and English datasets show that the SLHDP model produces a more reasonable topic set in text classification on large-scale datasets. © 2017, Science Press. All rights reserved.
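For readers unfamiliar with the Chinese restaurant process mentioned in the abstract, the following is a minimal sketch of the plain (unlabeled) process only. It is not the SLHDP model: the label weighting and the hierarchical structure described in the paper are not reproduced here. The function name `crp_assignments` and the parameters `n_customers`, `alpha`, and `seed` are hypothetical names chosen for illustration.

```python
import random


def crp_assignments(n_customers, alpha, seed=0):
    """Seat customers one at a time under a plain Chinese restaurant process.

    alpha (> 0) is the concentration parameter: larger values make
    opening a new table (i.e. a new cluster or topic) more likely.
    """
    rng = random.Random(seed)
    table_sizes = []   # table_sizes[k] = number of customers at table k
    assignments = []   # assignments[i] = table index chosen by customer i
    for _ in range(n_customers):
        # An existing table k is chosen with weight table_sizes[k];
        # a new table is opened with weight alpha.
        weights = table_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(table_sizes):
            table_sizes.append(1)      # open a new table
        else:
            table_sizes[k] += 1        # join an existing table
        assignments.append(k)
    return assignments, table_sizes
```

Calling `crp_assignments(100, 1.0)` returns one table index per customer together with the final table sizes. The number of tables is not fixed in advance, which is the property that lets HDP-style models learn the number of topics from the data.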
Source:
Pattern Recognition and Artificial Intelligence
ISSN: 1003-6059
Year: 2017
Issue: 12
Volume: 30
Page: 1138-1148
SCOPUS Cited Count: 1
ESI Highly Cited Papers on the List: 0