Abstract:
Real-world applications often require classifying web documents in the presence of noisy data. Support vector machines (SVMs) work well for classification because of their high generalization ability, but they are very sensitive to noisy training data, which can degrade their classification accuracy. This paper presents a new algorithm for dealing with noisy training data that combines support vector machines with the K-nearest neighbor (KNN) method. Given a training set, the algorithm uses the K-nearest neighbor method to remove noisy training examples; the remaining examples are then used to train SVM classifiers for web categorization. Empirical results show that the new algorithm tolerates noise well and can greatly reduce the influence of noisy data on the SVM classifier. © 2005 IEEE.
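The noise-removal step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact removal criterion, so the rule used here (drop an example when its label disagrees with the majority label of its k nearest neighbors) is an assumption, as are the Euclidean distance, the function name `knn_filter`, the value k=3, and the toy data. The SVM training stage is not shown; the retained examples would be passed to whatever SVM trainer is in use.

```python
import math
from collections import Counter


def knn_filter(examples, labels, k=3):
    """Keep examples whose label matches the majority label of their
    k nearest neighbors (hypothetical criterion, not taken from the paper)."""
    kept_x, kept_y = [], []
    for i, (x, y) in enumerate(zip(examples, labels)):
        # Distance from example i to every other example (Euclidean is an assumption).
        neighbors = sorted(
            (math.dist(x, other), labels[j])
            for j, other in enumerate(examples)
            if j != i
        )[:k]
        # Majority label among the k nearest neighbors.
        majority_label, _ = Counter(lbl for _, lbl in neighbors).most_common(1)[0]
        if majority_label == y:
            kept_x.append(x)
            kept_y.append(y)
    return kept_x, kept_y


# Toy data: two clusters, with one mislabeled point placed inside the first cluster.
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.15, 0.15),
     (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]
y = ["a", "a", "a", "b", "b", "b", "b"]

# The mislabeled point (0.15, 0.15) is dropped; the rest are kept for SVM training.
clean_X, clean_y = knn_filter(X, y, k=3)
```

An odd k avoids tied votes in the two-class case; with more classes a tie-breaking rule would need to be chosen explicitly.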
Year: 2005
Volume: 2005
Page: 785-790
Language: English
SCOPUS Cited Count: 13
ESI Highly Cited Papers on the List: 0