Bootstrap sampling based data cleaning and maximum entropy SVMs for large datasets

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Support Vector Machines (SVMs) are a popular machine learning method based on Statistical Learning Theory (SLT). However, traditional solutions suffer from O(n^2) time complexity. In this paper, a novel two-stage informative pattern abstraction algorithm is proposed. The first stage of the algorithm is data cleaning based on bootstrap sampling. A bundle of weak SVM classifiers is trained on small sampled datasets. Training data correctly classified by all the weak classifiers are cleaned out of the training set. In the second stage, to further improve the performance of the final classifier and reduce training time, two novel informative pattern extraction algorithms based on entropy maximization SVMs are proposed. Empirical studies show that our approach is effective in reducing the size of the training dataset and the computational cost, outperforming the state-of-the-art SVM training algorithms PEGASOS, RSVM and LIBLINEAR SVM with comparable classification accuracy.
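The first stage described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the weak learner here is a simple linear SVM fitted by stochastic subgradient descent (a stand-in chosen so the example needs only the standard library), and the function names `train_linear_svm`, `bootstrap_clean` and `make_toy_data`, along with all parameter values, are hypothetical. The sketch draws bootstrap samples, trains one weak classifier per sample, and keeps only the training points that at least one weak classifier gets wrong.

```python
import random


def train_linear_svm(X, y, epochs=30, lr=0.05, lam=0.01, seed=0):
    """Fit a linear SVM (hinge loss + L2 penalty) by stochastic subgradient descent.

    Stand-in weak learner for illustration only, not the paper's solver.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            # Shrink the weights (L2 penalty), then step on the hinge term
            # only when the margin constraint is violated.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if margin < 1.0:
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b


def predict(model, x):
    """Return the predicted label (+1 or -1) of a linear model (w, b)."""
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0.0 else -1


def bootstrap_clean(X, y, n_classifiers=5, sample_size=30, seed=0):
    """Return (kept_indices, models).

    A point is dropped only if every weak classifier labels it correctly.
    """
    rng = random.Random(seed)
    n = len(X)
    models = []
    for k in range(n_classifiers):
        # Bootstrap sample: draw indices with replacement.
        idx = [rng.randrange(n) for _ in range(sample_size)]
        models.append(
            train_linear_svm([X[i] for i in idx], [y[i] for i in idx], seed=seed + k)
        )
    kept = [
        i for i in range(n)
        if not all(predict(m, X[i]) == y[i] for m in models)
    ]
    return kept, models


def make_toy_data(n=200, seed=1):
    """Two overlapping Gaussian blobs in 2-D with labels -1 and +1 (toy data)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.choice([-1, 1])
        X.append([rng.gauss(label, 1.0), rng.gauss(label, 1.0)])
        y.append(label)
    return X, y


X, y = make_toy_data()
kept, models = bootstrap_clean(X, y)
print(f"kept {len(kept)} of {len(X)} training points")
```

The points left in `kept` are those that some weak classifier misclassifies, which tend to lie near the class boundary or carry label noise. The second stage (entropy maximization SVMs) is not sketched here because the abstract does not give enough detail to reconstruct it.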

Original language: English
Title of host publication: Proceedings - 2012 IEEE 24th International Conference on Tools with Artificial Intelligence, ICTAI 2012
Pages: 1151-1156
Number of pages: 6
DOI
Publication status: Published - 2012
Event: 2012 IEEE 24th International Conference on Tools with Artificial Intelligence, ICTAI 2012 - Athens, Greece
Duration: 7 Nov 2012 to 9 Nov 2012

Publication series

Name: Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI
Volume: 1
ISSN (Print): 1082-3409

Conference

Conference: 2012 IEEE 24th International Conference on Tools with Artificial Intelligence, ICTAI 2012
Country/Territory: Greece
City: Athens
Period: 7/11/12 to 9/11/12
