
Parallel implementation of classification algorithms based on mapreduce

  • Qing He*
  • Fuzhen Zhuang
  • Jincheng Li
  • Zhongzhi Shi
  • *Corresponding author for this work
  • CAS - Institute of Computing Technology
  • University of Chinese Academy of Sciences

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Peer-reviewed

Abstract

Data mining has attracted extensive research for several decades. As an important data mining task, classification plays a key role in information retrieval, web search, CRM, and other applications. Most current classification techniques are serial, which makes them impractical for large datasets: computing resources are under-utilized and execution times become unacceptably long. Based on the MapReduce programming model, we propose parallel implementations of several classification algorithms, including k-nearest neighbors, the naive Bayesian model, and decision trees. Preliminary experiments show that the proposed parallel methods can not only process large datasets but can also be scaled out to run on a cluster, which significantly improves efficiency.
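As an illustration of the general idea only, the following minimal sketch shows how the counting phase of naive Bayes training can be phrased as a map step and a reduce step. It is not the authors' implementation: the function names (`map_record`, `reduce_counts`, `run_mapreduce`), the key layout, and the toy records are all assumptions made for this example, and the "cluster" is simulated in a single Python process.

```python
from collections import defaultdict


def map_record(record):
    """Map step: emit (key, 1) pairs for one labelled training record."""
    features, label = record
    # One count for the class itself (used for class priors).
    yield ("class", label), 1
    # One count per (class, feature position, feature value) combination.
    for index, value in enumerate(features):
        yield ("feature", label, index, value), 1


def reduce_counts(key, values):
    """Reduce step: sum all partial counts that share the same key."""
    return key, sum(values)


def run_mapreduce(records):
    """Simulate map, shuffle (group by key), and reduce in one process."""
    grouped = defaultdict(list)
    for record in records:
        for key, value in map_record(record):
            grouped[key].append(value)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


# Hypothetical toy data: (feature tuple, class label).
TOY_RECORDS = [
    (("sunny", "hot"), "no"),
    (("sunny", "mild"), "yes"),
    (("rainy", "mild"), "yes"),
]

counts = run_mapreduce(TOY_RECORDS)
```

Because each record is mapped independently and the reduce step is a plain sum, the same two functions could in principle be handed to a real MapReduce framework, where the grouping in `run_mapreduce` would be replaced by the framework's shuffle phase.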

Original language: English
Title of host publication: Rough Set and Knowledge Technology - 5th International Conference, RSKT 2010, Proceedings
Pages: 655-662
Number of pages: 8
DOI
Publication status: Published - 2010
Externally published: Yes
Event: 5th International Conference on Rough Set and Knowledge Technology, RSKT 2010 - Beijing, China
Duration: 15 Oct 2010 to 17 Oct 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 6401 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 5th International Conference on Rough Set and Knowledge Technology, RSKT 2010
Country/Territory: China
City: Beijing
Period: 15/10/10 to 17/10/10
