Parallel implementation of classification algorithms based on MapReduce

  • Qing He*
  • Fuzhen Zhuang
  • Jincheng Li
  • Zhongzhi Shi

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Data mining has attracted extensive research for several decades. Classification, one of its central tasks, plays an important role in information retrieval, web search, customer relationship management (CRM), and similar applications. Most existing classification techniques are serial, which makes them impractical for large datasets: computing resources are under-utilized and execution times become unacceptably long. Using the MapReduce programming model, we propose parallel implementations of several classification algorithms, including k-nearest neighbors, the naive Bayesian model, and decision trees. Preliminary experiments show that the proposed parallel methods can not only process large datasets but can also be extended to run on a cluster, which significantly improves efficiency.
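
As an illustration of the general idea only, the following is a minimal sketch of how k-nearest-neighbor classification could be expressed as map and reduce steps. It is not the authors' implementation, which this record does not describe. The function names (`knn_map`, `knn_reduce`, `knn_classify`), the in-memory list of splits, and the toy data are all assumptions made for the example; on a real cluster each map call would run on a separate node.

```python
from collections import Counter
from functools import reduce
import heapq
import math


def knn_map(train_split, query, k):
    """Map step (hypothetical): k nearest (distance, label) pairs within one data split."""
    scored = ((math.dist(features, query), label) for features, label in train_split)
    return heapq.nsmallest(k, scored)


def knn_reduce(partial_a, partial_b, k):
    """Reduce step (hypothetical): merge two candidate lists, keeping the k nearest overall."""
    return heapq.nsmallest(k, partial_a + partial_b)


def knn_classify(splits, query, k):
    """Classify one query point by majority vote over its k nearest neighbors."""
    # Each split is processed independently; here the calls run sequentially
    # in one process purely for illustration.
    partials = [knn_map(split, query, k) for split in splits]
    nearest = reduce(lambda a, b: knn_reduce(a, b, k), partials)
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy training data, already partitioned into two splits (assumed for the example).
splits = [
    [((0.0, 0.0), "a"), ((0.1, 0.2), "a")],
    [((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((0.3, 0.1), "a")],
]
print(knn_classify(splits, (0.2, 0.2), k=3))
```

Because each split only returns its own k best candidates, the amount of data passed to the reduce step stays small regardless of how large the training set is.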

Original language: English
Title of host publication: Rough Set and Knowledge Technology - 5th International Conference, RSKT 2010, Proceedings
Pages: 655-662
Number of pages: 8
State: Published - 2010
Externally published: Yes
Event: 5th International Conference on Rough Set and Knowledge Technology, RSKT 2010 - Beijing, China
Duration: 15 Oct 2010 to 17 Oct 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 6401 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 5th International Conference on Rough Set and Knowledge Technology, RSKT 2010
Country/Territory: China
City: Beijing
Period: 15/10/10 to 17/10/10

Keywords

  • Classification
  • Data Mining
  • Large Dataset
  • MapReduce
  • Parallel Implementation
