Predator-An experience guided configuration optimizer for Hadoop MapReduce

  • Beihang University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

MapReduce is a distributed programming framework that provides an effective solution to the data processing challenge. As an open-source implementation of MapReduce, Hadoop has been widely used in practice. The performance of Hadoop MapReduce depends heavily on its configuration settings, so tuning these configuration parameters can be an effective way to improve performance. However, finding the optimal configuration settings is difficult, both because MapReduce jobs are time-consuming to run and because the configuration optimization problem is high-dimensional and nonlinear. In this paper, we introduce Predator, an experience-guided configuration optimizer that does not treat the optimization problem as a pure black-box problem but instead uses experience learnt from Hadoop MapReduce configuration practice to assist the optimization process. The optimizer uses the job execution time estimated by a practical MapReduce cost model as its objective function, and it classifies Hadoop MapReduce parameters into groups according to their tunable levels in order to shrink the search space. Furthermore, the optimization algorithm uses the idea of subspace division to avoid becoming trapped in local optima, and it also reduces search time by cutting the cost of visiting unpromising points in the search space. Experiments on Hadoop clusters demonstrate the effectiveness and efficiency of the optimizer.
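
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of subspace division, not Predator's actual method. It assumes a made-up cost function (`estimated_time`) standing in for the cost model, two hypothetical integer parameters loosely modelled on a sort buffer size and a reducer count, and invented bounds and sampling budgets. Each dimension is split into sub-ranges, every resulting subspace is probed with a few samples, and the remaining budget is spent only in the most promising subspaces.

```python
import itertools
import random


def estimated_time(config):
    """Hypothetical stand-in for a cost model (formula invented for illustration)."""
    sort_mb, reducers = config
    return (sort_mb - 300) ** 2 / 1000.0 + (reducers - 24) ** 2 / 10.0 + 50.0


def split(lo, hi, parts):
    """Split the integer range [lo, hi] into `parts` contiguous sub-ranges."""
    step = (hi - lo + 1) / parts
    edges = [lo + round(i * step) for i in range(parts + 1)]
    return [(edges[i], edges[i + 1] - 1) for i in range(parts)]


def sample(rng, subspace):
    """Draw one random configuration from a subspace (one range per dimension)."""
    return tuple(rng.randint(lo, hi) for lo, hi in subspace)


def subspace_search(objective, bounds, parts=4, samples=5, keep=2, seed=0):
    rng = random.Random(seed)

    # 1. Divide every dimension and form the grid of subspaces.
    subspaces = itertools.product(*(split(lo, hi, parts) for lo, hi in bounds))

    # 2. Probe each subspace with a few samples and record its best point.
    scored = []
    for sub in subspaces:
        best = min((sample(rng, sub) for _ in range(samples)), key=objective)
        scored.append((objective(best), best, sub))

    # 3. Rank subspaces; the unpromising ones are not visited again.
    scored.sort(key=lambda item: item[0])
    best_cost, best_point, _ = scored[0]

    # 4. Spend the remaining budget inside the `keep` best subspaces.
    for _, _, sub in scored[:keep]:
        for _ in range(samples * 10):
            point = sample(rng, sub)
            cost = objective(point)
            if cost < best_cost:
                best_cost, best_point = cost, point

    return best_point, best_cost


if __name__ == "__main__":
    # Hypothetical bounds: sort buffer in [50, 800], reducers in [1, 64].
    point, cost = subspace_search(estimated_time, [(50, 800), (1, 64)])
    print("best configuration:", point, "estimated time:", round(cost, 2))
```

Probing every subspace before refining is what guards against committing early to a single local optimum, while discarding low-ranked subspaces is one simple way to avoid paying for unpromising points. A real optimizer would replace the toy objective with estimates from an actual cost model.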

Original language: English
Title of host publication: CloudCom 2012 - Proceedings
Subtitle of host publication: 2012 4th IEEE International Conference on Cloud Computing Technology and Science
Publisher: IEEE Computer Society
Pages: 419-426
Number of pages: 8
ISBN (Print): 9781467345095
State: Published - 2012
Event: 4th IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2012 - Taipei, Taiwan, Province of China
Duration: 3 Dec 2012 to 6 Dec 2012

Publication series

Name: CloudCom 2012 - Proceedings: 2012 4th IEEE International Conference on Cloud Computing Technology and Science

Conference

Conference: 4th IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2012
Country/Territory: Taiwan, Province of China
City: Taipei
Period: 3/12/12 to 6/12/12

Keywords

  • Configuration
  • Hadoop
  • MapReduce
  • Optimization
