JellyFish: Online performance tuning with adaptive configuration and elastic container in Hadoop YARN

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

MapReduce is a popular computing framework for large-scale data processing. Practical experience shows that inappropriate configurations can result in poor performance of MapReduce jobs; however, it is challenging to pick a suitable configuration in a short time. In addition, the current central resource scheduler may cause low resource utilization and degrade the performance of the cluster. This paper proposes an online performance tuning system, JellyFish, to improve the performance of MapReduce jobs and increase resource utilization in Hadoop YARN. JellyFish continually collects real-time statistics to optimize configuration and resource allocation dynamically during the execution of a job. During the performance tuning process, JellyFish first tunes configuration parameters, reducing the dimensionality of the search space with a divide-and-conquer approach and using a model-based hill climbing algorithm to improve tuning efficiency. Second, JellyFish re-schedules resources within nodes by using a novel elastic container that can expand and shrink dynamically according to resource usage, together with a resource re-scheduling strategy that makes full use of cluster resources. Experimental results show that, compared to default YARN, JellyFish improves the performance of MapReduce jobs by an average of 24% for jobs run for the first time and by an average of 65% for jobs run multiple times.
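
As a rough illustration of the tuning idea described in the abstract (not the authors' implementation), the sketch below splits configuration parameters into groups and hill-climbs each group in turn against a cost function. The parameter names `sort_mb` and `reducers`, the step sizes, and `toy_cost` are hypothetical stand-ins. A model-based variant would replace `toy_cost` with a performance model fed by real-time job statistics, which this record does not describe in detail.

```python
from typing import Callable, Dict, List

Config = Dict[str, int]


def hill_climb_group(config: Config, group: List[str], steps: Dict[str, int],
                     cost: Callable[[Config], float], max_rounds: int = 50) -> Config:
    """Greedy hill climbing over one parameter group; other parameters stay fixed."""
    best = dict(config)
    best_cost = cost(best)
    for _ in range(max_rounds):
        improved = False
        for name in group:
            # Try one step down and one step up for this parameter.
            for delta in (-steps[name], steps[name]):
                candidate = dict(best)
                candidate[name] = best[name] + delta
                candidate_cost = cost(candidate)
                if candidate_cost < best_cost:
                    best, best_cost = candidate, candidate_cost
                    improved = True
        if not improved:
            break  # no neighbor is better: local optimum for this group
    return best


def tune(config: Config, groups: List[List[str]], steps: Dict[str, int],
         cost: Callable[[Config], float]) -> Config:
    """Divide and conquer: tune each small group separately instead of the full space."""
    for group in groups:
        config = hill_climb_group(config, group, steps, cost)
    return config


def toy_cost(cfg: Config) -> float:
    # Hypothetical cost surface (assumption): lowest at sort_mb=200, reducers=8.
    return (cfg["sort_mb"] - 200) ** 2 + 100 * (cfg["reducers"] - 8) ** 2


start = {"sort_mb": 100, "reducers": 1}
tuned = tune(start, groups=[["sort_mb"], ["reducers"]],
             steps={"sort_mb": 20, "reducers": 1}, cost=toy_cost)
print(tuned)
```

Tuning the groups one after another keeps each search low-dimensional, at the price of ignoring interactions between parameters that land in different groups.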

Original language: English
Title of host publication: Proceedings - 2015 IEEE 21st International Conference on Parallel and Distributed Systems, ICPADS 2015
Publisher: IEEE Computer Society
Pages: 831-836
Number of pages: 6
ISBN (Electronic): 9780769557854
DOIs
State: Published - 15 Jan 2016
Event: 21st IEEE International Conference on Parallel and Distributed Systems, ICPADS 2015 - Melbourne, Australia
Duration: 14 Dec 2015 to 17 Dec 2015

Publication series

Name: Proceedings of the International Conference on Parallel and Distributed Systems - ICPADS
Volume: 2016-January
ISSN (Print): 1521-9097

Conference

Conference: 21st IEEE International Conference on Parallel and Distributed Systems, ICPADS 2015
Country/Territory: Australia
City: Melbourne
Period: 14/12/15 to 17/12/15

Keywords

  • Distributed Computing
  • MapReduce
  • Performance Tuning
  • YARN
