TY - GEN
T1 - JellyFish
T2 - 21st IEEE International Conference on Parallel and Distributed Systems, ICPADS 2015
AU - Ding, Xiaoan
AU - Liu, Yi
AU - Qian, Depei
N1 - Publisher Copyright:
© 2015 IEEE.
PY - 2016/1/15
Y1 - 2016/1/15
N2 - MapReduce is a popular computing framework for large-scale data processing. Practical experience shows that inappropriate configurations can result in poor performance of MapReduce jobs, however, it is challenging to pick out a suitable configuration in a short time. Also, current central resource scheduler may cause low resource utilization, and degrade the performance of the cluster. This paper proposes an online performance tuning system, JellyFish, to improve performance of MapReduce jobs and increase resource utilization in Hadoop YARN. JellyFish continually collects real-time statistics to optimize configuration and resource allocation dynamically during execution of a job. During performance tuning process, JellyFish firstly tunes configuration parameters by reducing the dimensionality of search space with a divide-and-conquer approach and using a model-based hill climbing algorithm to improve tuning efficiency; secondly, JellyFish re-schedules resources in nodes by using a novel elastic container that can expand and shrink dynamically according to resource usage, and a resource re-scheduling strategy to make full use of cluster resources. Experimental results show that JellyFish can improve performance of MapReduce jobs by an average of 24% for jobs run for the first time, and by an average of 65% for jobs run multiple times compared to default YARN.
AB - MapReduce is a popular computing framework for large-scale data processing. Practical experience shows that inappropriate configurations can result in poor performance of MapReduce jobs, however, it is challenging to pick out a suitable configuration in a short time. Also, current central resource scheduler may cause low resource utilization, and degrade the performance of the cluster. This paper proposes an online performance tuning system, JellyFish, to improve performance of MapReduce jobs and increase resource utilization in Hadoop YARN. JellyFish continually collects real-time statistics to optimize configuration and resource allocation dynamically during execution of a job. During performance tuning process, JellyFish firstly tunes configuration parameters by reducing the dimensionality of search space with a divide-and-conquer approach and using a model-based hill climbing algorithm to improve tuning efficiency; secondly, JellyFish re-schedules resources in nodes by using a novel elastic container that can expand and shrink dynamically according to resource usage, and a resource re-scheduling strategy to make full use of cluster resources. Experimental results show that JellyFish can improve performance of MapReduce jobs by an average of 24% for jobs run for the first time, and by an average of 65% for jobs run multiple times compared to default YARN.
KW - Distributed Computing
KW - MapReduce
KW - Performance Tuning
KW - YARN
UR - https://www.scopus.com/pages/publications/84964691891
U2 - 10.1109/ICPADS.2015.112
DO - 10.1109/ICPADS.2015.112
M3 - 会议稿件
AN - SCOPUS:84964691891
T3 - Proceedings of the International Conference on Parallel and Distributed Systems - ICPADS
SP - 831
EP - 836
BT - Proceedings - 2015 IEEE 21st International Conference on Parallel and Distributed Systems, ICPADS 2015
PB - IEEE Computer Society
Y2 - 14 December 2015 through 17 December 2015
ER -