
SPM: Modeling Spark Task Execution Time from the Sub-stage Perspective

  • Beihang University

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Tasks are the basic unit of Spark application scheduling, and their execution is affected by various configurations of the Spark cluster, which makes predicting task execution time challenging. In this paper, we analyze the features of the task execution procedure in different stages and propose a method for predicting the execution time of each sub-stage. Moreover, the associated time overheads of GC and shuffle spill are analyzed in detail. As a result, we propose SPM, a task-level execution time prediction model. SPM can be used to predict the task execution time of each stage according to the input data size and the parallelism configuration. We further apply SPM to the Spark network emulation tool SNemu, where it can effectively determine the start time of each shuffle procedure for emulation. Experimental results show that the prediction method achieves high accuracy on a variety of Spark benchmarks in HiBench.

Original language: English
Title of host publication: Algorithms and Architectures for Parallel Processing - 19th International Conference, ICA3PP 2019, Proceedings
Editors: Sheng Wen, Albert Zomaya, Laurence T. Yang
Publisher: Springer
Pages: 3-10
Number of pages: 8
ISBN (Print): 9783030389604
DOI
Publication status: Published - 2020
Event: 19th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2019 - Melbourne, Australia
Duration: 9 Dec 2019 to 11 Dec 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11945 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 19th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2019
Country/Territory: Australia
City: Melbourne
Period: 9/12/19 to 11/12/19
