M-DRL: Deep Reinforcement Learning Based Coflow Traffic Scheduler with MLFQ Threshold Adaption

  • Tianba Chen*
  • Wei Li
  • Yu Kang Sun
  • Yunchun Li
  • *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Coflow scheduling in data-parallel clusters can improve application-level communication performance. Existing coflow scheduling methods that operate without prior knowledge usually use a multi-level feedback queue (MLFQ) with fixed threshold parameters, which makes them insensitive to coflow traffic characteristics. Manually adjusting the threshold parameters for different application scenarios often requires a long optimization period and yields only coarse-grained tuning. We propose M-DRL, a deep reinforcement learning based coflow traffic scheduler that dynamically sets the MLFQ thresholds to adapt to coflow traffic characteristics and reduce the average coflow completion time. Trace-driven simulations on a public dataset show that coflow communication stages using M-DRL complete 2.08x (6.48x) and 1.36x (1.25x) faster in average (95th percentile) coflow completion time than per-flow fairness and Aalo, respectively, and that M-DRL is comparable to SEBF, which relies on prior knowledge.
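To illustrate why the threshold parameters matter, the following is a minimal sketch of how an MLFQ-style scheduler without prior knowledge might map a coflow to a priority queue from the bytes it has sent so far. It is not the M-DRL implementation: the function names, the exponentially spaced "fixed" thresholds, and the "adapted" values are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_right

MB = 2**20  # bytes per mebibyte


def exponential_thresholds(first, factor, num_queues):
    """Build num_queues - 1 fixed thresholds: first, first*factor, ... (illustrative scheme)."""
    return [first * factor**i for i in range(num_queues - 1)]


def queue_index(bytes_sent, thresholds):
    """Return the queue for a coflow (0 = highest priority).

    A coflow is demoted by one queue each time its sent bytes
    reach the next threshold in the sorted list.
    """
    return bisect_right(thresholds, bytes_sent)


# Hypothetical fixed configuration: 4 queues with thresholds 10 MB, 100 MB, 1000 MB.
fixed = exponential_thresholds(10 * MB, 10, 4)

# Hypothetical thresholds that an adaptive policy might choose for a
# workload dominated by small coflows (values are made up).
adapted = [1 * MB, 50 * MB, 500 * MB]

for sent in (5 * MB, 50 * MB, 2000 * MB):
    print(sent // MB, "MB ->", queue_index(sent, fixed), queue_index(sent, adapted))
```

With the fixed thresholds, a coflow that has sent 5 MB stays in the highest-priority queue, whereas the adapted thresholds have already demoted it by one level. Shifting the thresholds therefore changes which coflows compete at each priority, which is the knob a learned policy would tune.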

Original language: English
Pages (from-to): 646-657
Number of pages: 12
Journal: International Journal of Parallel Programming
Volume: 49
Issue number: 5
State: Published - Oct 2021

Keywords

  • Coflow
  • Datacenter network
  • Deep reinforcement learning
