
Hastening Stream Offloading of Inference via Multi-Exit DNNs in Mobile Edge Computing

  • Zhicheng Liu
  • Jinduo Song
  • Chao Qiu
  • Xiaofei Wang*
  • Xu Chen
  • Qiang He
  • Hao Sheng
  • *Corresponding author for this work
  • Tianjin University
  • Sun Yat-Sen University
  • Swinburne University of Technology

Research output: Contribution to journal, Article, peer-reviewed

Abstract

As the primary driver of intelligent mobile applications, deep neural networks (DNNs) have gradually been deployed to millions of mobile devices, producing massive numbers of latency-sensitive and computation-intensive tasks daily. Mobile edge computing facilitates the deployment of computing resources at the edge, which enables fine-grained offloading of DNN inference tasks from mobile devices to edge nodes. However, most existing studies have not systematically considered three crucial performance aspects: scheduling multiple streams of DNN inference tasks, leveraging multi-exit models to hasten task processing, and partitioning inference models for partial offloading. To this end, this paper proposes an adaptive inference framework in mobile edge computing, which can dynamically select the exit point and partition point for multiple inference task streams. We design a dynamic programming algorithm to obtain an efficient solution under the ideal condition that task arrival information is known. Further, we design a learning-based algorithm for online scheduling, whose training efficiency is improved through historical experience initialization and priority experience replay. Experimental results show that, compared with the Greedy algorithm, the online algorithm improves the performance on two environmental parameters by an average of 5.9% and 32%, respectively.
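
To make the idea of jointly choosing an exit point and a partition point concrete, the following is a minimal sketch for a single task. It is not the paper's dynamic programming or learning-based algorithm, and it ignores the scheduling of multiple streams. The `ExitProfile` class, the `select` function, the per-layer latencies, and the accuracy figures are all hypothetical and introduced here only for illustration. The sketch assumes a chain-structured model in which the first `p` layers run on the mobile device, the intermediate data is uploaded, and the remaining layers up to the chosen exit run on the edge node.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ExitProfile:
    """Hypothetical profile of one exit point of a multi-exit DNN."""
    accuracy: float                # assumed accuracy when inference stops at this exit
    layer_ms_device: List[float]   # assumed per-layer latency on the mobile device (ms)
    layer_ms_edge: List[float]     # assumed per-layer latency on the edge node (ms)
    transfer_ms: List[float]       # transfer_ms[p]: upload cost when p layers run locally


def latency(profile: ExitProfile, p: int) -> float:
    """End-to-end latency when the first p layers run on the device."""
    n = len(profile.layer_ms_device)
    device = sum(profile.layer_ms_device[:p])
    edge = sum(profile.layer_ms_edge[p:])
    # p == n means fully local inference, so nothing is uploaded.
    upload = profile.transfer_ms[p] if p < n else 0.0
    return device + upload + edge


def select(profiles: List[ExitProfile],
           min_accuracy: float) -> Optional[Tuple[int, int, float]]:
    """Exhaustively pick (exit index, partition point, latency) with the lowest
    latency among exits that meet the accuracy requirement."""
    best: Optional[Tuple[int, int, float]] = None
    for e, prof in enumerate(profiles):
        if prof.accuracy < min_accuracy:
            continue
        for p in range(len(prof.layer_ms_device) + 1):
            t = latency(prof, p)
            if best is None or t < best[2]:
                best = (e, p, t)
    return best


# Made-up numbers: an early exit after 2 layers and a later exit after 3 layers.
profiles = [
    ExitProfile(0.80, [4.0, 6.0], [1.0, 1.5], [10.0, 3.0]),
    ExitProfile(0.92, [4.0, 6.0, 8.0], [1.0, 1.5, 2.0], [10.0, 3.0, 2.0]),
]
print(select(profiles, min_accuracy=0.75))
print(select(profiles, min_accuracy=0.90))
```

With these invented numbers, a looser accuracy requirement lets the task leave at the earlier exit and finish sooner, while a stricter one forces the later exit. In both cases the partition point trades local computation against upload cost.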

Original language: English
Pages (from-to): 535-548
Number of pages: 14
Journal: IEEE Transactions on Mobile Computing
Volume: 23
Issue number: 1
Publication status: Published - 1 Jan 2024
