PRoof: A Comprehensive Hierarchical Profiling Framework for Deep Neural Networks with Roofline Analysis

  • Beihang University
  • SenseTime Group Limited

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The increasing diversity of deep neural network (DNN) models and hardware platforms necessitates effective model profiling for high-performance inference deployment. Current DNN profiling tools suffer from either limited optimization insights, due to the missing correlation between high-level DNN layer design and low-level hardware performance metrics, or prohibitive profiling overhead, due to the large number of performance measurements taken through hardware performance counters. Meanwhile, the roofline model has been widely used in the high-performance computing (HPC) domain for identifying performance bottlenecks and guiding optimizations. However, it lacks hierarchical (e.g., kernel/operator/layer), fine-grained, multi-platform support for profiling DNN models. To overcome the above limitations, we propose PRoof, a versatile DNN profiling framework that can effectively attribute the hardware performance metrics back to the model design. In addition, PRoof does not require massive hardware profiling and thus mitigates the large profiling overhead. Specifically, our approach correlates the profiled result of each layer to its conceptual layer design by effectively handling layer fusion. Our approach also provides an analytical model to predict the floating-point operations (FLOP) and memory accesses of DNN models without massive profiling. We demonstrate the effectiveness of PRoof with representative DNN models across a wide range of hardware platforms. Derived from PRoof's profiling results, we obtain several insights to provide useful guidance for model design and hardware tuning.
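
As a rough illustration of the roofline reasoning the abstract refers to, the sketch below estimates the FLOP count and memory traffic of a single convolution layer analytically, then places it on a roofline. It is not PRoof's analytical model or API: the `Conv2d` class, the function names, and the hardware numbers are all hypothetical, and the byte count assumes an idealized case where each tensor is moved to or from memory exactly once.

```python
from dataclasses import dataclass


@dataclass
class Conv2d:
    """Hypothetical layer description (illustrative only, not a PRoof type)."""
    c_in: int
    c_out: int
    k: int            # square kernel size
    h_in: int
    w_in: int
    h_out: int
    w_out: int
    bytes_per_elem: int = 4  # assumption: FP32 tensors


def conv_flops(layer: Conv2d) -> int:
    # One multiply-accumulate counts as 2 FLOPs.
    macs = layer.c_in * layer.k * layer.k * layer.c_out * layer.h_out * layer.w_out
    return 2 * macs


def conv_bytes(layer: Conv2d) -> int:
    # Idealized traffic: input, weights, and output each touched once.
    inputs = layer.c_in * layer.h_in * layer.w_in
    weights = layer.c_out * layer.c_in * layer.k * layer.k
    outputs = layer.c_out * layer.h_out * layer.w_out
    return (inputs + weights + outputs) * layer.bytes_per_elem


def attainable_flops(intensity: float, peak_flops: float, peak_bw: float) -> float:
    # Roofline: performance is capped by compute or by bandwidth * intensity.
    return min(peak_flops, intensity * peak_bw)


def classify(intensity: float, peak_flops: float, peak_bw: float) -> str:
    # The ridge point is the intensity at which the two ceilings meet.
    ridge = peak_flops / peak_bw
    return "compute-bound" if intensity >= ridge else "memory-bound"


if __name__ == "__main__":
    layer = Conv2d(c_in=64, c_out=64, k=3, h_in=56, w_in=56, h_out=56, w_out=56)
    # Placeholder hardware ceilings, not measurements of any real device.
    peak_flops, peak_bw = 10e12, 500e9
    intensity = conv_flops(layer) / conv_bytes(layer)  # FLOP per byte
    print(f"intensity: {intensity:.1f} FLOP/byte")
    print(f"attainable: {attainable_flops(intensity, peak_flops, peak_bw):.3e} FLOP/s")
    print(classify(intensity, peak_flops, peak_bw))
```

Comparing a layer's arithmetic intensity against the ridge point is what tells a designer whether to look at compute throughput or at memory traffic first. Real measurements will generally show more traffic than this idealized count, for example when caches are too small to hold the tensors.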

Original language: English
Title of host publication: 53rd International Conference on Parallel Processing, ICPP 2024 - Main Conference Proceedings
Publisher: Association for Computing Machinery
Pages: 822-832
Number of pages: 11
ISBN (electronic): 9798400708428
DOI
Publication status: Published - 12 Aug 2024
Event: 53rd International Conference on Parallel Processing, ICPP 2024 - Gotland, Sweden
Duration: 12 Aug 2024 to 15 Aug 2024

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 53rd International Conference on Parallel Processing, ICPP 2024
Country/Territory: Sweden
City: Gotland
Period: 12/08/24 to 15/08/24
