PCGC: a performance compact graph compiler based on multilevel fusion-splitting rules

  • Dong Dong*
  • Hongxu Jiang
  • Hanqun Lin
  • Yanfei Song

*Corresponding author for this work

Research output: Contribution to journal > Article > peer-review

Abstract

Existing deep learning compilers are unable to perform efficient, hardware-performance-aware graph fusion when both time and power consumption are considered. In addition, these compilers optimize the computational graph of deep neural networks (DNNs) by performing static graph transformation based on a greedy algorithm, considering only runtime performance and ignoring the cost of the tuning process. To solve these problems, this paper proposes a DNN computational graph optimization compiler (PCGC). Using performance feedback at runtime, PCGC designs a computational graph fusion and splitting optimization strategy based on multilevel operator-layer fusion-splitting rules. First, PCGC uses a rule-guided graph segmentation algorithm to recursively segment the computational graph into smaller subgraphs to achieve an efficient and detailed search. Then, PCGC uses a cost model that receives feedback from hardware performance information, combines this cost model with operator fusion rules to jointly guide the partial fusion and partitioning of the nodes and edges of the computational graph, and flexibly generates optimal subgraphs for different hardware to optimize the search space for partial fusion. Finally, we make the cost model converge quickly to the loss value we set by manually adjusting the parameters. Compared with other advanced compilers, PCGC optimizes the overall power consumption on an embedded GPU by an average of 130.5% when the time consumption on each hardware is not lower than the average time consumption. On domain-specific architecture, PCGC optimizes power consumption by an average of 66.5%. On FPGA, PCGC optimizes power consumption by 66.1%. In a sense, PCGC can achieve high-speed inference in specific power supply scenarios, reducing the carbon emissions of edge computing.
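The two steps named in the abstract, rule-guided recursive segmentation followed by fusion guided by rules and a cost model, can be illustrated with a toy sketch. This is not PCGC's implementation: the graph is simplified to a linear chain of operators, the rule table `FUSION_RULES`, the `Op` fields, the weights in `toy_cost`, and all function names are assumptions made for illustration, and the fusion pass is a plain left-to-right scan rather than the search the paper describes.

```python
from dataclasses import dataclass

# Hypothetical rule table: (producer kind, consumer kind) pairs allowed to fuse.
FUSION_RULES = {("conv", "bn"), ("bn", "relu"), ("conv", "relu"), ("matmul", "add")}


@dataclass(frozen=True)
class Op:
    name: str
    kind: str
    time_ms: float  # assumed measured latency of the operator
    power_w: float  # assumed measured average power of the operator


def toy_cost(group, alpha=0.5):
    """Toy cost of running `group` as one fused kernel (all constants are made up)."""
    n = len(group)
    time = sum(op.time_ms for op in group) - 0.1 * (n - 1)  # saved launch overhead
    power = max(op.power_w for op in group) * (1 + 0.05 * (n - 1))  # fused kernels draw more
    return alpha * time + (1 - alpha) * power


def split_point(ops):
    """Pick the cut nearest the middle that does not separate a fusable pair."""
    mid = len(ops) // 2
    cuts = [i for i in range(1, len(ops))
            if (ops[i - 1].kind, ops[i].kind) not in FUSION_RULES]
    return min(cuts, key=lambda i: abs(i - mid)) if cuts else mid


def segment(ops, max_size=4):
    """Recursively split the operator chain into segments of at most `max_size` ops."""
    if len(ops) <= max_size:
        return [ops]
    cut = split_point(ops)
    return segment(ops[:cut], max_size) + segment(ops[cut:], max_size)


def fuse(ops, cost=toy_cost):
    """Fuse adjacent ops when a rule allows it and the cost model predicts a gain."""
    groups = [[ops[0]]]
    for op in ops[1:]:
        last = groups[-1]
        allowed = (last[-1].kind, op.kind) in FUSION_RULES
        if allowed and cost(last + [op]) < cost(last) + cost([op]):
            last.append(op)
        else:
            groups.append([op])
    return groups


def optimize(ops, max_size=4, cost=toy_cost):
    """Segment first, then fuse inside each segment independently."""
    return [g for seg in segment(ops, max_size) for g in fuse(seg, cost)]


if __name__ == "__main__":
    kinds = ["conv", "bn", "relu", "pool", "conv", "relu", "matmul", "add"]
    chain = [Op(f"{k}{i}", k, 1.0, 2.0) for i, k in enumerate(kinds)]
    for group in optimize(chain):
        print([op.name for op in group])
```

Segmenting before fusing keeps each fusion decision local to a small subgraph, which is the property the abstract credits for a more efficient search. Swapping `toy_cost` for a function fed by measured time and power would be the place where hardware feedback enters in this sketch.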

Original language: English
Pages (from-to): 17419-17444
Number of pages: 26
Journal: Journal of Supercomputing
Volume: 79
Issue number: 15
State: Published - Oct 2023

Keywords

  • DNN graph compilers
  • Edge computing
  • Multilevel fusion rules
  • Partial dynamic tuning
  • Subgraph splitting
