Abstract
While tensor-accelerated compilers have proven effective in deploying deep neural networks (DNNs) on general-purpose hardware, optimizing for FPGAs remains challenging due to complex DNN architectures and heterogeneous, semi-open compute units. This paper introduces the Automatic Kernel Generation for DNN on CPU-FPGA (AKGF) framework for efficient deployment of DNNs on heterogeneous CPU-FPGA platforms. AKGF generates an intermediate representation (IR) of the DNN using TVM’s Halide IR, annotates the operators of the model’s layers in the IR so that each is computed on the appropriate hardware core, and further optimizes the operator code for the CPU and FPGA using ARM’s function library and the polyhedral model, improving both inference speed and power consumption. Experiments conducted on a CPU-FPGA board validate the effectiveness of AKGF, demonstrating significant acceleration ratios (up to 6.7x) compared with state-of-the-art accelerators while achieving a 2x power optimization. AKGF effectively leverages the computational capabilities of both the CPU and the FPGA for high-performance deployment of DNNs on CPU-FPGA platforms.
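The annotation step described above can be illustrated with a minimal sketch. This is not AKGF's actual implementation (the paper's code is not reproduced here); it is a hypothetical pass over a toy operator graph that tags each operator with the device expected to execute it, assuming that compute-heavy tensor operators are offloaded to the FPGA fabric while the remaining operators stay on the ARM CPU.

```python
# Hypothetical illustration of AKGF-style operator annotation: walk a toy
# IR graph and tag every operator with a target device. The choice of
# which operators go to the FPGA is an assumption for this sketch, not
# taken from the paper.

# Operators assumed to benefit from FPGA offload.
FPGA_OPS = {"conv2d", "dense", "matmul"}

def annotate(ir_graph):
    """Return a copy of the graph with a 'device' tag on every operator."""
    return [
        {**op, "device": "fpga" if op["op"] in FPGA_OPS else "cpu"}
        for op in ir_graph
    ]

# A toy four-operator graph standing in for a DNN layer sequence.
graph = [
    {"op": "conv2d",  "name": "conv1"},
    {"op": "relu",    "name": "act1"},
    {"op": "dense",   "name": "fc1"},
    {"op": "softmax", "name": "out"},
]

annotated = annotate(graph)
# conv1 and fc1 are tagged "fpga"; act1 and out are tagged "cpu".
```

In a real TVM-based flow, this per-operator placement would instead be expressed on the IR itself (for example via device annotations on Relay expressions) before code generation for each backend.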
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 1619-1627 |
| Number of pages | 9 |
| Journal | Computer Journal |
| Volume | 67 |
| Issue number | 5 |
| DOIs | |
| State | Published - 1 May 2024 |
Keywords
- CPU-FPGA
- DNN accelerated compilers
- heterogeneous computing
- polyhedral model