DRStencil: Exploiting Data Reuse within Low-order Stencil on GPU

  • Beihang University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Stencil computation is widely adopted in scientific applications as one of the most significant computation patterns. Although various optimizations have been proposed to accelerate stencil computation, low-order stencils still suffer from limited performance on GPUs due to their low computational intensity. In this paper, we propose fusion-partition optimization techniques to accelerate low-order stencil computation and implement an effective code generation framework, DRStencil, that automatically generates optimized stencil codes with fusion-partition applied. Specifically, we adopt a four-stage optimization workflow consisting of time fusion, partition, forward computation, and backward computation. We also propose an auto-tuning method to determine the optimal parameter settings of the generated stencil codes. We evaluate DRStencil with representative low-order stencils on Nvidia P100, V100, and A100 GPUs. For widely used low-order stencils, DRStencil achieves average speedups of 1.46x, 1.59x, and 1.10x over state-of-the-art implementations on P100, V100, and A100 GPUs, respectively.
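
To make the time-fusion idea in the abstract concrete, the following is a minimal sketch in plain Python. It is not DRStencil's generated code and does not reproduce the paper's partition, forward, or backward stages. It assumes a generic 1D 3-point stencil with fixed boundaries, and the names `step`, `fuse`, and `fused_step` are hypothetical. It shows that two time steps of a 3-point stencil can be replaced, away from the boundary, by one pass of a 5-point stencil whose weights are the self-convolution of the original weights, so each input value is loaded once for two steps of work.

```python
# Illustrative sketch only: a generic 1D stencil, not the DRStencil framework.

def step(u, w):
    """Apply one time step of a 3-point stencil with weights
    w = (left, center, right). Boundary points are held fixed."""
    out = list(u)
    for i in range(1, len(u) - 1):
        out[i] = w[0] * u[i - 1] + w[1] * u[i] + w[2] * u[i + 1]
    return out


def fuse(w):
    """Return the weights of the stencil equivalent to applying w twice,
    computed as the discrete convolution of w with itself."""
    fused = [0.0] * (2 * len(w) - 1)
    for a, wa in enumerate(w):
        for b, wb in enumerate(w):
            fused[a + b] += wa * wb
    return fused


def fused_step(u, w2):
    """Apply a fused 5-point stencil once. Only points at least two cells
    from either boundary are updated; the rest are copied unchanged."""
    out = list(u)
    for i in range(2, len(u) - 2):
        out[i] = sum(w2[k] * u[i + k - 2] for k in range(5))
    return out
```

For points at least two cells from the boundary, `fused_step(u, fuse(w))` agrees with `step(step(u, w), w)` up to floating-point rounding. Points next to the boundary differ in this sketch because the fixed boundary values take part in the second unfused step, so a real implementation would treat them separately.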

Original language: English
Title of host publication: 2021 IEEE 23rd International Conference on High Performance Computing and Communications, 7th International Conference on Data Science and Systems, 19th International Conference on Smart City and 7th International Conference on Dependability in Sensor, Cloud and Big Data Systems and Applications, HPCC-DSS-SmartCity-DependSys 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 63-70
Number of pages: 8
ISBN (Electronic): 9781665494571
State: Published - 2022
Event: 23rd IEEE International Conference on High Performance Computing and Communications, 7th IEEE International Conference on Data Science and Systems, 19th IEEE International Conference on Smart City and 7th IEEE International Conference on Dependability in Sensor, Cloud and Big Data Systems and Applications, HPCC-DSS-SmartCity-DependSys 2021 - Haikou, Hainan, China
Duration: 20 Dec 2021 → 22 Dec 2021

Publication series

Name: 2021 IEEE 23rd International Conference on High Performance Computing and Communications, 7th International Conference on Data Science and Systems, 19th International Conference on Smart City and 7th International Conference on Dependability in Sensor, Cloud and Big Data Systems and Applications, HPCC-DSS-SmartCity-DependSys 2021

Conference

Conference: 23rd IEEE International Conference on High Performance Computing and Communications, 7th IEEE International Conference on Data Science and Systems, 19th IEEE International Conference on Smart City and 7th IEEE International Conference on Dependability in Sensor, Cloud and Big Data Systems and Applications, HPCC-DSS-SmartCity-DependSys 2021
Country/Territory: China
City: Haikou, Hainan
Period: 20/12/21 → 22/12/21

Keywords

  • GPU
  • Low-order Stencil
  • Performance Optimization
  • Semi-Stencil
  • Stencil Computation
  • Time Fusion
