Adapting combined tiling to stencil optimizations on Sunway processor

  • Beihang University
  • Science and Technology on Special System Simulation Laboratory

Research output: Contribution to journal › Article › peer-review

Abstract

Stencil computation is an indispensable pattern in scientific applications and a long-standing optimization target in high performance computing (HPC). The Sunway processor used in the Sunway TaihuLight supercomputer has demonstrated its performance potential with a unique heterogeneous many-core architecture. Although many optimization methods have been proposed, the memory-bound nature of stencil computation and the limited bandwidth of the Sunway processor make it challenging to run stencil computation efficiently on this processor. To make better use of the processor's computation capability, we propose a combined tiling optimization for stencil computation tailored to its architectural features. In addition, we implement double buffering, vectorization, and register communication to further accelerate stencil computation on the Sunway processor. We evaluate our method on six stencil benchmarks with different orders and shapes (and thus different memory access patterns and computation intensities). The experimental results show that our implementation achieves a 1.97× speedup on average over the state-of-the-art stencil implementation on Sunway.
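
For readers unfamiliar with tiling, the general idea can be sketched in plain C. The abstract does not describe the paper's combined tiling scheme, so the code below is only a generic spatial tiling of a 2D 5-point Jacobi sweep on an ordinary CPU, not the authors' method, and it uses no Sunway-specific features (no local memory, DMA, or register communication). The function names `stencil_naive` and `stencil_tiled`, the `tile` parameter, and the 0.2 weight are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Reference version: one 5-point Jacobi sweep over the interior of an
 * n x n row-major grid. Boundary cells of `out` are left untouched. */
void stencil_naive(const double *in, double *out, size_t n) {
    for (size_t i = 1; i + 1 < n; ++i) {
        for (size_t j = 1; j + 1 < n; ++j) {
            out[i * n + j] = 0.2 * (in[i * n + j]
                                  + in[(i - 1) * n + j] + in[(i + 1) * n + j]
                                  + in[i * n + j - 1] + in[i * n + j + 1]);
        }
    }
}

/* Tiled version (illustrative only): the interior is visited in
 * tile x tile blocks so that each block's working set can stay in a
 * small fast memory while it is being updated. */
void stencil_tiled(const double *in, double *out, size_t n, size_t tile) {
    assert(tile > 0);
    for (size_t ii = 1; ii + 1 < n; ii += tile) {
        /* Clamp the block to the last interior row. */
        size_t i_end = (ii + tile < n - 1) ? ii + tile : n - 1;
        for (size_t jj = 1; jj + 1 < n; jj += tile) {
            /* Clamp the block to the last interior column. */
            size_t j_end = (jj + tile < n - 1) ? jj + tile : n - 1;
            for (size_t i = ii; i < i_end; ++i) {
                for (size_t j = jj; j < j_end; ++j) {
                    out[i * n + j] = 0.2 * (in[i * n + j]
                                          + in[(i - 1) * n + j] + in[(i + 1) * n + j]
                                          + in[i * n + j - 1] + in[i * n + j + 1]);
                }
            }
        }
    }
}
```

Both functions compute the same values; only the traversal order differs. A suitable `tile` would normally be chosen from the capacity of the fast memory level being targeted, which this sketch leaves to the caller.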

Original language: English
Pages (from-to): 322-333
Number of pages: 12
Journal: CCF Transactions on High Performance Computing
Volume: 5
Issue number: 3
State: Published - Sep 2023

Keywords

  • Combined tiling
  • Performance optimization
  • Stencil computation
  • Sunway processor
