Accelerating Sparse Cholesky Factorization on Sunway Manycore Architecture

Research output: Contribution to journal › Article › peer-review

Abstract

To improve the performance of sparse Cholesky factorization, existing research groups adjacent columns of the sparse matrix that share the same nonzero pattern into supernodes for parallelization. However, because sparse matrices vary widely in structure, the amount of computation in the generated supernodes varies significantly, which makes them hard to optimize when they are computed by dense matrix kernels. Therefore, how to efficiently map sparse Cholesky factorization to emerging architectures, such as the Sunway many-core processor, remains an active research direction. In this article, we propose swCholesky, a highly optimized implementation of sparse Cholesky factorization on the Sunway processor. Specifically, we design three kernel task queues and a dense matrix library to dynamically adapt to the kernel characteristics and architecture features. In addition, we propose an auto-tuning mechanism to search for the optimal settings of the important parameters in swCholesky. Our experiments show that swCholesky achieves better performance than state-of-the-art implementations.
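
As an illustration of the supernode grouping mentioned in the abstract, the following is a minimal sketch, not code from swCholesky. It assumes the common convention that column j+1 of the lower-triangular factor joins the supernode of column j when its nonzero pattern equals that of column j with row j removed. The function name `find_supernodes` and the list-of-row-indices input layout are hypothetical choices made for this example.

```python
def find_supernodes(col_patterns):
    """Group adjacent columns of a lower-triangular factor into supernodes.

    col_patterns[j] is the sorted list of row indices of the nonzeros in
    column j, diagonal included (an assumed input layout for this sketch).
    Returns a list of (first_column, last_column) pairs, both inclusive.
    """
    supernodes = []
    start = 0
    n = len(col_patterns)
    for j in range(1, n):
        # Pattern of the previous column with its own diagonal row dropped.
        prev_below = [r for r in col_patterns[j - 1] if r != j - 1]
        if col_patterns[j] != prev_below:
            # Patterns differ, so column j starts a new supernode.
            supernodes.append((start, j - 1))
            start = j
    if n > 0:
        supernodes.append((start, n - 1))
    return supernodes


# Small hand-made pattern: columns 0-2 share one structure, columns 3-4 another.
patterns = [
    [0, 1, 2, 5],
    [1, 2, 5],
    [2, 5],
    [3, 4],
    [4],
    [5],
]
supernodes = find_supernodes(patterns)
print(supernodes)
```

The resulting supernodes differ in width and height, which is the kind of variation in per-supernode work that the abstract describes as hard to optimize with dense kernels.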

Original language: English
Article number: 8903486
Pages (from-to): 1636-1650
Number of pages: 15
Journal: IEEE Transactions on Parallel and Distributed Systems
Volume: 31
Issue number: 7
State: Published - 1 Jul 2020

Keywords

  • Sparse Cholesky factorization
  • Sunway architecture
  • performance optimization
