Automorphism Ensemble Decoding on GPU: Achieving High Throughput and Low Latency for Polar and RM Codes

  • Yansong Li
  • Kairui Tian
  • Rongke Liu*
  • *Corresponding author for this work
  • Beihang University

Research output: Contribution to journal › Article › peer-review

Abstract

Automorphism ensemble decoding (AED) is a highly parallel approach that enables decoding of polar and Reed-Muller (RM) codes with automorphisms, offering a practical solution with near-maximum likelihood (ML) performance and manageable computational complexity. To meet the growing demands for high throughput and low latency in cloud and virtual random access networks, this paper presents a graphics processing unit (GPU)-based AED architecture for polar and RM codes, utilizing low-complexity successive cancellation (SC) and small list SC (SCL) decoders as the constituent of AED. The proposed architecture exploits the inherent parallelism of AED to optimize decoding tasks on the GPU, significantly enhancing throughput by efficiently harnessing the massive parallel processing capabilities of the GPU. Additionally, improved thread mapping and data management techniques substantially reduce latency for automorphism ensemble SC (Aut-SC) decoding, while a low-latency sorting mechanism further accelerates automorphism ensemble SCL (Aut-SCL) decoding. Experimental results on an NVIDIA RTX 4090 demonstrate that the proposed Aut-SC decoder, with an ensemble size of 8, achieves a throughput exceeding 17 Gbps under highly parallelized batch processing. Compared to the state-of-the-art software-based SCL decoders, the proposed GPU-based Aut-SC and Aut-SCL architectures outperform existing solutions by factors of up to 28x and 10x, respectively, in normalized throughput while maintaining the same or even superior error correction performance.

Original language: English
Pages (from-to): 2227-2242
Number of pages: 16
Journal: IEEE Transactions on Signal Processing
Volume: 73
State: Published - 2025

Keywords

  • GPU implementation
  • Polar codes
  • RM codes
  • automorphism ensemble decoding
  • successive cancellation decoding
  • successive cancellation list decoding
