TY - GEN
T1 - ESC
T2 - 54th International Conference on Parallel Processing, ICPP 2025
AU - Wang, Xuezhu
AU - Yang, Hailong
AU - You, Xin
AU - Xu, Yufan
AU - Liu, Xiaoyan
AU - Wang, Siqi
AU - Zhang, Kaige
AU - Li, Mingzhen
AU - Luan, Zhongzhi
AU - Liu, Yi
AU - Qian, Depei
N1 - Publisher Copyright: © 2025 Copyright held by the owner/author(s).
PY - 2025/12/20
Y1 - 2025/12/20
N2 - Submanifold convolution is an effective method to process 3D point cloud data, playing a significant role in fields such as robotics, autonomous driving, and AR/VR. However, due to the high sparsity and irregularity of point cloud data, it is challenging to accelerate submanifold convolution on modern GPUs, especially using tensor cores. Previous works have proposed implicit GEMM methods to accelerate submanifold convolution on GPU. However, the performance of such methods is limited by massive redundant computation and suboptimal parameter configurations. In this paper, we propose ESC, a new method to leverage GPU tensor cores for accelerating submanifold convolution with improved performance. Firstly, we propose an online similarity-aware reordering method to increase the point cloud data locality and yield more opportunities for eliminating redundancy. Secondly, we propose TC-aware redundancy elimination to reduce the redundant computation at the fine TC-tile granularity. Moreover, we propose an adaptive configuration selector to select the optimal configuration based on offline profiling results and online input data. Experimental results demonstrate that ESC outperforms the state-of-the-art works on representative datasets.
KW - Sparse Computation
KW - Submanifold Convolution
KW - Tensor Core
UR - https://www.scopus.com/pages/publications/105026443425
U2 - 10.1145/3754598.3754633
DO - 10.1145/3754598.3754633
M3 - Conference contribution
AN - SCOPUS:105026443425
T3 - 54th International Conference on Parallel Processing, ICPP 2025 - Main Conference Proceedings
SP - 575
EP - 585
BT - 54th International Conference on Parallel Processing, ICPP 2025 - Main Conference Proceedings
PB - Association for Computing Machinery, Inc
Y2 - 8 September 2025 through 11 September 2025
ER -