TY - GEN
T1 - CRISP
T2 - 8th IEEE International Test Conference in Asia, ITC-Asia 2024
AU - Zhang, Shangtong
AU - Wang, Xueyan
AU - Zhao, Weisheng
AU - Jin, Yier
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Triangle counting is a fundamental problem in graph analysis, which usually needs to traverse the graph and perform set-intersections of neighbor sets. However, existing approaches suffer from heavy off-chip memory access and set-intersection overhead, which are both memory-bound and computation-bound. Fortunately, the emerging 3D-stacked computation-in-memory (CIM) architecture can reduce off-chip memory access, and the content addressable memory (CAM) can achieve parallel comparison. However, existing solutions have not effectively combined the high bandwidth of 3D-stacked memory with the high computational capabilities of CAM arrays. Besides, there exist many fruitless searches in the triangle counting process. Thus, we propose CRISP, a software-hardware co-design architecture to address these issues. At the level of software design, a new storage format named Two-Pointer CSR is proposed to eliminate fruitless searches during the set-intersection process. At the level of hardware design, CRISP integrates a novel Presence-Bits based Content Addressable Memory (PB-CAM) near the memory bank of 3D-stacked memory to fully exploit the high internal bandwidth. Through the presence bits comparison, the PB-CAM can effectively reduce both the off-chip memory access and set-intersection operations. Experimental results show that, compared with previous state-of-the-art near-DIMM and HBM-PIM triangle counting accelerators, CRISP achieves speedups of 5.7× and 1.8×, respectively.
AB - Triangle counting is a fundamental problem in graph analysis, which usually needs to traverse the graph and perform set-intersections of neighbor sets. However, existing approaches suffer from heavy off-chip memory access and set-intersection overhead, which are both memory-bound and computation-bound. Fortunately, the emerging 3D-stacked computation-in-memory (CIM) architecture can reduce off-chip memory access, and the content addressable memory (CAM) can achieve parallel comparison. However, existing solutions have not effectively combined the high bandwidth of 3D-stacked memory with the high computational capabilities of CAM arrays. Besides, there exist many fruitless searches in the triangle counting process. Thus, we propose CRISP, a software-hardware co-design architecture to address these issues. At the level of software design, a new storage format named Two-Pointer CSR is proposed to eliminate fruitless searches during the set-intersection process. At the level of hardware design, CRISP integrates a novel Presence-Bits based Content Addressable Memory (PB-CAM) near the memory bank of 3D-stacked memory to fully exploit the high internal bandwidth. Through the presence bits comparison, the PB-CAM can effectively reduce both the off-chip memory access and set-intersection operations. Experimental results show that, compared with previous state-of-the-art near-DIMM and HBM-PIM triangle counting accelerators, CRISP achieves speedups of 5.7× and 1.8×, respectively.
KW - content addressable memory
KW - triangle counting
KW - 3D-stacked memory
UR - https://www.scopus.com/pages/publications/85204779438
U2 - 10.1109/ITC-Asia62534.2024.10661308
DO - 10.1109/ITC-Asia62534.2024.10661308
M3 - Conference contribution
AN - SCOPUS:85204779438
T3 - Proceedings - ITC-Asia 2024: 8th IEEE International Test Conference in Asia
BT - Proceedings - ITC-Asia 2024
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 18 August 2024 through 20 August 2024
ER -