TY - JOUR
T1 - All-Digital Computing-in-Memory Macro Supporting FP64-Based Fused Multiply-Add Operation
AU - Li, Dejian
AU - Mo, Kefan
AU - Liu, Liang
AU - Pan, Biao
AU - Li, Weili
AU - Kang, Wang
AU - Li, Lei
N1 - Publisher Copyright: © 2023 by the authors.
PY - 2023/4
Y1 - 2023/4
N2 - Frequent data movement between computing units and memory during floating-point arithmetic has become a major bottleneck in scientific computing. Computing-in-memory (CIM) is a computing paradigm that merges computing logic into memory, which can address the data movement problem with excellent power efficiency. However, previous CIM designs have not supported the double-precision floating-point format (FP64) because of its computing complexity. This paper presents a novel all-digital CIM macro, DCIM-FF, that performs the FP64-based fused multiply-add (FMA) operation for the first time. With 16 sub-CIM cells integrating digital multipliers to complete mantissa multiplication, DCIM-FF provides correctly rounded results for normalized/denormalized inputs in round-to-nearest-even mode and round-to-zero mode, respectively. To evaluate our design, we synthesized and tested the DCIM-FF macro in 55-nm CMOS technology. With a minimum power consumption of 0.12 mW and a maximum computing efficiency of 26.9 TOPS/W, we demonstrated that DCIM-FF can run the FP64-based FMA operation without error. Compared to related works, the proposed DCIM-FF macro shows significant power efficiency improvement and less area overhead based on CIM technology. This work paves the way for high-performance implementation of FP64-based matrix-vector multiplication (MVM), which is essential for hyperscale scientific computing.
AB - Frequent data movement between computing units and memory during floating-point arithmetic has become a major bottleneck in scientific computing. Computing-in-memory (CIM) is a computing paradigm that merges computing logic into memory, which can address the data movement problem with excellent power efficiency. However, previous CIM designs have not supported the double-precision floating-point format (FP64) because of its computing complexity. This paper presents a novel all-digital CIM macro, DCIM-FF, that performs the FP64-based fused multiply-add (FMA) operation for the first time. With 16 sub-CIM cells integrating digital multipliers to complete mantissa multiplication, DCIM-FF provides correctly rounded results for normalized/denormalized inputs in round-to-nearest-even mode and round-to-zero mode, respectively. To evaluate our design, we synthesized and tested the DCIM-FF macro in 55-nm CMOS technology. With a minimum power consumption of 0.12 mW and a maximum computing efficiency of 26.9 TOPS/W, we demonstrated that DCIM-FF can run the FP64-based FMA operation without error. Compared to related works, the proposed DCIM-FF macro shows significant power efficiency improvement and less area overhead based on CIM technology. This work paves the way for high-performance implementation of FP64-based matrix-vector multiplication (MVM), which is essential for hyperscale scientific computing.
KW - digital computing-in-memory
KW - floating-point arithmetic
KW - fused multiply-add
KW - matrix-vector multiplication
KW - scientific computing
UR - https://www.scopus.com/pages/publications/85152694727
U2 - 10.3390/app13074085
DO - 10.3390/app13074085
M3 - Article
AN - SCOPUS:85152694727
SN - 2076-3417
VL - 13
JO - Applied Sciences (Switzerland)
JF - Applied Sciences (Switzerland)
IS - 7
M1 - 4085
ER -