
CIM2PQ: An Arraywise and Hardware-Friendly Mixed Precision Quantization Method for Analog Computing-In-Memory

  • Beihang University

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Computing-in-memory (CIM) architecture is a promising convolutional neural network (CNN) accelerator known for its highly efficient matrix-vector multiplications (MVMs). However, due to the low-precision computation and limited size of CIM memory arrays, it is necessary to decompose the huge MVMs into smaller subsets. Conventional NN quantization methods overlook the characteristics of CIM hardware, resulting in diminished system performance and efficiency. This article proposes CIM2PQ, a mixed precision quantization (MPQ) method for CIM-based accelerators that is based on an evolutionary algorithm and takes the hardware characteristics of CIM into account. It automatically generates quantization strategies for NN models to improve the efficiency of CIM systems. First, inspired by the CIM computing paradigm, an arraywise quantization granularity is introduced in the MPQ search space, which can jointly quantize the inputs, weights, and partial sums. Second, a production procedure containing fine-grained crossover and progressive adaptive mutation is proposed, which can efficiently explore the search space and speed up the search process. Third, we propose a fast and efficient strategy evaluation method to obtain the performance of a quantization strategy on the CIM platform, saving evaluation time significantly without requiring fine-tuning. Finally, to protect CIM-friendly strategies with lower bit-widths but worse algorithmic performance, we propose a strategy selection method based on multiobjective optimization, named qNSGA-III. The effectiveness of the proposed method has been demonstrated through experimental results on various NNs and datasets. For ResNet-18, the hardware efficiency and accuracy can be improved to 117% with 7.05%, 113% with 3.37%, and 119% with 5.78%, on CIFAR-10, CIFAR-100, and ImageNet, respectively, compared to the baseline MPQ method.
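The arraywise granularity described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `fake_quantize` and `arraywise_mvm`, the symmetric uniform quantizer, and the hand-picked bit-widths are all assumptions made for illustration. It only shows the idea of splitting a large MVM into array-sized tiles, where each tile carries its own bit-widths for inputs, weights, and partial sums. In the paper such bit-widths are found by an evolutionary search, which is not reproduced here.

```python
from typing import List, Sequence, Tuple


def fake_quantize(values: Sequence[float], bits: int) -> List[float]:
    """Symmetric uniform quantization to signed `bits`-bit levels, then dequantize.

    Assumption: a simple max-abs scale; real CIM quantizers may differ.
    """
    if bits < 2:
        raise ValueError("bits must be at least 2 for a signed quantizer")
    max_abs = max((abs(v) for v in values), default=0.0)
    if max_abs == 0.0:
        return [0.0 for _ in values]
    levels = 2 ** (bits - 1) - 1
    scale = max_abs / levels
    return [round(v / scale) * scale for v in values]


def arraywise_mvm(
    weights: Sequence[Sequence[float]],      # shape: [n_out][n_in]
    x: Sequence[float],                      # shape: [n_in]
    array_rows: int,                         # input rows one (hypothetical) array holds
    strategy: Sequence[Tuple[int, int, int]],  # per array: (w_bits, in_bits, psum_bits)
) -> List[float]:
    """Compute y = W x tile by tile, quantizing each tile with its own bit-widths."""
    n_out, n_in = len(weights), len(x)
    starts = list(range(0, n_in, array_rows))
    if len(strategy) != len(starts):
        raise ValueError("need one (w_bits, in_bits, psum_bits) entry per array")

    y = [0.0] * n_out
    for (w_bits, in_bits, psum_bits), start in zip(strategy, starts):
        stop = min(start + array_rows, n_in)
        width = stop - start

        # Quantize this tile's inputs and weights with the tile's own bit-widths.
        x_q = fake_quantize(x[start:stop], in_bits)
        flat_w = [weights[o][i] for o in range(n_out) for i in range(start, stop)]
        w_q = fake_quantize(flat_w, w_bits)

        # Partial sum produced by this tile, then quantized before accumulation.
        psum = [
            sum(w_q[o * width + k] * x_q[k] for k in range(width))
            for o in range(n_out)
        ]
        psum_q = fake_quantize(psum, psum_bits)
        y = [acc + p for acc, p in zip(y, psum_q)]
    return y


if __name__ == "__main__":
    W = [[1.0, -1.0, 0.5, 0.25], [0.5, 0.5, -1.0, 1.0]]
    x = [1.0, 0.5, -1.0, 0.25]
    # Two arrays of two input rows each, with different (hypothetical) bit-widths.
    print(arraywise_mvm(W, x, array_rows=2, strategy=[(8, 8, 8), (4, 4, 6)]))
```

Lowering the bit-widths of one tile changes only that tile's contribution to the output, which is what makes a per-array search space possible in this kind of scheme.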

Original language: English
Pages (from-to): 2084-2097
Number of pages: 14
Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Volume: 43
Issue number: 7
DOI
Publication status: Published - 1 Jul 2024
