Abstract
Model quantization is a technique that optimizes neural network computation by converting weight parameters and activation values from floating-point numbers to low-bit integer or fixed-point representations, reducing storage and computation costs and improving inference efficiency. Common quantization methods, such as QAT and PTQ, optimize quantization parameters using training data to achieve the best performance. In practical applications, however, little or no data may be available for downstream model quantization due to restrictions such as privacy and security, so it is essential to study how to perform model quantization without data. This paper proposes a data-free quantization technique called DFFG, based on fast gradient iteration, which uses information learned by the full-precision model, such as batch-normalization (BN) statistics, to recover the distribution of the original training data. We propose, for the first time, using a momentum-assisted variant of the FGSM gradient iteration strategy to update the generated data. This approach enables quick perturbation of the optimized data while maintaining the diversity of the generated data by controlling gradient variability. We also propose using intermediate data generated during the iteration process as part of the data for subsequent model quantization, greatly improving the speed of data generation. We demonstrate the effectiveness of our proposed method through empirical evaluations. Our method generates data that not only ensures model quantization performance but also significantly surpasses other similar data generation techniques in terms of speed. Specifically, our approach is 10× faster than ZeroQ.
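The core idea in the abstract can be sketched in a few lines: start from random noise, measure how far the batch statistics are from the BN targets, and apply a momentum-assisted, FGSM-style sign update to the data. The sketch below is a minimal, hedged illustration of that scheme, not the authors' released code; all names and hyperparameters (`eps`, `beta`, `steps`, `snapshot_every`) are illustrative assumptions, and a toy quadratic BN-statistic loss stands in for a full network's BN-matching objective.

```python
import numpy as np

# Hedged sketch, not the authors' implementation: a momentum-assisted,
# FGSM-style sign update that perturbs a random batch toward stored
# BN statistics (target per-channel mean/variance). All names and
# hyperparameters here are illustrative assumptions.

def bn_stat_loss(x, bn_mean, bn_var):
    """Distance between the batch statistics of x and the BN targets."""
    return ((x.mean(axis=0) - bn_mean) ** 2).sum() + \
           ((x.var(axis=0) - bn_var) ** 2).sum()

def bn_stat_grad(x, bn_mean, bn_var):
    """Analytic gradient of bn_stat_loss with respect to x."""
    n = x.shape[0]
    m, v = x.mean(axis=0), x.var(axis=0)
    g_mean = 2.0 * (m - bn_mean) / n
    g_var = 4.0 * (v - bn_var) * (x - m) / n
    return g_mean + g_var

def generate(bn_mean, bn_var, n=64, d=8, steps=200, eps=0.01, beta=0.9,
             snapshot_every=50, seed=0):
    """Synthesize a batch whose statistics approach (bn_mean, bn_var)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, d))
    momentum = np.zeros_like(x)
    snapshots = []  # intermediate batches, reusable as calibration data
    for t in range(steps):
        g = bn_stat_grad(x, bn_mean, bn_var)
        # MI-FGSM-style accumulation: momentum over L1-normalized gradients
        momentum = beta * momentum + g / (np.abs(g).sum() + 1e-12)
        x = x - eps * np.sign(momentum)  # fast sign-based perturbation
        if (t + 1) % snapshot_every == 0:
            snapshots.append(x.copy())
    return x, snapshots
```

Collecting `snapshots` mirrors the abstract's point about reusing intermediate data from the iteration as part of the calibration set, which is what makes the generation step cheap relative to fully converged per-sample optimization.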
| Original language | English |
|---|---|
| Publication status | Published - 2023 |
| Event | 34th British Machine Vision Conference, BMVC 2023 - Aberdeen, United Kingdom. Duration: 20 Nov 2023 → 24 Nov 2023 |
Conference
| Conference | 34th British Machine Vision Conference, BMVC 2023 |
|---|---|
| Country/Territory | United Kingdom |
| City | Aberdeen |
| Period | 20/11/23 → 24/11/23 |
Fingerprint
Dive into the research topics of 'DFFG: Fast Gradient Iteration for Data-free Quantization'. Together they form a unique fingerprint.