Abstract
Large pre-trained vision models such as the Vision Transformer (ViT) achieve excellent results on a wide range of visual tasks. However, their large parameter counts make them difficult to deploy in resource-constrained applications that require fast, real-time inference. Existing model compression methods compress a pre-trained ViT into a model of the same structure, but the compressed models fail to maximise performance on different tasks and still consume significant computational and storage resources when adapted to downstream tasks. Pre-trained ViT models learn rich knowledge from large datasets, and different tasks require different parts of that knowledge, so removing task-specific redundant knowledge from the pre-trained model is the key to compressing it while maximising performance on each task. Based on this idea, we propose a novel ViT compression method, AdaViT. First, we introduce a lightweight module that adds less than 2% extra parameters and adaptively tunes the model to each task for the best predictions; experiments confirm its effectiveness. Second, inspired by knowledge distillation, we propose a new module replacement compression method that compresses the ViT by gradually replacing the Transformer modules of the original model. Combining the adaptive modules with module replacement yields task-oriented adaptive ViT compression. We evaluate AdaViT on several visual classification tasks and compare it with other ViT compression methods, demonstrating the effectiveness of task-adaptive ViT compression.
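The two ideas in the abstract can be illustrated with a minimal sketch. The abstract does not give implementation details, so everything here is an assumption made for illustration: the adapter is modelled as a generic bottleneck (down-projection plus up-projection), the bottleneck width of 64 is hypothetical, the replacement is stochastic with a linearly increasing probability, and the blocks are toy functions rather than real Transformer layers. The function names `adapter_overhead`, `replacement_prob`, and `forward` are likewise hypothetical.

```python
import random


def adapter_overhead(hidden_dim, bottleneck_dim, num_layers, backbone_params):
    """Fraction of extra parameters from one bottleneck adapter per layer.

    Assumes a down-projection and an up-projection, each with a bias.
    This is a common adapter layout, not necessarily the AdaViT module.
    """
    down = hidden_dim * bottleneck_dim + bottleneck_dim
    up = bottleneck_dim * hidden_dim + hidden_dim
    return num_layers * (down + up) / backbone_params


def replacement_prob(step, total_steps, start=0.5):
    """Linearly ramp the replacement probability from `start` to 1.0."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return start + (1.0 - start) * frac


def forward(x, original_groups, compact_blocks, p, rng):
    """One forward pass in which each group of original blocks is
    swapped for its compact substitute with probability p."""
    for group, compact in zip(original_groups, compact_blocks):
        if rng.random() < p:
            x = compact(x)          # use the compact replacement
        else:
            for block in group:     # use the original blocks
                x = block(x)
    return x


# Toy stand-ins: each "block" is a plain function on a float.
original_groups = [[lambda v: v + 1, lambda v: v + 1] for _ in range(3)]
compact_blocks = [lambda v: v + 2 for _ in range(3)]

rng = random.Random(0)
full = forward(0.0, original_groups, compact_blocks, p=0.0, rng=rng)
compressed = forward(0.0, original_groups, compact_blocks, p=1.0, rng=rng)

# ViT-Base sizes (hidden size 768, 12 layers, roughly 86M parameters);
# the bottleneck width of 64 is an assumed value.
overhead = adapter_overhead(768, 64, 12, 86_000_000)
print(f"adapter overhead: {overhead:.2%}")
```

With `p=0.0` only the original blocks run, and with `p=1.0` only the compact ones run; intermediate values mix the two, which is how a gradual schedule could move a model from the original to the compressed structure. Under the assumed sizes the adapter overhead stays below the 2% figure quoted in the abstract.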
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 2024 10th International Conference on Big Data Computing and Communications, BIGCOM 2024 |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 118-125 |
| Number of pages | 8 |
| Edition | 2024 |
| ISBN (electronic) | 9798331509538 |
| DOI | |
| Publication status | Published - 2024 |
| Event | 10th International Conference on Big Data Computing and Communications, BIGCOM 2024 - Dalian, China. Duration: 9 Aug 2024 → 11 Aug 2024 |
Conference
| Conference | 10th International Conference on Big Data Computing and Communications, BIGCOM 2024 |
|---|---|
| Country/Territory | China |
| City | Dalian |
| Period | 9/08/24 → 11/08/24 |
Fingerprint
Dive into the research topics of 'AdaViT: Task Adaptive ViT Compression Based on Module Replacement'. Together they form a unique fingerprint.