Enhancing Prompt Tuning for Smaller Pretrained Models via Knowledge Distillation

  • Mengyang Yuan
  • Bo Lang*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Prompt tuning, as a parameter-efficient fine-tuning method, plays a crucial role in the fine-tuning of pre-trained models. However, due to the limited expressive power of smaller pre-trained models, the performance of prompt tuning on these models often falls short of that achieved on larger pre-trained models. To resolve this issue, we propose a knowledge distillation approach that leverages the knowledge of a larger teacher model to enhance the performance of prompt tuning on smaller models. Through analysis and experiments, we first determine that logit-based distillation is better suited to prompt tuning than feature-based distillation. Building on the commonly used inter-class relationship distillation, we then design and add a new loss function that enables the student model to learn inter-instance relationships from the teacher model. This expands the information utilized from the teacher model and thereby further improves the distillation effect. Experimental results on multiple tasks in the SuperGLUE benchmark show that our method significantly enhances the prompt tuning performance of smaller models, in some tasks matching or surpassing the results of the larger teacher models. Additionally, our method does not alter the structure of the student model, so the fine-tuned model retains all the advantages of prompt tuning during inference.
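The record does not spell out the loss formulation, but the abstract's combination of inter-class (logit) distillation with an added inter-instance relationship term can be illustrated with a minimal PyTorch sketch. Everything below is an assumption for illustration only: the function name distillation_loss, the temperature, the weights alpha and beta, and the use of a cosine-similarity matrix for the inter-instance term are not taken from the paper.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits,
                      temperature=2.0, alpha=0.5, beta=0.5):
    """Illustrative combined distillation loss (not the paper's exact method).

    Inter-class term: KL divergence between temperature-softened teacher
    and student class distributions for each instance.
    Inter-instance term: match the pairwise similarity structure among
    instances in the batch, computed from the (normalized) logits.
    """
    t = temperature

    # Inter-class relationship: standard logit distillation.
    kd = F.kl_div(
        F.log_softmax(student_logits / t, dim=-1),
        F.softmax(teacher_logits / t, dim=-1),
        reduction="batchmean",
    ) * (t * t)

    # Inter-instance relationship (assumed form): compare the
    # batch-level cosine-similarity matrices of student and teacher.
    s_norm = F.normalize(student_logits, dim=-1)
    t_norm = F.normalize(teacher_logits, dim=-1)
    inst = F.mse_loss(s_norm @ s_norm.T, t_norm @ t_norm.T)

    return alpha * kd + beta * inst
```

In such a setup only the soft prompt parameters of the student would be updated by this loss, which is consistent with the abstract's claim that the student's structure, and hence the inference-time benefits of prompt tuning, remain unchanged.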

Original language: English
Title of host publication: Neural Information Processing - 31st International Conference, ICONIP 2024, Proceedings
Editors: Mufti Mahmud, Maryam Doborjeh, Kevin Wong, Andrew Chi Sing Leung, Zohreh Doborjeh, M. Tanveer
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 164-178
Number of pages: 15
ISBN (Print): 9789819670291
DOIs
State: Published - 2025
Event: 31st International Conference on Neural Information Processing, ICONIP 2024 - Auckland, New Zealand
Duration: 2 Dec 2024 - 6 Dec 2024

Publication series

Name: Communications in Computer and Information Science
Volume: 2295 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 31st International Conference on Neural Information Processing, ICONIP 2024
Country/Territory: New Zealand
City: Auckland
Period: 2/12/24 - 6/12/24

Keywords

  • Knowledge distillation
  • Parameter-efficient fine-tuning
  • Prompt tuning
