
LMIE-BERT: A Learnable Method for Inter-Layer Ensembles to Accelerate Inference of BERT-Style Pre-trained Models

  • Weikai Qi*
  • Xing Guo
  • Haohua Du
  • *Corresponding author for this work
  • School of Computer Science and Technology, Anhui University
  • USTC-DEQING Alpha Innovation Institute

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Pre-trained models have brought tremendous accuracy improvements to Natural Language Processing (NLP) and Computer Vision tasks, but their size makes inference slow, which hinders their deployment in production. Early-exit methods have been proposed to accelerate the inference of large pre-trained models. However, these methods lose control of accuracy at higher speed ratios. To better balance the trade-off between model speed and accuracy, we propose a novel early-exit mechanism called LMIE-BERT. It introduces a learnable inter-layer ensemble strategy in the internal classifier: the model is trained to fit the information from both the previous and current layers, which makes the early-exit results more robust. The experimental results demonstrate that LMIE-BERT maintains over 90% of the accuracy of the original model while achieving a 4× inference speed-up on multiple tasks. At the same speed ratio, our method achieves higher accuracy than other early-exit methods.
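The abstract does not give the form of the ensemble or the exit criterion, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a scalar sigmoid gate per layer that mixes the current layer's logits with the previous layer's mixed logits, and an entropy threshold as the exit rule, as is common in early-exit methods. The names `early_exit_predict`, `gate_params`, and `threshold` are hypothetical, and in a real system the gate values would be learned during training rather than passed in by hand.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (natural log) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def early_exit_predict(
    layer_logits: Sequence[Sequence[float]],
    gate_params: Sequence[float],
    threshold: float,
) -> Tuple[int, int]:
    """Return (predicted_class, exit_layer_index).

    layer_logits[i] holds the internal classifier's logits at layer i.
    gate_params[i] is an ASSUMED scalar gate parameter for layer i
    (hypothetical; the abstract does not specify the ensemble form).
    """
    if not layer_logits:
        raise ValueError("layer_logits must contain at least one layer")

    prev = None
    probs: List[float] = []
    for i, cur in enumerate(layer_logits):
        if prev is None:
            # First layer: nothing earlier to combine with.
            mixed = list(cur)
        else:
            # Assumed ensemble: gated mix of current and previous logits.
            g = sigmoid(gate_params[i])
            mixed = [g * c + (1.0 - g) * p for c, p in zip(cur, prev)]
        probs = softmax(mixed)
        # Assumed exit rule: stop once the prediction is confident enough.
        if entropy(probs) < threshold:
            return probs.index(max(probs)), i
        prev = mixed
    # No layer was confident enough: fall back to the last layer's output.
    return probs.index(max(probs)), len(layer_logits) - 1
```

Under these assumptions, raising `threshold` makes the model exit at earlier layers (faster, less accurate), while lowering it pushes more inputs through to the final layer.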

Original language: English
Title of host publication: Proceedings - 2023 9th International Conference on Big Data Computing and Communications, BigCom 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 271-277
Number of pages: 7
ISBN (Electronic): 9798350331240
State: Published - 2023
Event: 9th International Conference on Big Data Computing and Communications, BigCom 2023 - Hainan, China
Duration: 4 Aug 2023 to 6 Aug 2023

Publication series

Name: Proceedings - 2023 9th International Conference on Big Data Computing and Communications, BigCom 2023

Conference

Conference: 9th International Conference on Big Data Computing and Communications, BigCom 2023
Country/Territory: China
City: Hainan
Period: 4/08/23 to 6/08/23

Keywords

  • BERT
  • Early Exit
  • Model Compression
