TY - JOUR
T1 - Selecting text classification model through maximizing posterior evidence over informative sub-space
AU - Sun, Zhiwei
AU - Bai, Jun
AU - Chen, Zhuofan
AU - Li, Chen
AU - Rong, Wenge
AU - Xiong, Zhang
N1 - Publisher Copyright: © Higher Education Press 2025.
PY - 2025/12
Y1 - 2025/12
N2 - Text classification is a pivotal task in natural language understanding, and its performance has advanced remarkably with the rise of Pre-trained Language Models (PLMs). The recent proliferation of PLMs has made it increasingly challenging to choose the most suitable model for a given dataset. Since fine-tuning every candidate model is impractical, Transferability Estimation (TE) has become a promising solution for efficient model selection. Unlike current TE methods, which rely solely on fixed, hard class assignments to evaluate the quality of model-encoded features, our approach also takes into account the inter-sample and inter-model variations represented by soft class assignments. We achieve this by using class embeddings to predict posterior class assignments, with the logarithm of the maximum posterior evidence serving as the transferability score. Moreover, we find that the informative sub-space of the dataset leads to a more accurate calculation of soft class assignments, and we annotate informative samples efficiently by eliciting the judging ability of a large language model. The resulting posterior evidence over the informative sub-space, LogIPE, captures subtle differences between models and improves the accuracy of model selection, as validated by extensive experiments on a wide range of text classification datasets and candidate PLMs.
AB - Text classification is a pivotal task in natural language understanding, and its performance has advanced remarkably with the rise of Pre-trained Language Models (PLMs). The recent proliferation of PLMs has made it increasingly challenging to choose the most suitable model for a given dataset. Since fine-tuning every candidate model is impractical, Transferability Estimation (TE) has become a promising solution for efficient model selection. Unlike current TE methods, which rely solely on fixed, hard class assignments to evaluate the quality of model-encoded features, our approach also takes into account the inter-sample and inter-model variations represented by soft class assignments. We achieve this by using class embeddings to predict posterior class assignments, with the logarithm of the maximum posterior evidence serving as the transferability score. Moreover, we find that the informative sub-space of the dataset leads to a more accurate calculation of soft class assignments, and we annotate informative samples efficiently by eliciting the judging ability of a large language model. The resulting posterior evidence over the informative sub-space, LogIPE, captures subtle differences between models and improves the accuracy of model selection, as validated by extensive experiments on a wide range of text classification datasets and candidate PLMs.
KW - informative sub-space
KW - model selection
KW - posterior evidence
KW - text classification
UR - https://www.scopus.com/pages/publications/105009242024
U2 - 10.1007/s11704-025-41380-7
DO - 10.1007/s11704-025-41380-7
M3 - Article
AN - SCOPUS:105009242024
SN - 2095-2228
VL - 19
JO - Frontiers of Computer Science
JF - Frontiers of Computer Science
IS - 12
M1 - 1912377
ER -