TY - GEN
T1 - FusionASAG
T2 - 6th International Conference on Computer Science and Educational Informatization, CSEI 2024
AU - Zheng, He
AU - Sun, Qing
AU - Li, Qiushuo
AU - Liu, Yunxin
AU - Ouyang, Yuanxin
AU - Cao, Qinghua
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
PY - 2025
Y1 - 2025
N2 - Subjective questions are crucial for assessing students’ ability to analyze, synthesize, evaluate, and create knowledge. In massive online education scenarios, manually scoring subjective questions is time-consuming. Instead, scoring can be supported by the Short Answer Grading task in Natural Language Processing. However, most existing automatic scoring systems do not perform well on domain-specific and long questions. In this paper, we address the challenges of automated short answer grading (ASAG) by proposing a novel scoring approach that integrates a fine-tuned large language model (LLM), a neural network (NN) for feature extraction, and an answer-question relevance assessment module (RELEVANCE). Our method scores student responses based on a set of predefined rubrics and reference answers. Our experiments on the ASAP-SAS dataset demonstrate that our method achieves an average Quadratic Weighted Kappa (QWK) score of 0.797, surpassing the current state-of-the-art AutoSAS model and excelling particularly on longer tasks, with an 11.9% improvement. Overall, our proposed method offers a robust solution for subjective question grading, contributing to more efficient educational assessment in a rapidly evolving learning environment.
KW - Automatic Short Answer Grading
KW - Generative Language Model
KW - Model Fine-Tuning
KW - Text Feature Extraction
UR - https://www.scopus.com/pages/publications/105003210288
U2 - 10.1007/978-981-96-3735-5_4
DO - 10.1007/978-981-96-3735-5_4
M3 - Conference contribution
AN - SCOPUS:105003210288
SN - 9789819637348
T3 - Communications in Computer and Information Science
SP - 39
EP - 52
BT - Computer Science and Educational Informatization - 6th International Conference, CSEI 2024, Revised Selected Papers
A2 - Zhang, Kun
A2 - Song, Xianhua
A2 - Obaidat, Mohammad S.
A2 - Bilal, Anas
A2 - Hu, Jun
A2 - Lu, Zeguang
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 1 November 2024 through 3 November 2024
ER -