
Multisource Soft Labeling and Hard Negative Sampling for Retrieval Distractor Ranking

Research output: Contribution to journal › Article › peer-review

Abstract

Multiple-choice questions (MCQs) are widely used in learning assessment, and their automatic generation has recently become a popular research area. Within this task, distractor ranking (DR) is one of the most meaningful and challenging subtasks: DR models learn to select high-quality distractors from numerous candidates. Some current DR methods adopt a two-stage ranking strategy, which complicates the process and introduces error propagation. Others directly use a single-encoder-based model to improve overall performance but suffer from low efficiency. To tackle these problems, we propose the retrieval distractor ranking (ReDR) task, which reflects the requirements of practical distractor retrieval scenarios: models should achieve relatively high performance within an acceptable time. We develop an end-to-end approach based on the dual-encoder framework to solve the ReDR task. In addition, we propose multiple kinds of relevance scores (context-context, context-distractor, and distractor-distractor) and employ them in two strategies: 1) multisource soft labeling, which assigns each candidate an appropriate soft label derived from the multiple relevance scores to better simulate the sample distribution of the ReDR task; and 2) multisource hard negative sampling, which selects hard negative samples according to the multiple relevance scores and further distinguishes them from the positive samples. Extensive experiments on two well-known MCQ benchmarks demonstrate the effectiveness of our method.
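
The abstract does not give the formulas behind the two strategies, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes each candidate already has three precomputed relevance scores (context-context, context-distractor, distractor-distractor), combines them with a weighted average, turns the combined scores into soft labels with a temperature softmax, and picks the highest-scoring non-positive candidates as hard negatives. The function names, the equal weights, the softmax, and the top-k rule are all hypothetical choices made for this example.

```python
import math


def combine_scores(cc, cd, dd, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted average of three relevance scores per candidate.

    The equal weighting is an assumption for illustration only.
    """
    w_cc, w_cd, w_dd = weights
    return [w_cc * a + w_cd * b + w_dd * c for a, b, c in zip(cc, cd, dd)]


def soft_labels(scores, temperature=1.0):
    """Softmax over combined scores; returns a distribution over candidates."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def hard_negatives(scores, positive_ids, k):
    """Indices of the k highest-scoring candidates that are not positives."""
    negatives = [i for i in range(len(scores)) if i not in positive_ids]
    negatives.sort(key=lambda i: scores[i], reverse=True)
    return negatives[:k]


# Toy data: four candidates, candidate 0 is the known (positive) distractor.
cc = [0.9, 0.2, 0.7, 0.1]  # context-context relevance (made-up values)
cd = [0.8, 0.3, 0.6, 0.2]  # context-distractor relevance (made-up values)
dd = [0.7, 0.1, 0.8, 0.0]  # distractor-distractor relevance (made-up values)
positives = {0}

combined = combine_scores(cc, cd, dd)
labels = soft_labels(combined, temperature=0.5)
hard = hard_negatives(combined, positives, k=2)
```

In this toy setup the soft labels form a probability distribution that puts the most mass on the positive candidate, and the hard negatives are the non-positive candidates whose combined relevance is closest to it. In a dual-encoder training loop, labels of this kind would typically serve as targets for the similarity scores between the encoded context and the encoded candidates.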

Original language: English
Pages (from-to): 664-676
Number of pages: 13
Journal: IEEE Transactions on Learning Technologies
Volume: 17
DOIs
State: Published - 2024

Keywords

  • Distractor ranking (DR)
  • hard negative sampling
  • multiple-choice questions (MCQs)
  • soft labeling
