LARQ: Learning to Ask and Rewrite Questions for Community Question Answering

  • Huiyang Zhou*
  • Haoyan Liu
  • Zhao Yan
  • Yunbo Cao
  • Zhoujun Li

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Taking advantage of the rapid growth of community platforms such as Yahoo Answers and Quora, Community Question Answering (CQA) systems are developed to retrieve semantically equivalent questions when users raise a new query. A typical CQA system consists of two key components: a retrieval model, which searches for similar questions, and a ranking model, which selects the most related one. In this paper, we propose LARQ (Learning to Ask and Rewrite Questions), a novel sentence-level data augmentation method. Unlike common lexical-level data augmentation approaches, we take advantage of a Question Generation (QG) model to obtain more accurate, diverse, and semantically rich query examples. Since queries differ greatly in a low-resource cold-start scenario, incorporating the QG model as an augmentation to the indexed collection significantly improves the response rate of CQA systems. We apply LARQ to an online CQA system and to the Bank Question (BQ) Corpus to evaluate the enhancements for both the retrieval process and the ranking model. Extensive experimental results show that the LARQ-enhanced model significantly outperforms single BERT and XGBoost models, as well as a widely used QG model (NQG).
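The retrieve-then-rank pipeline and the idea of augmenting the indexed collection with rewritten questions can be illustrated with a minimal sketch. This is not the authors' implementation: the Jaccard overlap retriever, the `difflib`-based ranker, and the hand-written rewrite are all assumptions standing in for the retrieval model, the ranking model, and the QG model output, and the function names and sample questions are hypothetical.

```python
import re
from difflib import SequenceMatcher


def tokens(text):
    """Lowercase word tokenizer (a stand-in for a real tokenizer)."""
    return set(re.findall(r"\w+", text.lower()))


def build_index(canonical, rewrites=None):
    """Map every indexed question string to its canonical question id.

    `rewrites` holds extra phrasings per id. In a LARQ-style setup these
    would come from a question generation model; here they are
    hand-written placeholders (an assumption for illustration).
    """
    index = {text: qid for qid, text in canonical.items()}
    for qid, variants in (rewrites or {}).items():
        for variant in variants:
            index[variant] = qid
    return index


def retrieve(query, index, k=3):
    """Retrieval stage: top-k indexed questions by Jaccard token overlap."""
    q = tokens(query)
    scored = []
    for text in index:
        t = tokens(text)
        union = q | t
        overlap = len(q & t) / len(union) if union else 0.0
        if overlap > 0:
            scored.append((overlap, text))
    scored.sort(reverse=True)
    return [text for _, text in scored[:k]]


def rank(query, candidates, index):
    """Ranking stage: pick the candidate most similar at character level."""
    if not candidates:
        return None  # no response: nothing was retrieved
    best = max(
        candidates,
        key=lambda c: SequenceMatcher(None, query.lower(), c.lower()).ratio(),
    )
    return index[best]


def answer(query, index):
    """Run retrieval, then ranking, and return a canonical question id."""
    return rank(query, retrieve(query, index), index)


# Hypothetical cold-start collection with very few indexed questions.
canonical = {
    "q1": "How do I reset my password?",
    "q2": "What is the daily transfer limit?",
}
# Placeholder for QG output: a rewrite of q1 with different wording.
rewrites = {"q1": ["I forgot my login code, what now?"]}

plain_index = build_index(canonical)
augmented_index = build_index(canonical, rewrites)

query = "forgot login code"
plain_result = answer(query, plain_index)
augmented_result = answer(query, augmented_index)
```

In this toy setting the query shares no words with either canonical question, so the plain index returns nothing, while the augmented index retrieves the rewrite and maps it back to `q1`. That is the sense in which enlarging the indexed collection can raise the response rate; the actual gains reported in the paper depend on the real models and data.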

Original language: English
Title of host publication: Natural Language Processing and Chinese Computing - 9th CCF International Conference, NLPCC 2020, Proceedings
Editors: Xiaodan Zhu, Min Zhang, Yu Hong, Ruifang He
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 318-330
Number of pages: 13
ISBN (Print): 9783030604561
DOIs
State: Published - 2020
Event: 9th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2020 - Zhengzhou, China
Duration: 14 Oct 2020 to 18 Oct 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12431 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 9th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2020
Country/Territory: China
City: Zhengzhou
Period: 14/10/20 to 18/10/20

Keywords

  • Community Question Answering
  • Data augmentation
  • Question generation
