
Improving grammatical error correction with machine translation pairs

  • Wangchunshu Zhou
  • Tao Ge
  • Chang Mu
  • Ke Xu
  • Furu Wei
  • Ming Zhou
  • Beihang University
  • Microsoft USA
  • Peking University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-reviewed

Abstract

We propose a novel data synthesis method to generate diverse error-corrected sentence pairs for improving grammatical error correction (GEC). The method is based on a pair of machine translation models (e.g., Chinese-English) of different qualities (i.e., poor and good). The poor translation model resembles an ESL (English as a second language) learner and tends to generate translations of low quality in terms of fluency and grammaticality, while the good translation model generally generates fluent and grammatically correct translations. With the pair of translation models, we can generate an unlimited number of poor-good English sentence pairs from text in the source language (e.g., Chinese) of the translators. Our approach can generate various error-corrected patterns and nicely complements other data synthesis approaches for GEC. Experimental results demonstrate that the data generated by our approach can effectively help a GEC model improve its performance, approaching state-of-the-art single-model performance on the BEA-19 and CoNLL-14 benchmark datasets.
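
The pairing step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: `synthesize_gec_pairs`, `poor_translate`, and `good_translate` are hypothetical names, the two translators are assumed to be plain callables from a source-language sentence to an English sentence, and dropping pairs whose two outputs are identical is an assumption added here rather than a step stated in the abstract.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical type: any function mapping a source sentence to an English one.
Translator = Callable[[str], str]


def synthesize_gec_pairs(
    source_sentences: Iterable[str],
    poor_translate: Translator,
    good_translate: Translator,
) -> List[Tuple[str, str]]:
    """Build (erroneous, corrected) English pairs from source-language text.

    The poor model's output stands in for learner-like text and the good
    model's output stands in for its correction.
    """
    pairs: List[Tuple[str, str]] = []
    for src in source_sentences:
        poor = poor_translate(src).strip()
        good = good_translate(src).strip()
        # Assumption: skip empty outputs and identical outputs, since an
        # identical pair shows no correction.
        if poor and good and poor != good:
            pairs.append((poor, good))
    return pairs
```

In practice the two callables would wrap a weak and a strong translation system; any pair of functions with this signature can be passed in, which keeps the sketch independent of a specific toolkit.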

Original language: English
Title of host publication: Findings of the Association for Computational Linguistics Findings of ACL
Subtitle of host publication: EMNLP 2020
Publisher: Association for Computational Linguistics (ACL)
Pages: 318-328
Number of pages: 11
ISBN (electronic): 9781952148903
Publication status: Published - 2020
Externally published: Yes
Event: Findings of the Association for Computational Linguistics, ACL 2020: EMNLP 2020 - Virtual, Online
Duration: 16 Nov 2020 to 20 Nov 2020

Publication series

Name: Findings of the Association for Computational Linguistics Findings of ACL: EMNLP 2020

Conference

Conference: Findings of the Association for Computational Linguistics, ACL 2020: EMNLP 2020
Location: Virtual, Online
Period: 16/11/20 to 20/11/20
