Skip to main navigation Skip to search Skip to main content

DocChat: An information retrieval approach for chatbot engines using unstructured documents

  • Zhao Yan
  • , Nan Duan
  • , Junwei Bao
  • , Peng Chen
  • , Ming Zhou
  • , Zhoujun Li
  • , Jianshe Zhou

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

Most current chatbot engines are designed to reply to user utterances based on existing utterance-response (or Q-R)1 pairs. In this paper, we present DocChat, a novel information retrieval approach for chatbot engines that can leverage unstructured documents, instead of Q-R pairs, to respond to utterances. A learning to rank model with features designed at different levels of granularity is proposed to measure the relevance between utterances and responses directly. We evaluate our proposed approach in both English and Chinese: (i) For English, we evaluate Doc- Chat onWikiQA and QASent, two answer sentence selection tasks, and compare it with state-of-the-art methods. Reasonable improvements and good adaptability are observed. (ii) For Chinese, we compare DocChat with XiaoIce2, a famous chitchat engine in China, and side-by-side evaluation shows that DocChat is a perfect complement for chatbot engines using Q-R pairs as main source of responses.

Original languageEnglish
Title of host publication54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers
PublisherAssociation for Computational Linguistics (ACL)
Pages516-525
Number of pages10
ISBN (Electronic)9781510827585
DOIs
StatePublished - 2016
Event54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Berlin, Germany
Duration: 7 Aug 201612 Aug 2016

Publication series

Name54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers
Volume1

Conference

Conference54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
Country/TerritoryGermany
CityBerlin
Period7/08/1612/08/16

Fingerprint

Dive into the research topics of 'DocChat: An information retrieval approach for chatbot engines using unstructured documents'. Together they form a unique fingerprint.

Cite this