Detecting Duplicate Questions in Stack Overflow via Deep Learning Approaches

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Stack Overflow is a popular question-and-answer website for software programming. Different users often ask the same question in different ways, which produces a large number of duplicate questions on the site. High-reputation users typically analyze and mark duplicate questions manually, which is time-consuming and inefficient, so an automatic duplicate question detection approach is needed. We first investigate the application of deep learning models to software engineering tasks. We then apply three deep learning models (CNN, RNN and LSTM) to determine whether they are effective for the duplicate question detection task in Stack Overflow. Specifically, we explore three deep learning approaches, DQ-CNN, DQ-RNN and DQ-LSTM, based on CNN, RNN and LSTM respectively. Their effectiveness is evaluated on six different question groups. The experimental results show that DQ-LSTM outperforms DupPredictor, Dupe, DupePredictorRep-T and DupeRep in terms of recall-rate@5, recall-rate@10 and recall-rate@20, except for the Ruby question group.
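
The recall-rate@k metric named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used definition: the fraction of duplicate questions for which at least one true master question appears among the top k retrieved candidates. The abstract does not spell out the definition, so this is an assumption, and the function name, the dictionary layout and the toy question IDs below are hypothetical rather than taken from the paper.

```python
from typing import Dict, List, Set


def recall_rate_at_k(
    ranked: Dict[str, List[str]],
    masters: Dict[str, Set[str]],
    k: int,
) -> float:
    """Fraction of duplicate questions with a true master in their top-k candidates.

    ranked  -- hypothetical mapping: duplicate question ID -> candidates, best first
    masters -- hypothetical mapping: duplicate question ID -> set of true master IDs
    """
    if not masters:
        return 0.0
    hits = 0
    for question_id, true_masters in masters.items():
        top_k = ranked.get(question_id, [])[:k]
        # Count the question as a hit if any true master is retrieved in the top k.
        if any(candidate in true_masters for candidate in top_k):
            hits += 1
    return hits / len(masters)


# Toy data (illustrative only, not from the paper's dataset).
ranked = {
    "q1": ["m9", "m1", "m4"],
    "q2": ["m7", "m8", "m2"],
}
masters = {"q1": {"m1"}, "q2": {"m2"}}

for k in (1, 2, 3):
    print(k, recall_rate_at_k(ranked, masters, k))
```

On the toy data, no master is ranked first, one of the two masters is within the top 2, and both are within the top 3, so the metric rises as k grows. This is why results are reported at several cut-offs such as 5, 10 and 20.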

Original language: English
Title of host publication: Proceedings - 2019 26th Asia-Pacific Software Engineering Conference, APSEC 2019
Publisher: IEEE Computer Society
Pages: 506-513
Number of pages: 8
ISBN (Electronic): 9781728146485
DOIs
State: Published - Dec 2019
Event: 26th Asia-Pacific Software Engineering Conference, APSEC 2019 - Putrajaya, Malaysia
Duration: 2 Dec 2019 to 5 Dec 2019

Publication series

Name: Proceedings - Asia-Pacific Software Engineering Conference, APSEC
Volume: 2019-December
ISSN (Print): 1530-1362

Conference

Conference: 26th Asia-Pacific Software Engineering Conference, APSEC 2019
Country/Territory: Malaysia
City: Putrajaya
Period: 2/12/19 to 5/12/19

Keywords

  • CNN
  • LSTM
  • RNN
  • Stack Overflow
  • duplicate questions
