Mining meaningful topics from massive biomedical literature

  • Peiyan Zhu
  • Junhui Shen
  • Dezhi Sun
  • Ke Xu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

A huge amount of biomedical and biological literature is available online and in digital libraries, and the number of new research papers has grown exponentially in recent years. Mining meaningful topics from this massive body of biomedical literature is therefore both pressing and challenging. The mined topics help researchers with literature exploration and topic discovery. However, the latent topics inferred by traditional topic models are not always coherent and meaningful. In this work, we propose a new methodology for mining meaningful biomedical topics that combines several off-the-shelf text mining techniques, such as part-of-speech tagging, base noun phrase chunking, K-means clustering, and latent Dirichlet allocation. Relying on these standard components makes the methodology scalable and simple to implement. We conduct comprehensive experiments on a dataset collected from PubMed. The experimental results demonstrate that our method significantly outperforms a baseline method. We also perform a qualitative analysis and present meaningful biomedical topics and multi-word expressions.
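
The abstract names its components but does not say how they are wired together, so the following is only a minimal sketch of one of them, base noun phrase chunking, and not the authors' implementation. It assumes the tokens have already been part-of-speech tagged with Penn Treebank tags, and it uses a simple hypothetical pattern (optional adjectives followed by one or more nouns) to pull out multi-word expressions. The function name `base_noun_phrases` and the sample sentence are illustrative assumptions.

```python
from typing import List, Tuple

# A (word, POS tag) pair; tags are assumed to follow the Penn Treebank set.
TaggedToken = Tuple[str, str]


def base_noun_phrases(tagged: List[TaggedToken]) -> List[str]:
    """Return multi-word chunks matching the pattern JJ* NN+.

    This pattern is an illustrative assumption, not the chunker
    used in the paper. Single-word chunks are discarded because
    the goal here is multi-word expressions.
    """
    phrases: List[str] = []
    current: List[str] = []  # words of the chunk being built
    has_noun = False         # True once the chunk contains a noun

    def flush() -> None:
        nonlocal current, has_noun
        # Keep the chunk only if it ends in a noun run and spans 2+ words.
        if has_noun and len(current) > 1:
            phrases.append(" ".join(current))
        current = []
        has_noun = False

    for word, tag in tagged:
        if tag.startswith("NN"):
            current.append(word)
            has_noun = True
        elif tag.startswith("JJ"):
            # An adjective after a noun begins a new chunk.
            if has_noun:
                flush()
            current.append(word)
        else:
            # Any other tag closes the current chunk.
            flush()
    flush()
    return phrases


# Hand-tagged example sentence (tags supplied manually for illustration).
sample = [
    ("Latent", "JJ"), ("topic", "NN"), ("models", "NNS"),
    ("describe", "VBP"),
    ("gene", "NN"), ("expression", "NN"),
    ("in", "IN"),
    ("breast", "NN"), ("cancer", "NN"), ("cells", "NNS"),
]
print(base_noun_phrases(sample))
```

In a fuller pipeline, phrases like these could be fed to clustering or topic modeling as units instead of single words, but that arrangement is an assumption here, since the abstract does not describe it.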

Original language: English
Title of host publication: Proceedings - 2014 IEEE International Conference on Bioinformatics and Biomedicine, IEEE BIBM 2014
Editors: Huiru Zheng, Xiaohua Tony Hu, Daniel Berrar, Yadong Wang, Werner Dubitzky, Jin-Kao Hao, Kwang-Hyun Cho, David Gilbert
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 438-443
Number of pages: 6
ISBN (Electronic): 9781479956692
State: Published - 29 Dec 2014
Event: 2014 IEEE International Conference on Bioinformatics and Biomedicine, IEEE BIBM 2014 - Belfast, United Kingdom
Duration: 2 Nov 2014 to 5 Nov 2014

Publication series

Name: Proceedings - 2014 IEEE International Conference on Bioinformatics and Biomedicine, IEEE BIBM 2014

Conference

Conference: 2014 IEEE International Conference on Bioinformatics and Biomedicine, IEEE BIBM 2014
Country/Territory: United Kingdom
City: Belfast
Period: 2/11/14 to 5/11/14
