Identifying scholarly communities from unstructured texts

  • Beihang University
  • National Computer Network Emergency Response Technical Team
  • Beijing Normal University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Scholarly community detection has important applications in various fields. Previous studies have relied heavily on structured scholar networks, which are computationally expensive to analyze and challenging to construct in practice. We propose a novel alternative that identifies scholarly communities directly from large textual corpora. To our knowledge, this is the first study intended to detect communities directly from unstructured texts. Academic articles generally mention related work and researchers, and researchers who are more closely related to each other tend to be mentioned closer together in the lines of an academic text. Based on this correlation, we develop an intuitive method that measures the mutual relatedness of researchers through their textual distance. First, we extract and disambiguate the researcher names from academic articles. Then, we embed each researcher as an implicit vector and measure the relatedness of researchers by their vector distance. Finally, the communities are identified by clustering the vectors. We implement and evaluate our method on three real-world datasets. The experimental results demonstrate that our method achieves better performance than state-of-the-art methods.
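
The three steps named in the abstract (extract names, embed researchers, cluster the vectors) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a hypothetical, already disambiguated list of surnames, swaps the learned embedding for plain co-mention counts per line, and swaps the clustering step for single-link grouping over a cosine-similarity threshold. The sample text, the names, and the `threshold` value are all made up for illustration.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def comention_vectors(lines, names):
    """Represent each name by how often it shares a line with every name.

    Assumption: naive substring matching stands in for real name
    extraction and disambiguation. The count includes the name itself,
    so two names that always appear together get identical vectors.
    """
    vectors = {name: defaultdict(float) for name in names}
    for line in lines:
        present = [name for name in names if name in line]
        for a in present:
            for b in present:
                vectors[a][b] += 1.0
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(key, 0.0) for key, weight in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster(vectors, threshold=0.5):
    """Merge names whose similarity reaches the threshold (union-find)."""
    parent = {name: name for name in vectors}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in combinations(sorted(vectors), 2):
        if cosine(vectors[a], vectors[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for name in vectors:
        groups[find(name)].add(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical toy corpus and name list, for illustration only.
lines = [
    "Smith and Jones proposed a graph model.",
    "Jones and Smith later extended it.",
    "Lee and Park studied embeddings.",
    "Park and Lee released a dataset.",
]
names = ["Smith", "Jones", "Lee", "Park"]

vectors = comention_vectors(lines, names)
communities = cluster(vectors)
print(communities)  # [['Jones', 'Smith'], ['Lee', 'Park']]
```

In this toy input the two pairs never share a line, so their vectors are orthogonal and they fall into separate groups. A realistic pipeline would need proper name disambiguation, a learned embedding, and a clustering algorithm chosen for the data.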

Original language: English
Title of host publication: Web and Big Data - Second International Joint Conference, APWeb-WAIM 2018, Proceedings
Editors: Jianliang Xu, Yoshiharu Ishikawa, Yi Cai
Publisher: Springer Verlag
Pages: 75-89
Number of pages: 15
ISBN (Print): 9783319968896
State: Published - 2018
Event: 2nd Asia Pacific Web and Web-Age Information Management Joint Conference on Web and Big Data, APWeb-WAIM 2018 - Macau, China
Duration: 23 Jul 2018 to 25 Jul 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10987 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 2nd Asia Pacific Web and Web-Age Information Management Joint Conference on Web and Big Data, APWeb-WAIM 2018
Country/Territory: China
City: Macau
Period: 23/07/18 to 25/07/18

Keywords

  • Community detection
  • Representation learning
  • Scientific information extraction
  • Scientific literature analysis
