
A document-structure-based complex network model for extracting text keywords

  • Beihang University

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Keywords, which serve as a dense summary of a document, are widely used by search engines and libraries for information retrieval, content classification, speech recognition, and automated text summarization. However, a massive number of documents lack keywords, and the large amount of content generated every day makes human annotation very time-consuming. Many studies show that network-based approaches perform remarkably well at extracting text keywords. Traditionally, words are connected based on their occurrence in documents. One recent work shows that sentences have a significant influence on keyword extraction, beyond traditional methods that consider only words. In addition to words and sentences, chapters are essential parts of a document that organize its higher-level semantic logic. Inspired by this idea, we assume that chapters should contribute to keyword extraction too. We add the chapter factor to build a three-layer network model and propose a Word-Sentence-Chapter network-based approach for keyword extraction. Two experiments, on Chinese and English documents respectively, indicate that our approach outperforms the state of the art.
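
The idea of a three-layer word, sentence, and chapter network can be illustrated with a small sketch. This is not the method from the paper, whose edge definitions and scoring are not given in the abstract. It is a minimal illustration under stated assumptions: words are linked to the sentences that contain them, sentences to their chapters, and words that share a sentence to each other, and nodes are scored with a plain PageRank power iteration. The names `build_graph`, `pagerank`, and `extract_keywords`, the damping factor, and the toy document are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(chapters):
    """Build an undirected three-layer graph.

    chapters: list of chapters, each a list of sentences,
    each sentence a list of word tokens.
    The edge rules below are illustrative assumptions, not the paper's.
    """
    graph = defaultdict(set)  # node -> set of neighbouring nodes

    def link(a, b):
        graph[a].add(b)
        graph[b].add(a)

    for ci, chapter in enumerate(chapters):
        chapter_node = ("chapter", ci)
        for si, sentence in enumerate(chapter):
            sentence_node = ("sentence", ci, si)
            link(sentence_node, chapter_node)       # sentence belongs to chapter
            words = sorted(set(sentence))
            for w in words:
                link(("word", w), sentence_node)    # word occurs in sentence
            for a, b in combinations(words, 2):
                link(("word", a), ("word", b))      # words share a sentence
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Plain power-iteration PageRank on an undirected adjacency map."""
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for n in nodes:
            # Every node has at least one neighbour, so no dangling nodes.
            share = damping * rank[n] / len(graph[n])
            for m in graph[n]:
                new_rank[m] += share
        rank = new_rank
    return rank


def extract_keywords(chapters, top_k=3):
    """Return the top_k word nodes by score, ties broken alphabetically."""
    rank = pagerank(build_graph(chapters))
    words = [(n[1], score) for n, score in rank.items() if n[0] == "word"]
    words.sort(key=lambda item: (-item[1], item[0]))
    return [w for w, _ in words[:top_k]]


# Toy document: two chapters, three sentences in total.
doc = [
    [["network", "model", "keyword"], ["keyword", "extraction", "network"]],
    [["chapter", "structure", "keyword"]],
]
print(extract_keywords(doc, top_k=3))
```

In this sketch the sentence and chapter nodes only pass score between the words they contain. A real system would need tokenization, stop-word filtering, and edge weights chosen for the corpus.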

Original language: English
Pages (from-to): 1765-1791
Number of pages: 27
Journal: Scientometrics
Volume: 124
Issue number: 3
DOI
Publication status: Published - 1 Sep 2020
