Graph Regularized Non-negative Matrix Factorization with Long-tail Constraint

  • Lu You
  • Rui Liu
  • He Zhang
  • Z. M. Shan
  • Beihang University
  • Tencent

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Mining long-tail topics is a major challenge in text mining. In previous research, most non-hierarchical topic models assumed that the topics in documents follow a multinomial distribution, ignoring the topics in the tail of the distribution curve. Hierarchical topic models can mine long-tail topics by introducing hierarchical relationships among topics, but at the cost of higher computational complexity. In this article, we propose a new method for mining long-tail topics, called graph regularized non-negative matrix factorization with long-tail constraint. It uses KL divergence to measure the difference between matrices, and uses a neighbor graph to preserve, in the low-dimensional space, the intrinsic geometrical and discriminating structure of the original samples. Experiments show that the proposed algorithm mines more long-tail topic information from documents and improves performance on data mining tasks compared with other methods, such as classical Dirichlet distribution, non-negative matrix, hierarchical matrix, and hierarchical latent Dirichlet distribution.
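
The two ingredients named in the abstract, a KL-divergence fit term and a neighbor-graph regularizer, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define the long-tail constraint, so it is omitted here. The graph term uses the common Laplacian form Tr(VᵀLV), which is an assumption, and the names `knn_graph`, `generalized_kl`, `graph_penalty`, `objective`, and the weight `lam` are hypothetical.

```python
import numpy as np


def knn_graph(X, k=3):
    """Symmetric 0/1 k-nearest-neighbor graph over the columns of X.

    Columns are treated as documents; neighbors are chosen by cosine
    similarity (an assumed choice, not taken from the paper).
    """
    n = X.shape[1]
    norms = np.linalg.norm(X, axis=0, keepdims=True)
    Xn = X / np.maximum(norms, 1e-12)
    S = Xn.T @ Xn
    np.fill_diagonal(S, -np.inf)  # a document is never its own neighbor
    nearest = np.argsort(-S, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    return np.maximum(W, W.T)  # symmetrize


def generalized_kl(X, Y, eps=1e-12):
    """Generalized KL divergence: sum(x * log(x / y) - x + y), with 0 * log 0 = 0."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    mask = X > 0
    log_term = np.sum(X[mask] * np.log(X[mask] / (Y[mask] + eps)))
    return float(log_term - X.sum() + Y.sum())


def graph_penalty(V, W):
    """Tr(V^T L V) with L = D - W, i.e. 0.5 * sum_ij W_ij * ||v_i - v_j||^2."""
    L = np.diag(W.sum(axis=1)) - W
    return float(np.trace(V.T @ L @ V))


def objective(X, U, V, W, lam=1.0):
    """KL fit of X to U @ V.T plus lam times the graph smoothness penalty."""
    return generalized_kl(X, U @ V.T) + lam * graph_penalty(V, W)


# Toy example: 8 terms, 6 documents, 3 topics, all values random.
rng = np.random.default_rng(0)
X = rng.random((8, 6))   # term-by-document matrix
U = rng.random((8, 3))   # term-by-topic factor
V = rng.random((6, 3))   # document-by-topic factor
W = knn_graph(X, k=2)
print(objective(X, U, V, W, lam=0.5))
```

The sketch only evaluates the objective. An actual factorization would minimize it over non-negative `U` and `V`, together with whatever long-tail term the paper defines.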

Original language: English
Title of host publication: 2019 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, PACRIM 2019 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728127941
DOIs
State: Published - Aug 2019
Event: 2019 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, PACRIM 2019 - Victoria, Canada
Duration: 21 Aug 2019 to 23 Aug 2019

Publication series

Name: 2019 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, PACRIM 2019 - Proceedings

Conference

Conference: 2019 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, PACRIM 2019
Country/Territory: Canada
City: Victoria
Period: 21/08/19 to 23/08/19

Keywords

  • Data Mining
  • Long tail
  • Matrix Factorization
