TCS: Efficient topic discovery over crowd-oriented service data

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

In recent years, with the widespread usage of Web 2.0 techniques, crowdsourcing plays an important role in offering human intelligence in various service websites, such as Yahoo! Answer and Quora. With the increasing amount of crowd-oriented service data, an important task is to analyze latest hot topics and track topic evolution over time. However, the existing techniques in text mining cannot effectively work due to the unique structure of crowd-oriented service data, task-response pairs, which consists of the task and its corresponding responses. In particular, existing approaches become ineffective with the ever-increasing crowd-oriented service data that accumulate along the time. In this paper, we first study the problem of discovering topics over crowd-oriented service data. Then we propose a new probabilistic topic model, the Topic Crowd Service Model (TCS model), to effectively discover latent topics from massive crowd-oriented service data. In particular, in order to train TCS efficiently, we design a novel parameter inference algorithm, the Bucket Parameter Estimation (BPE), which utilizes belief propagation and a new sketching technique, called Pairwise Sketch (pSketch). Finally, we conduct extensive experiments to verify the effectiveness and efficiency of the TCS model and the BPE algorithm.

Original languageEnglish
Title of host publicationKDD 2014 - Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
PublisherAssociation for Computing Machinery
Pages861-870
Number of pages10
ISBN (Print)9781450329569
DOIs
StatePublished - 2014
Externally publishedYes
Event20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2014 - New York, NY, United States
Duration: 24 Aug 201427 Aug 2014

Publication series

NameProceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Conference

Conference20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2014
Country/TerritoryUnited States
CityNew York, NY
Period24/08/1427/08/14

Keywords

  • crowd-oriented service
  • crowdsourcing
  • probabilistic topic model

Fingerprint

Dive into the research topics of 'TCS: Efficient topic discovery over crowd-oriented service data'. Together they form a unique fingerprint.

Cite this