TY - GEN
T1 - TCS
T2 - 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2014
AU - Tong, Yongxin
AU - Cao, Caleb Chen
AU - Chen, Lei
PY - 2014
Y1 - 2014
N2 - In recent years, with the widespread usage of Web 2.0 techniques, crowdsourcing plays an important role in offering human intelligence in various service websites, such as Yahoo! Answer and Quora. With the increasing amount of crowd-oriented service data, an important task is to analyze latest hot topics and track topic evolution over time. However, the existing techniques in text mining cannot effectively work due to the unique structure of crowd-oriented service data, task-response pairs, which consists of the task and its corresponding responses. In particular, existing approaches become ineffective with the ever-increasing crowd-oriented service data that accumulate along the time. In this paper, we first study the problem of discovering topics over crowd-oriented service data. Then we propose a new probabilistic topic model, the Topic Crowd Service Model (TCS model), to effectively discover latent topics from massive crowd-oriented service data. In particular, in order to train TCS efficiently, we design a novel parameter inference algorithm, the Bucket Parameter Estimation (BPE), which utilizes belief propagation and a new sketching technique, called Pairwise Sketch (pSketch). Finally, we conduct extensive experiments to verify the effectiveness and efficiency of the TCS model and the BPE algorithm.
AB - In recent years, with the widespread usage of Web 2.0 techniques, crowdsourcing plays an important role in offering human intelligence in various service websites, such as Yahoo! Answer and Quora. With the increasing amount of crowd-oriented service data, an important task is to analyze latest hot topics and track topic evolution over time. However, the existing techniques in text mining cannot effectively work due to the unique structure of crowd-oriented service data, task-response pairs, which consists of the task and its corresponding responses. In particular, existing approaches become ineffective with the ever-increasing crowd-oriented service data that accumulate along the time. In this paper, we first study the problem of discovering topics over crowd-oriented service data. Then we propose a new probabilistic topic model, the Topic Crowd Service Model (TCS model), to effectively discover latent topics from massive crowd-oriented service data. In particular, in order to train TCS efficiently, we design a novel parameter inference algorithm, the Bucket Parameter Estimation (BPE), which utilizes belief propagation and a new sketching technique, called Pairwise Sketch (pSketch). Finally, we conduct extensive experiments to verify the effectiveness and efficiency of the TCS model and the BPE algorithm.
KW - crowd-oriented service
KW - crowdsourcing
KW - probabilistic topic model
UR - https://www.scopus.com/pages/publications/84907033663
U2 - 10.1145/2623330.2623647
DO - 10.1145/2623330.2623647
M3 - 会议稿件
AN - SCOPUS:84907033663
SN - 9781450329569
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 861
EP - 870
BT - KDD 2014 - Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
PB - Association for Computing Machinery
Y2 - 24 August 2014 through 27 August 2014
ER -