TY - GEN
T1 - CrowdTC
T2 - 15th IEEE International Conference on Data Mining, ICDM 2015
AU - Meng, Rui
AU - Tong, Yongxin
AU - Chen, Lei
AU - Cao, Caleb Chen
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2016/1/5
Y1 - 2016/1/5
N2 - Recently, taxonomy has attracted much attention. Both automatic construction solutions and human-based computation approaches have been proposed. Automatic methods suffer from either low precision or low recall, whereas human computation is not suitable for large-scale tasks. Motivated by the shortcomings of both approaches, we present a hybrid framework that combines the power of machine-based approaches and human computation (the crowd) to construct a more complete and accurate taxonomy. Specifically, our framework consists of two steps: we first construct a complete but noisy taxonomy automatically, and then the crowd is introduced to adjust the entity positions in the constructed taxonomy. However, the adjustment is challenging because the budget (money) for asking the crowd is often limited. In our work, we formulate the problem of finding the optimal adjustment as an entity selection optimization (ESO) problem, which is proved to be NP-hard. We then propose an exact algorithm and a more efficient approximation algorithm with an approximation ratio of (1/2)(1-1/e). We conduct extensive experiments on real datasets; the results show that our hybrid approach substantially improves the recall of the taxonomy with little loss of precision.
AB - Recently, taxonomy has attracted much attention. Both automatic construction solutions and human-based computation approaches have been proposed. Automatic methods suffer from either low precision or low recall, whereas human computation is not suitable for large-scale tasks. Motivated by the shortcomings of both approaches, we present a hybrid framework that combines the power of machine-based approaches and human computation (the crowd) to construct a more complete and accurate taxonomy. Specifically, our framework consists of two steps: we first construct a complete but noisy taxonomy automatically, and then the crowd is introduced to adjust the entity positions in the constructed taxonomy. However, the adjustment is challenging because the budget (money) for asking the crowd is often limited. In our work, we formulate the problem of finding the optimal adjustment as an entity selection optimization (ESO) problem, which is proved to be NP-hard. We then propose an exact algorithm and a more efficient approximation algorithm with an approximation ratio of (1/2)(1-1/e). We conduct extensive experiments on real datasets; the results show that our hybrid approach substantially improves the recall of the taxonomy with little loss of precision.
KW - Crowdsourcing
KW - Taxonomy Construction
UR - https://www.scopus.com/pages/publications/84963626426
U2 - 10.1109/ICDM.2015.77
DO - 10.1109/ICDM.2015.77
M3 - Conference contribution
AN - SCOPUS:84963626426
T3 - Proceedings - IEEE International Conference on Data Mining, ICDM
SP - 913
EP - 918
BT - Proceedings - 15th IEEE International Conference on Data Mining, ICDM 2015
A2 - Aggarwal, Charu
A2 - Zhou, Zhi-Hua
A2 - Tuzhilin, Alexander
A2 - Xiong, Hui
A2 - Wu, Xindong
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 14 November 2015 through 17 November 2015
ER -