TY - JOUR
T1 - Exploiting relevance, coverage, and novelty for query-focused multi-document summarization
AU - Luo, Wenjuan
AU - Zhuang, Fuzhen
AU - He, Qing
AU - Shi, Zhongzhi
PY - 2013/7
Y1 - 2013/7
N2 - Summarization plays an increasingly important role with the exponential document growth on the Web. Specifically, for query-focused summarization, there exist three challenges: (1) how to retrieve query-relevant sentences; (2) how to concisely cover the main aspects (i.e., topics) in the document; and (3) how to balance these two requests. Especially for the issue of relevance, many traditional summarization techniques assume that there is independent relevance between sentences, which may not hold in reality. In this paper, we go beyond this assumption and propose a novel Probabilistic-modeling Relevance, Coverage, and Novelty (PRCN) framework, which exploits a reference topic model incorporating the user query for dependent relevance measurement. Along this line, topic coverage is also modeled under our framework. To further address the issues above, various sentence features regarding relevance and novelty are constructed, while moderate topic coverage is maintained through a greedy algorithm for topic balance. Finally, experiments on DUC2005 and DUC2006 datasets validate the effectiveness of the proposed method.
AB - Summarization plays an increasingly important role with the exponential document growth on the Web. Specifically, for query-focused summarization, there exist three challenges: (1) how to retrieve query-relevant sentences; (2) how to concisely cover the main aspects (i.e., topics) in the document; and (3) how to balance these two requests. Especially for the issue of relevance, many traditional summarization techniques assume that there is independent relevance between sentences, which may not hold in reality. In this paper, we go beyond this assumption and propose a novel Probabilistic-modeling Relevance, Coverage, and Novelty (PRCN) framework, which exploits a reference topic model incorporating the user query for dependent relevance measurement. Along this line, topic coverage is also modeled under our framework. To further address the issues above, various sentence features regarding relevance and novelty are constructed, while moderate topic coverage is maintained through a greedy algorithm for topic balance. Finally, experiments on DUC2005 and DUC2006 datasets validate the effectiveness of the proposed method.
KW - Coverage
KW - Dependent relevance
KW - Novelty
KW - PHITS
KW - Query-focused document summarization
UR - https://www.scopus.com/pages/publications/84877582962
U2 - 10.1016/j.knosys.2013.02.015
DO - 10.1016/j.knosys.2013.02.015
M3 - Article
AN - SCOPUS:84877582962
SN - 0950-7051
VL - 46
SP - 33
EP - 42
JO - Knowledge-Based Systems
JF - Knowledge-Based Systems
ER -