
A classification approach to coreference in discharge summaries: 2011 i2b2 challenge

  • Yan Xu
  • Jiahua Liu
  • Jiajun Wu
  • Yue Wang
  • Zhuowen Tu
  • Jian Tao Sun
  • Junichi Tsujii
  • Eric I.Chao Chang*
  • *Corresponding author for this work
  • Microsoft USA
  • Tsinghua University
  • Shanghai Jiao Tong University
  • University of California at Los Angeles

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Objective: To create a highly accurate coreference resolution system for discharge summaries for the 2011 i2b2 challenge. The coreference categories include Person, Problem, Treatment, and Test.

Design: An integrated coreference resolution system was developed by exploiting Person attributes, contextual semantic clues, and world knowledge. It includes three subsystems: a Person coreference system based on three Person attributes, a Problem/Treatment/Test system based on numerous contextual semantic extractors and world knowledge, and a Pronoun system based on a multi-class support vector machine classifier. The three Person attributes are patient, relative, and hospital personnel. Contextual semantic extractors include anatomy, position, medication, indicator, temporal, spatial, section, modifier, equipment, operation, and assertion. The world knowledge is extracted from external resources such as Wikipedia.

Measurements: Micro-averaged precision, recall, and F-measure in MUC, BCubed, and CEAF were used to evaluate results.

Results: The system achieved an overall micro-averaged precision, recall, and F-measure of 0.906, 0.925, and 0.915, respectively, on test data (from four hospitals) released by the challenge organizers. It achieved a precision, recall, and F-measure of 0.905, 0.920, and 0.913, respectively, on test data without Pittsburgh data. We ranked first out of 20 competing teams. Among the four sub-tasks on Person, Problem, Treatment, and Test, the highest F-measure was seen for Person coreference.

Conclusions: This system achieved encouraging results. The Person system can determine whether personal pronouns and proper names are coreferent or not. The Problem/Treatment/Test system benefits from both world knowledge in evaluating the similarity of two mentions and contextual semantic extractors in identifying semantic clues. The Pronoun system can automatically detect whether a Pronoun mention is coreferent with a mention of one of the other four types. This study demonstrates that it is feasible to accomplish the coreference task in discharge summaries.
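As a rough illustration of the measurements named in the abstract, the sketch below shows how micro-averaging pools counts across documents before computing precision, recall, and F-measure. It is not the challenge's scorer: the `DocCounts` record, its field names, and the numbers are hypothetical, and the metric-specific counting rules of MUC, BCubed, and CEAF are assumed to have been applied already to produce the per-document numerators and denominators.

```python
from dataclasses import dataclass


@dataclass
class DocCounts:
    """Hypothetical per-document tallies for one coreference metric.

    A real scorer would derive these from the MUC, BCubed, or CEAF
    definitions; here they are simply given as inputs.
    """
    p_num: float  # precision numerator
    p_den: float  # precision denominator
    r_num: float  # recall numerator
    r_den: float  # recall denominator


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def micro_average(docs: list[DocCounts]) -> tuple[float, float, float]:
    """Pool numerators and denominators over all documents, then divide."""
    p_num = sum(d.p_num for d in docs)
    p_den = sum(d.p_den for d in docs)
    r_num = sum(d.r_num for d in docs)
    r_den = sum(d.r_den for d in docs)
    precision = p_num / p_den if p_den else 0.0
    recall = r_num / r_den if r_den else 0.0
    return precision, recall, f_measure(precision, recall)


# Two made-up documents: a long one scored well and a short one scored poorly.
docs = [DocCounts(8, 10, 8, 9), DocCounts(1, 2, 1, 4)]
precision, recall, f1 = micro_average(docs)
print(f"P={precision:.3f} R={recall:.3f} F={f1:.3f}")
```

Because the counts are pooled before dividing, longer documents weigh more heavily in a micro-average than they would if per-document scores were averaged directly.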

Original language: English
Pages (from-to): 897-905
Number of pages: 9
Journal: Journal of the American Medical Informatics Association
Volume: 19
Issue: 5
Publication status: Published - Sep 2012
