TY - GEN
T1 - An improved relation-based information retrieval technique for bioinformatics
AU - Li, Yan
AU - Wen, Jian
AU - Li, Zhoujun
PY - 2008
Y1 - 2008
N2 - One limitation of current relation-based IR models is that a relation is often recorded in binary form, such as R(Term1, Term2), which carries only general information about a pair of terms that are semantically and syntactically related to each other. To tackle this problem, this paper defines a triple as a data structure that integrates a pair of concepts with a verb phrase, or sometimes a special noun, extracted from the sentence as the relation of the concept pair. We applied an advanced ontology-based approach to extract generic concepts and relations using both UMLS and WordNet, and implemented a new approach to ranking passages retrieved from documents, in line with the system performance measures described in the TREC 2007 Genomics Track. We built a new version (IRIRS) of the relation-based IR system (RIRS) developed by the DM & Bioinformatics Lab of Drexel University in 2004. We use IRIRS to search for answers in tests of English reading comprehension and to improve the retrieval results of all official runs in the TREC 2004 Genomics Track. Experiments on the different collections show that IRIRS performs better than RIRS. The character-based MAP (Mean Average Precision), which measures passage-level retrieval performance, rises significantly from 64.44% (RIRS) to 74.28% for 64 topics from the first collection. The MAP for 50 topics from the second collection rises from 21.71% (TREC) and 37.58% (RIRS) to 40.14%.
AB - One limitation of current relation-based IR models is that a relation is often recorded in binary form, such as R(Term1, Term2), which carries only general information about a pair of terms that are semantically and syntactically related to each other. To tackle this problem, this paper defines a triple as a data structure that integrates a pair of concepts with a verb phrase, or sometimes a special noun, extracted from the sentence as the relation of the concept pair. We applied an advanced ontology-based approach to extract generic concepts and relations using both UMLS and WordNet, and implemented a new approach to ranking passages retrieved from documents, in line with the system performance measures described in the TREC 2007 Genomics Track. We built a new version (IRIRS) of the relation-based IR system (RIRS) developed by the DM & Bioinformatics Lab of Drexel University in 2004. We use IRIRS to search for answers in tests of English reading comprehension and to improve the retrieval results of all official runs in the TREC 2004 Genomics Track. Experiments on the different collections show that IRIRS performs better than RIRS. The character-based MAP (Mean Average Precision), which measures passage-level retrieval performance, rises significantly from 64.44% (RIRS) to 74.28% for 64 topics from the first collection. The MAP for 50 topics from the second collection rises from 21.71% (TREC) and 37.58% (RIRS) to 40.14%.
UR - https://www.scopus.com/pages/publications/54249137898
U2 - 10.1109/ICINFA.2008.4608247
DO - 10.1109/ICINFA.2008.4608247
M3 - Conference contribution
AN - SCOPUS:54249137898
SN - 9781424421848
T3 - Proceedings of the 2008 IEEE International Conference on Information and Automation, ICIA 2008
SP - 1536
EP - 1541
BT - Proceedings of the 2008 IEEE International Conference on Information and Automation, ICIA 2008
T2 - 2008 IEEE International Conference on Information and Automation, ICIA 2008
Y2 - 20 June 2008 through 23 June 2008
ER -