A Retrieval-Augmented Framework for Tabular Interpretation with Large Language Model

  • Mengyi Yan
  • Weilong Ren*
  • Yaoshu Wang
  • Jianxin Li*
  • *Corresponding author for this work
  • Beihang University
  • Shenzhen Institute of Computing Sciences

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Relational tables on the web hold a vast amount of knowledge, and it is critical for machine learning models to capture the semantics of these tables so that the models can achieve good performance on table interpretation tasks, such as entity linking, column type annotation and relation extraction. However, it is very challenging for ML models to process a large number of tables and/or retrieve inter-table context information from the tables. Instead, existing works usually rely on heavily engineered features, user-defined rules or pre-training corpora. In this work, we propose a unified Retrieval-Augmented Framework for tabular interpretation with Large language model (RAFL), a novel 2-step framework for addressing the table interpretation task. RAFL first adopts a graph-enhanced model to obtain the inter-table context information by retrieving schema-similar and topic-relevant tables from a large corpus; RAFL then conducts tabular interpretation learning by combining a lightweight pre-ranking model with a re-ranking-based large language model. We verify the effectiveness of RAFL through extensive evaluations on 3 tabular interpretation tasks (entity linking, column type annotation and relation extraction), where RAFL substantially outperforms existing methods on all tasks.
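
The two-step flow described in the abstract (retrieve related tables, then pre-rank and re-rank candidate answers) can be illustrated with a toy sketch for column type annotation. This is not the authors' implementation: the graph-enhanced retriever is replaced here by plain Jaccard overlap, the pre-ranking model by a vote count, and the large language model by a stub callable. All function names, the table dictionary layout, and the sample data are assumptions made for illustration only.

```python
from typing import Callable, Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve_tables(query: Dict, corpus: List[Dict], k: int = 2) -> List[Dict]:
    """Step 1 stand-in: keep the k tables with the highest schema + topic overlap."""
    def score(table: Dict) -> float:
        schema_sim = jaccard(set(query["headers"]), set(table["headers"]))
        topic_sim = jaccard(set(query["caption"].lower().split()),
                            set(table["caption"].lower().split()))
        return schema_sim + topic_sim
    return sorted(corpus, key=score, reverse=True)[:k]


def pre_rank(column: str, context: List[Dict], labels: List[str], n: int = 2) -> List[str]:
    """Step 2a stand-in: shortlist labels by votes from same-named context columns."""
    votes = {label: 0 for label in labels}
    for table in context:
        label = table["column_types"].get(column)
        if label in votes:
            votes[label] += 1
    # sorted() is stable, so tied labels keep their original order.
    return sorted(labels, key=lambda l: votes[l], reverse=True)[:n]


def re_rank(column: str, shortlist: List[str],
            llm: Callable[[str, List[str]], str]) -> str:
    """Step 2b stand-in: let a language model pick one label from the shortlist."""
    prompt = f"Column '{column}': choose one type from {shortlist}."
    choice = llm(prompt, shortlist)
    # Fall back to the top pre-ranked label if the model answers off-list.
    return choice if choice in shortlist else shortlist[0]


def stub_llm(prompt: str, options: List[str]) -> str:
    """Hypothetical placeholder for a real LLM call; returns the first option."""
    return options[0]


# Hypothetical toy corpus with already-annotated column types.
corpus = [
    {"caption": "List of football players",
     "headers": ["player", "club", "country"],
     "column_types": {"player": "person", "club": "organization",
                      "country": "location"}},
    {"caption": "Football clubs by city",
     "headers": ["club", "city"],
     "column_types": {"club": "organization", "city": "location"}},
    {"caption": "Chemical elements",
     "headers": ["element", "symbol"],
     "column_types": {"element": "substance", "symbol": "code"}},
]
query = {"caption": "Football players and their clubs",
         "headers": ["player", "club"]}
labels = ["person", "organization", "location", "substance"]

context = retrieve_tables(query, corpus, k=2)
shortlist = pre_rank("club", context, labels, n=2)
answer = re_rank("club", shortlist, stub_llm)
print([t["caption"] for t in context], shortlist, answer)
```

The point of the split is cost: the cheap pre-ranker narrows the label space so that the expensive model only has to choose among a few candidates, with the retrieved tables supplying the inter-table context.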

Original language: English
Title of host publication: Database Systems for Advanced Applications - 29th International Conference, DASFAA 2024, Proceedings
Editors: Makoto Onizuka, Jae-Gil Lee, Yongxin Tong, Chuan Xiao, Yoshiharu Ishikawa, Kejing Lu, Sihem Amer-Yahia, H.V. Jagadish
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 341-356
Number of pages: 16
ISBN (Print): 9789819757787
DOI
Publication status: Published - 2025
Event: 29th International Conference on Database Systems for Advanced Applications, DASFAA 2024 - Gifu, Japan
Duration: 2 Jul 2024 to 5 Jul 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14851 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 29th International Conference on Database Systems for Advanced Applications, DASFAA 2024
Country/Territory: Japan
City: Gifu
Period: 2/07/24 to 5/07/24
