
A Retrieval-Augmented Framework for Tabular Interpretation with Large Language Model

  • Mengyi Yan
  • Weilong Ren*
  • Yaoshu Wang
  • Jianxin Li*
  • *Corresponding author for this work
  • Beihang University
  • Shenzhen Institute of Computing Sciences

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Relational tables on the web hold a vast amount of knowledge, and it is critical for machine learning models to capture the semantics of these tables so that the models can achieve good performance on table interpretation tasks, such as entity linking, column type annotation and relation extraction. However, it is very challenging for ML models to process a large number of tables and/or retrieve inter-table context information from the tables. Instead, existing works usually rely on heavily engineered features, user-defined rules or pre-training corpora. In this work, we propose a unified Retrieval-Augmented Framework for tabular interpretation with Large language model (RAFL), a novel two-step framework for addressing the table interpretation task. RAFL first adopts a graph-enhanced model to obtain the inter-table context information by retrieving schema-similar and topic-relevant tables from a large corpus; RAFL then conducts tabular interpretation learning by combining a lightweight pre-ranking model with a re-ranking-based large language model. We verify the effectiveness of RAFL through extensive evaluations on three tabular interpretation tasks (entity linking, column type annotation and relation extraction), where RAFL substantially outperforms existing methods on all tasks.
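
The two-step flow described in the abstract (retrieve context tables, then pre-rank and re-rank candidates) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the graph-enhanced retriever, the pre-ranking model or the LLM prompt, so this sketch substitutes plain Jaccard overlap for retrieval and pre-ranking and a caller-supplied scoring function for the LLM re-ranker. All names (`Table`, `retrieve_context`, `pre_rank`, `re_rank`, `toy_scorer`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Table:
    headers: List[str]  # column names, used as a crude proxy for the schema
    topic: str          # caption or page title, used as a crude proxy for the topic


def tokens(text: str) -> List[str]:
    return text.lower().split()


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def retrieve_context(query: Table, corpus: List[Table], k: int = 2) -> List[Table]:
    """Step 1 stand-in: keep the k tables with the highest schema + topic overlap.
    (Assumption: Jaccard overlap replaces the paper's graph-enhanced model.)"""
    def score(t: Table) -> float:
        schema_sim = jaccard(query.headers, t.headers)
        topic_sim = jaccard(tokens(query.topic), tokens(t.topic))
        return schema_sim + topic_sim
    return sorted(corpus, key=score, reverse=True)[:k]


def pre_rank(cell: str, candidates: List[str], n: int = 3) -> List[str]:
    """Step 2a stand-in: cheap filter that shortlists n candidates by token overlap."""
    return sorted(candidates,
                  key=lambda c: jaccard(tokens(cell), tokens(c)),
                  reverse=True)[:n]


Scorer = Callable[[str, str, List[Table]], float]


def re_rank(cell: str, shortlist: List[str], context: List[Table], scorer: Scorer) -> str:
    """Step 2b stand-in: pick the best shortlisted candidate according to `scorer`.
    In a real system `scorer` would wrap an LLM call; here it is any callable."""
    return max(shortlist, key=lambda c: scorer(cell, c, context))


def toy_scorer(cell: str, candidate: str, context: List[Table]) -> float:
    """Placeholder for the LLM: counts candidate words that occur in context topics."""
    ctx_words = {w for t in context for w in tokens(t.topic)}
    return float(len(ctx_words & set(tokens(candidate))))
```

The split into a cheap `pre_rank` and a costlier `re_rank` mirrors the motivation stated in the abstract: the expensive model only sees a short candidate list together with the retrieved inter-table context.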

Original language: English
Title of host publication: Database Systems for Advanced Applications - 29th International Conference, DASFAA 2024, Proceedings
Editors: Makoto Onizuka, Jae-Gil Lee, Yongxin Tong, Chuan Xiao, Yoshiharu Ishikawa, Kejing Lu, Sihem Amer-Yahia, H.V. Jagadish
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 341-356
Number of pages: 16
ISBN (Print): 9789819757787
State: Published - 2025
Event: 29th International Conference on Database Systems for Advanced Applications, DASFAA 2024 - Gifu, Japan
Duration: 2 Jul 2024 → 5 Jul 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14851 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 29th International Conference on Database Systems for Advanced Applications, DASFAA 2024
Country/Territory: Japan
City: Gifu
Period: 2/07/24 → 5/07/24
