RA3: A Human-in-the-loop Framework for Interpreting and Improving Image Captioning with Relation-Aware Attribution Analysis

  • Lei Chai
  • Lu Qi
  • Hailong Sun*
  • Jingzheng Li
  • *Corresponding author for this work
  • Beihang University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-reviewed

Abstract

Interpreting model behavior is crucial for model evaluation and optimization. Recent research demonstrates that incorporating human intelligence into the learning process effectively improves the interpretability and performance of machine learning models, especially for simple classification tasks. However, the image captioning task has received little attention. Such complex sequential tasks generally involve semantic relationships between different concepts, which pose challenges for interpreting model behavior and developing optimization methods. In this paper, we present RA3 (Relation-Aware Attribution Analysis), a human-in-the-loop framework for improving the interpretability and further boosting the performance of image captioning models. Specifically, we first engage human participants in two types of annotation tasks to identify what the model actually focuses on (model attribution) and what it should focus on (human rationale) at the conceptual level, supported by machine learning interpretability methods. Then, we identify and filter hard instances based on relation-aware model attribution, both to validate the quality of the explanations and to eliminate low-quality captions (a process that can also be viewed as a form of data debugging). We subsequently design an explanation loss that penalizes the difference between model attribution and human rationale, optimizing the model's behavior to improve caption quality. Extensive experiments on crowdsourced annotations and MSCOCO indicate that the explanations produced by RA3 accurately describe the model's behavior, effectively identify difficult instances, and significantly improve caption quality.
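The abstract does not state the form of the explanation loss or the hard-instance criterion, so the following is only a minimal sketch under stated assumptions. It assumes that model attribution and human rationale are both available as non-negative per-concept scores, that the penalty is a squared L2 distance between the two normalized score vectors, and that a hard instance is one whose penalty exceeds a fixed threshold. The names `explanation_penalty`, `total_loss`, `split_hard_instances`, `weight`, and `threshold` are hypothetical and are not taken from the paper, and the sketch does not model relations between concepts.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: concept name -> non-negative importance score.
ConceptScores = Dict[str, float]


def normalize(scores: ConceptScores) -> ConceptScores:
    """Scale concept scores so that they sum to 1."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("concept scores must have a positive sum")
    return {concept: value / total for concept, value in scores.items()}


def explanation_penalty(attribution: ConceptScores, rationale: ConceptScores) -> float:
    """Assumed penalty: squared L2 distance between normalized score vectors."""
    a = normalize(attribution)
    r = normalize(rationale)
    # Concepts missing from one side are treated as having score 0 there.
    return sum((a.get(c, 0.0) - r.get(c, 0.0)) ** 2 for c in set(a) | set(r))


def total_loss(caption_loss: float, attribution: ConceptScores,
               rationale: ConceptScores, weight: float = 1.0) -> float:
    """Captioning loss plus a weighted explanation penalty (assumed additive form)."""
    return caption_loss + weight * explanation_penalty(attribution, rationale)


def split_hard_instances(
    instances: List[Tuple[str, ConceptScores, ConceptScores]],
    threshold: float,
) -> Tuple[List[str], List[str]]:
    """Split instance ids into (kept, hard) using an assumed penalty threshold."""
    kept: List[str] = []
    hard: List[str] = []
    for instance_id, attribution, rationale in instances:
        mismatch = explanation_penalty(attribution, rationale)
        (hard if mismatch > threshold else kept).append(instance_id)
    return kept, hard


# Toy data, invented for illustration only.
instances = [
    ("img_1", {"dog": 0.5, "frisbee": 0.5}, {"dog": 1.0, "frisbee": 1.0}),
    ("img_2", {"grass": 1.0}, {"dog": 1.0}),
]
kept, hard = split_hard_instances(instances, threshold=0.5)
# kept == ["img_1"], hard == ["img_2"]
```

In this toy setup the first instance has identical normalized attribution and rationale, so its penalty is zero, while the second attends to a concept the annotators did not mark and is flagged as hard. A distance between distributions is only one plausible choice; a divergence or a mask-based penalty would fit the abstract's description equally well.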

Original language: English
Title of host publication: Proceedings - 2024 IEEE 40th International Conference on Data Engineering, ICDE 2024
Publisher: IEEE Computer Society
Pages: 330-341
Number of pages: 12
ISBN (electronic): 9798350317152
DOI
Publication status: Published - 2024
Event: 40th IEEE International Conference on Data Engineering, ICDE 2024 - Utrecht, Netherlands
Duration: 13 May 2024 to 17 May 2024

Publication series

Name: Proceedings - International Conference on Data Engineering
ISSN (print): 1084-4627
ISSN (electronic): 2375-0286

Conference

Conference: 40th IEEE International Conference on Data Engineering, ICDE 2024
Country/Territory: Netherlands
City: Utrecht
Period: 13/05/24 to 17/05/24
