TY - GEN
T1 - RA3
T2 - 40th IEEE International Conference on Data Engineering, ICDE 2024
AU - Chai, Lei
AU - Qi, Lu
AU - Sun, Hailong
AU - Li, Jingzheng
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Interpreting model behavior is crucial for model evaluation and optimization. Recent research demonstrates that incorporating human intelligence into the learning process effectively improves the interpretability and performance of machine learning models, especially for simple classification tasks. However, the image captioning task has not received much attention. Such complex sequential tasks generally contain semantic relationships between different concepts, which pose challenges for interpreting model behavior and developing optimization methods. In this paper, we present RA3 (Relation-Aware Attribution Analysis), a human-in-the-loop framework for improving the interpretability and further boosting the performance of image captioning models. Specifically, we first engage human participants in two types of annotation tasks to identify what the model actually focuses on (model attribution) and what it should focus on (human rationale) at the conceptual level, supported by machine learning interpretability methods. Then, we identify and filter hard instances based on relation-aware model attribution, both to validate the quality of the explanations and to eliminate low-quality captions (this process can also be regarded as a form of data debugging). We subsequently design an explanation loss that penalizes the difference between model attribution and human rationale to optimize the model's behavior and improve caption quality. Extensive experiments on crowdsourced annotations and MSCOCO indicate that the explanations produced by RA3 accurately describe the model's behavior, effectively identify difficult instances, and significantly improve caption quality.
AB - Interpreting model behavior is crucial for model evaluation and optimization. Recent research demonstrates that incorporating human intelligence into the learning process effectively improves the interpretability and performance of machine learning models, especially for simple classification tasks. However, the image captioning task has not received much attention. Such complex sequential tasks generally contain semantic relationships between different concepts, which pose challenges for interpreting model behavior and developing optimization methods. In this paper, we present RA3 (Relation-Aware Attribution Analysis), a human-in-the-loop framework for improving the interpretability and further boosting the performance of image captioning models. Specifically, we first engage human participants in two types of annotation tasks to identify what the model actually focuses on (model attribution) and what it should focus on (human rationale) at the conceptual level, supported by machine learning interpretability methods. Then, we identify and filter hard instances based on relation-aware model attribution, both to validate the quality of the explanations and to eliminate low-quality captions (this process can also be regarded as a form of data debugging). We subsequently design an explanation loss that penalizes the difference between model attribution and human rationale to optimize the model's behavior and improve caption quality. Extensive experiments on crowdsourced annotations and MSCOCO indicate that the explanations produced by RA3 accurately describe the model's behavior, effectively identify difficult instances, and significantly improve caption quality.
KW - Crowdsourcing
KW - Explanation-guided learning
KW - Human-centered explainable AI (HCXAI)
KW - Image captioning
KW - human-in-the-loop
UR - https://www.scopus.com/pages/publications/85200485270
U2 - 10.1109/ICDE60146.2024.00032
DO - 10.1109/ICDE60146.2024.00032
M3 - Conference contribution
AN - SCOPUS:85200485270
T3 - Proceedings - International Conference on Data Engineering
SP - 330
EP - 341
BT - Proceedings - 2024 IEEE 40th International Conference on Data Engineering, ICDE 2024
PB - IEEE Computer Society
Y2 - 13 May 2024 through 17 May 2024
ER -