Unbiased diagnostic report generation via multi-modal counterfactual inference

Research output: Contribution to journal › Article › peer-review

Abstract

Automated diagnostic report generation is a challenging vision-and-language task that requires accurately describing medical images and performing cross-modal causal inference. Despite its clinical importance, widespread application remains difficult. Existing methods often rely on models pre-trained on large-scale medical report datasets. The resulting data shift between training and testing sets introduces irrelevant contextual biases in the visual domain and correlation biases within the knowledge graph. To address these issues, we propose Multimodal Counterfactual Unbiased Report Generation (MCURG), a multimodal approach that uses causal inference to exploit invariant rationales. Our key innovation is the use of counterfactual inference to reduce visual and knowledge biases. MCURG employs a Structural Causal Model (SCM) to describe the relationships among images, knowledge graphs, reports, confounders, and personalized features. We design two debiasing modules, one for the visual domain and one for the knowledge graph. The visual debiasing module focuses on the Total Direct Effect of image features to mitigate confounding factors, while the knowledge graph debiasing module identifies individualized treatments within the graph to reduce spurious generations. Extensive experiments and comprehensive evaluations on multiple datasets demonstrate that MCURG effectively reduces bias and improves the accuracy of generated reports. The code is available at https://github.com/stellating/MCURG.
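The Total Direct Effect mentioned in the abstract can be illustrated with a toy example. The sketch below is not taken from the MCURG code base. It assumes a hypothetical linear scorer over two classes, a zero vector as the counterfactual image, and made-up weights, and it only shows the general idea: compare the prediction for the real image with the prediction for a counterfactual image while holding the context fixed, so that the contribution of the context cancels out.

```python
# Toy illustration of a Total Direct Effect (TDE) style correction.
# All names, weights, and features here are hypothetical, not from MCURG.

def score(image_feat, context_feat, w_img, w_ctx):
    """Toy linear scorer: one logit per class from image and context features."""
    return [
        sum(w * x for w, x in zip(img_row, image_feat))
        + sum(w * c for w, c in zip(ctx_row, context_feat))
        for img_row, ctx_row in zip(w_img, w_ctx)
    ]

def total_direct_effect(image_feat, context_feat, w_img, w_ctx, baseline=None):
    """Factual logits minus logits for a counterfactual image, same context."""
    if baseline is None:
        # Assumption: an all-zero feature vector stands in for "no image evidence".
        baseline = [0.0] * len(image_feat)
    factual = score(image_feat, context_feat, w_img, w_ctx)
    counterfactual = score(baseline, context_feat, w_img, w_ctx)
    return [f - c for f, c in zip(factual, counterfactual)]

def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)

# Class 0 = "normal", class 1 = "abnormal" (hypothetical labels).
W_IMG = [[0.2, 0.1], [0.9, 0.4]]  # image weights, one row per class
W_CTX = [[2.0], [0.1]]            # context weights: strong prior toward class 0

image = [1.0, 0.5]   # image features that favour class 1
context = [1.0]      # contextual feature that favours class 0

factual_logits = score(image, context, W_IMG, W_CTX)
tde_logits = total_direct_effect(image, context, W_IMG, W_CTX)

print("factual prediction:", argmax(factual_logits))
print("TDE prediction:", argmax(tde_logits))
```

In this toy setting the context term dominates the factual logits, while the TDE logits depend only on the image features. A real model is not additive in this way, so the cancellation would be approximate at best.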

Original language: English
Article number: 109639
Journal: Biomedical Signal Processing and Control
Volume: 119
State: Published - 15 Jun 2026

Keywords

  • Artificial intelligence
  • Causal inference
  • Medical report generation
  • Multi-modal learning
  • Unbiased
