MA-HRL: Multi-Agent Hierarchical Reinforcement Learning for Medical Diagnostic Dialogue Systems

  • Xingchuang Liao
  • Yuchen Qin
  • Zhimin Fan
  • Xiaoming Yu
  • Jingbo Yang
  • Rongye Shi
  • Wenjun Wu*
  • *Corresponding author for this work

Research output: Contribution to journal (article, peer-reviewed)

Abstract

Task-oriented medical dialogue systems face two fundamental challenges: the explosion of the state-action space caused by the large number of diseases and symptoms, and the sparsity of informative signals during interactive diagnosis. These issues significantly hinder the accuracy and efficiency of automated clinical reasoning. To address these problems, we propose MA-HRL, a multi-agent hierarchical reinforcement learning framework that decomposes the diagnostic task among specialized agents. A high-level controller coordinates symptom inquiry via multiple worker agents, each targeting a specific disease group, while a two-tier disease classifier refines diagnostic decisions through hierarchical probability reasoning. To combat sparse rewards, we design an information entropy-based reward function that encourages agents to acquire maximally informative symptoms. Additionally, medical knowledge graphs are integrated to guide decision-making and improve dialogue coherence. Experiments on the SymCat-derived SD dataset demonstrate that MA-HRL improves on state-of-the-art baselines by +7.2% in diagnosis accuracy, +0.91% in symptom hit rate, and +15.94% in symptom recognition rate. Ablation studies further verify the effectiveness of each module. This work highlights the potential of hierarchical, knowledge-aware multi-agent systems for interpretable and scalable medical diagnosis.
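
The abstract does not give the formula for the information entropy-based reward. As an illustration only, one common way to express the idea is to reward an inquiry by how much it reduces the Shannon entropy of the current disease distribution. The sketch below assumes a simple Bayesian update from a yes/no symptom answer; the function names, the `likelihoods` parameter, and the update rule are hypothetical and are not taken from the paper.

```python
import math


def entropy(probs):
    """Shannon entropy (in bits) of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def posterior(prior, likelihoods, present):
    """Bayes update of disease probabilities after a yes/no symptom answer.

    prior[i]       -- current probability of disease i
    likelihoods[i] -- assumed P(symptom present | disease i)
    present        -- True if the patient confirms the symptom
    """
    weights = [
        p * (lik if present else 1.0 - lik)
        for p, lik in zip(prior, likelihoods)
    ]
    total = sum(weights)
    if total == 0.0:
        # The answer is impossible under every disease; keep the prior.
        return list(prior)
    return [w / total for w in weights]


def entropy_reward(prior, likelihoods, present):
    """Illustrative reward: reduction in diagnostic uncertainty, in bits."""
    return entropy(prior) - entropy(posterior(prior, likelihoods, present))


if __name__ == "__main__":
    prior = [0.5, 0.5]
    # A symptom that perfectly separates the two diseases removes all uncertainty.
    print(entropy_reward(prior, [1.0, 0.0], present=True))
    # A symptom that is equally likely under both diseases carries no information.
    print(entropy_reward(prior, [0.3, 0.3], present=True))
```

Under this formulation, an uninformative question earns a reward near zero, so the agent receives a dense learning signal at every turn instead of only at the final diagnosis.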

Original language: English
Article number: 3001
Journal: Electronics (Switzerland)
Volume: 14
Issue number: 15
State: Published - Aug 2025

Keywords

  • information entropy reward
  • knowledge graph
  • medical diagnostic dialogue systems
  • multi-agent reinforcement learning
