
Cooperative Allocation of Multi-Domain Heterogeneous Aerial Vehicles Via CASA: A Counterfactual Multi-Agent Reinforcement Learning Model

  • Yaxuan Fu
  • Jia Song*
  • Xindi Tong
  • *Corresponding author for this work
  • Beihang University
  • State Key Laboratory of High-Efficiency Reusable Aerospace Transportation Technology

Research output: Contribution to journal, Article, peer-review

Abstract

To address the cooperative allocation problem among heterogeneous multi-domain UAVs, this paper proposes a counterfactual multi-agent reinforcement learning method named CASA (Counterfactual Allocation for Swarm Agents). Traditional allocation models struggle with the complexity introduced by heterogeneous platforms, while existing deep reinforcement learning approaches in multi-agent environments often suffer from partial observability, environmental non-stationarity, and ambiguous credit assignment. To overcome these challenges, CASA integrates counterfactual multi-agent policy gradients with a centralized-training, decentralized-execution (CTDE) framework, in which a counterfactual baseline enables precise credit assignment and enhances cross-platform coordination. The model introduces a state representation and task formulation that accommodate heterogeneous platform characteristics, and adopts a hierarchical reward structure that jointly optimizes mission completion rate, resource consumption, and success probability, allowing the policy to balance risk, cost, and performance. The method applies to static or quasi-static multi-target allocation scenarios and can be extended to cross-platform swarm coordination, UAV formation decision-making, and emergency response resource scheduling. Simulation results show that CASA achieves better allocation performance than the baseline methods, with stable convergence across multiple problem scales, highlighting its effectiveness in complex multi-agent cooperative allocation scenarios.
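The counterfactual baseline mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard counterfactual multi-agent policy gradient form, in which one agent's advantage is the critic value of the joint action actually taken minus the expectation of the critic value over that agent's own alternative actions, with all other agents' actions held fixed. The abstract does not give CASA's exact formulation, so the function name `counterfactual_advantage`, its parameters, and the numbers below are hypothetical and for illustration only.

```python
from typing import Sequence


def counterfactual_advantage(
    q_values: Sequence[float],
    policy_probs: Sequence[float],
    chosen_action: int,
) -> float:
    """Counterfactual advantage for a single agent (illustrative sketch).

    q_values[k] is a centralized critic's estimate of the joint value if this
    agent had picked action k while every other agent's action stays fixed.
    policy_probs[k] is this agent's own policy probability for action k.
    """
    if len(q_values) != len(policy_probs):
        raise ValueError("q_values and policy_probs must have the same length")
    if not 0 <= chosen_action < len(q_values):
        raise ValueError("chosen_action is out of range")
    # Counterfactual baseline: expected critic value under the agent's own
    # policy, marginalizing out only this agent's action.
    baseline = sum(p * q for p, q in zip(policy_probs, q_values))
    return q_values[chosen_action] - baseline


# Hypothetical numbers: one vehicle choosing among three candidate targets.
q = [1.0, 3.0, 2.0]   # assumed critic values, one per candidate target
pi = [0.2, 0.5, 0.3]  # assumed policy probabilities for the same targets
adv = counterfactual_advantage(q, pi, chosen_action=1)
```

Because the baseline depends only on the other agents' actions and on this agent's policy, the advantage averages to zero under the agent's own policy. The baseline therefore isolates the agent's individual contribution without biasing the policy gradient.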

Original language: English
Journal: Advances in Astronautics
State: Accepted/In press - 2026

Keywords

  • Counterfactual advantage
  • Heterogeneous platforms
  • Multi-agent reinforcement learning
  • Target assignment
