Offline Reinforcement Learning with Constrained Hybrid Action Implicit Representation Towards Wargaming Decision-Making

  • Beihang University
  • Zhongguancun Laboratory

Research output: Contribution to journal › Article › peer-review

Abstract

Reinforcement Learning (RL) has emerged as a promising data-driven solution for wargaming decision-making. However, two domain-specific challenges remain: (1) handling discrete-continuous hybrid control in wargaming, and (2) accelerating RL deployment by exploiting rich offline data. Existing RL methods fail to address both issues simultaneously; we therefore propose a novel offline RL method for hybrid action spaces. A new constrained action representation technique builds a bidirectional mapping between the original hybrid action space and a latent space in a semantically consistent way. This allows a continuous latent policy to be learned with offline RL, with better exploration feasibility and scalability, and then reconstructed back into the required hybrid policy. Critically, a novel offline RL optimization objective with adaptively adjusted constraints is designed to balance mitigating out-of-distribution actions against generalizing to them. Our method demonstrates superior performance and generality across different tasks, particularly in typical realistic wargaming scenarios.
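
As a rough illustration of what a bidirectional mapping between a hybrid action space and a continuous latent space can look like, the toy sketch below encodes a (discrete type, continuous parameter) pair as a latent vector and decodes any latent vector back to a valid hybrid action. It is not the paper's method: the hand-written `EMBEDDINGS` table, the action names, the `[-1, 1]` parameter range, and the nearest-neighbor decoding rule are all assumptions made for illustration, whereas the abstract describes a learned, constrained representation.

```python
import math

# Hypothetical embedding table: one latent vector per discrete action type.
# A learned representation would train these; here they are fixed by hand.
EMBEDDINGS = {
    "move": (1.0, 0.0),
    "fire": (0.0, 1.0),
    "hold": (-1.0, -1.0),
}


def encode(discrete, param):
    """Map a hybrid action (discrete type, parameter in [-1, 1]) to a latent vector."""
    e = EMBEDDINGS[discrete]
    return (e[0], e[1], param)


def decode(latent):
    """Map any latent vector back to a valid hybrid action.

    The discrete part is the type whose embedding is nearest to the first
    two latent coordinates; the continuous part is clipped to [-1, 1].
    """
    head = tuple(latent[:2])
    discrete = min(EMBEDDINGS, key=lambda k: math.dist(EMBEDDINGS[k], head))
    param = max(-1.0, min(1.0, latent[2]))
    return discrete, param
```

Under these assumptions, a policy can output an arbitrary point in the three-dimensional latent space and `decode` always yields an executable hybrid action, while `decode(encode(a))` recovers `a` for any in-range action.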

Original language: English
Pages (from-to): 1422-1440
Number of pages: 19
Journal: Tsinghua Science and Technology
Volume: 29
Issue number: 5
DOIs
State: Published - 1 Oct 2024

Keywords

  • decision-making
  • hybrid action space
  • offline Reinforcement Learning (RL)
  • wargaming
