A fuzzy deterministic policy gradient algorithm for pursuit-evasion differential games

  • Beihang University

Research output: Contribution to journal › Article › peer-review

Abstract

Fuzzy inference systems combined with reinforcement learning are currently used in differential games to train agents that have no prior experience. However, reinforcement learning algorithms based on the actor-critic structure have the drawback that the policy depends on a probability distribution. In this paper, a novel fuzzy deterministic policy gradient algorithm is introduced and applied to classical 1-vs-1 constant-velocity pursuit-evasion differential games. The key goal is to self-learn the optimal strategy in the continuous action domain while giving the fuzzy rules a specific physical meaning. The proposed algorithm is based on the deterministic policy gradient theorem, and the agent learns a near-optimal strategy under the actor-critic structure. Fuzzy inference systems are used as the approximators, so that a specific physical meaning can be read from the linguistic fuzzy rules. The proposed algorithm is then applied to the decision-making problem of pursuit-evasion differential games. Compared with existing algorithms, the results show that the proposed algorithm achieves better precision and convergence efficiency.
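The abstract gives no equations, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a zero-order Takagi-Sugeno fuzzy system with Gaussian membership functions as the deterministic actor, a scalar state, and a fixed analytic stand-in critic Q(s, a) = -(a - s)^2 in place of a learned one. The names `CENTERS`, `WIDTH`, `rule_weights`, `actor`, `dq_da` and `train` are hypothetical.

```python
import math
import random

CENTERS = [-1.0, -0.5, 0.0, 0.5, 1.0]  # hypothetical fuzzy-set centers
WIDTH = 0.5                            # hypothetical shared set width


def rule_weights(s):
    """Normalized firing strength of each fuzzy rule for a scalar state s."""
    raw = [math.exp(-((s - c) / WIDTH) ** 2) for c in CENTERS]
    total = sum(raw)
    return [r / total for r in raw]


def actor(s, theta):
    """Deterministic action: firing-strength-weighted mean of the rule consequents."""
    return sum(w * t for w, t in zip(rule_weights(s), theta))


def dq_da(s, a):
    """Action gradient of the stand-in critic Q(s, a) = -(a - s)**2 (an assumption)."""
    return -2.0 * (a - s)


def train(steps=5000, lr=0.1, seed=0):
    """Update the rule consequents with a deterministic policy gradient step."""
    rng = random.Random(seed)
    theta = [0.0] * len(CENTERS)
    for _ in range(steps):
        s = rng.uniform(-1.0, 1.0)
        a = actor(s, theta)
        grad_q = dq_da(s, a)
        # For a zero-order Takagi-Sugeno actor, d(action)/d(theta_i) is the
        # normalized firing strength of rule i, so the chain rule gives:
        weights = rule_weights(s)
        theta = [t + lr * grad_q * w for t, w in zip(theta, weights)]
    return theta


theta = train()
```

After training, each entry of `theta` is the action attached to one linguistic rule (for example "if the state is near 0.5, act with `theta[3]`"), which is the kind of readable rule base the abstract refers to. With this particular stand-in critic the best action equals the state, so `actor(s, theta)` should roughly track `s` on the interval [-1, 1].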

Original language: English
Pages (from-to): 106-117
Number of pages: 12
Journal: Neurocomputing
Volume: 362
State: Published - 14 Oct 2019

Keywords

  • Deterministic policy gradient
  • Differential game
  • Fuzzy inference system
  • Fuzzy logic
  • Reinforcement learning
