
A fuzzy deterministic policy gradient algorithm for pursuit-evasion differential games

  • Beihang University

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Fuzzy inference systems combined with reinforcement learning are currently used in differential games to train agents that have no prior experience. However, reinforcement learning algorithms based on the actor-critic structure have the drawback that the policy depends on a probability distribution. In this paper, a novel fuzzy deterministic policy gradient algorithm is introduced and applied to classical 1-vs-1 constant-velocity pursuit-evasion differential games. The key goal is to self-learn the optimal strategy in the continuous action domain and to give the fuzzy rules a specific physical meaning. The proposed algorithm is based on the deterministic policy gradient theorem, and the agent learns a near-optimal strategy under the actor-critic structure. Fuzzy inference systems are used as the approximators, so that a specific physical meaning can be read from the linguistic fuzzy rules. The proposed algorithm is then applied to the decision-making problem of pursuit-evasion differential games. A comparison with existing algorithms shows that the proposed algorithm achieves better precision and convergence efficiency.
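The idea of pairing a fuzzy inference system with a deterministic policy gradient can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a zero-order Takagi-Sugeno actor with Gaussian membership functions over a single scalar state, and it replaces the learned critic with a known toy function Q(s, a) = -(a - s)^2, chosen only so that the best action is easy to check. The names `FuzzyActor`, `toy_dq_da`, and `train`, as well as all numeric settings, are hypothetical.

```python
import math
import random


class FuzzyActor:
    """Zero-order Takagi-Sugeno fuzzy system: state -> one deterministic action."""

    def __init__(self, centers, sigma):
        self.centers = list(centers)                   # Gaussian membership centers
        self.sigma = sigma                             # shared membership width
        self.consequents = [0.0] * len(self.centers)   # learnable rule outputs

    def firing(self, s):
        """Normalized firing strength of each rule for state s."""
        w = [math.exp(-((s - c) / self.sigma) ** 2) for c in self.centers]
        total = sum(w)
        return [wi / total for wi in w]

    def act(self, s):
        """Defuzzified action: firing-weighted average of the rule consequents."""
        return sum(p * k for p, k in zip(self.firing(s), self.consequents))

    def update(self, s, dq_da, lr):
        """One deterministic policy gradient step on the consequents.

        d(action)/d(consequent_l) is the normalized firing strength of rule l,
        so each consequent moves by lr * dQ/da * firing_l.
        """
        for l, p in enumerate(self.firing(s)):
            self.consequents[l] += lr * dq_da * p


def toy_dq_da(s, a):
    """Action gradient of the illustrative critic Q(s, a) = -(a - s)**2.

    Assumption: a real actor-critic agent would learn Q instead.
    """
    return -2.0 * (a - s)


def train(steps=20000, lr=0.1, seed=0):
    """Train the actor on states sampled uniformly from [-1, 1]."""
    rng = random.Random(seed)
    actor = FuzzyActor(centers=[-1.0, -0.5, 0.0, 0.5, 1.0], sigma=0.5)
    for _ in range(steps):
        s = rng.uniform(-1.0, 1.0)
        a = actor.act(s)                      # deterministic action, no sampling
        actor.update(s, toy_dq_da(s, a), lr)
    return actor


if __name__ == "__main__":
    actor = train()
    for s in (-0.6, 0.0, 0.3):
        print(f"state={s:+.1f}  action={actor.act(s):+.3f}")
```

Because the action is a weighted average of the rule consequents, each trained consequent can be read as a linguistic rule of the form "if the state is near center c, then act with value k", which is the kind of interpretability the abstract refers to.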

Original language: English
Pages (from-to): 106-117
Number of pages: 12
Journal: Neurocomputing
Volume: 362
DOI
Publication status: Published - 14 Oct 2019
