
An Efficient MADDPG with Episode-Parallel Interaction and Dual Priority Experience Replay

  • Ping Zhou
  • Hui Lu*
  • *Corresponding author for this work
  • Beihang University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Multi-agent Deep Deterministic Policy Gradient (MADDPG) is a common multi-agent deep reinforcement learning algorithm applied in both cooperative and competitive scenarios. However, frequent interaction with the environment and indiscriminate sampling of training data lead to poor training efficiency and low convergence performance. To overcome these limitations, this paper proposes an efficient MADDPG with episode-parallel interaction and dual priority experience replay (EIDPER-MADDPG), which achieves better convergence performance in a shorter training time. First, we devise a parallel interaction architecture that uses multiple processes to collect experiences and learn from them repeatedly in one sampling. Second, considering the contribution of samples from two perspectives, model training and task scenarios, we redesign experience replay with a dual priority scheme for evaluating sample importance, which provides more valuable samples for training and improves convergence performance. Finally, we conduct simulations to demonstrate the effectiveness of the proposed algorithm in terms of training efficiency and convergence performance.
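
The abstract does not specify how the two priorities are computed or combined, so the following is only a minimal sketch of the general idea of sampling by a combined priority. It assumes, hypothetically, that the absolute TD error stands in for the model-training perspective and a non-negative task score (for example, a scaled reward) stands in for the task-scenario perspective. The class name `DualPriorityBuffer` and the parameters `mix`, `alpha`, and `eps` are illustrative and are not taken from the paper.

```python
import random
from dataclasses import dataclass, field


@dataclass
class DualPriorityBuffer:
    """Toy replay buffer that mixes two hypothetical priority signals."""
    capacity: int = 10_000
    mix: float = 0.5    # assumed weight between the two priority signals
    alpha: float = 0.6  # prioritization exponent, as in standard prioritized replay
    eps: float = 1e-3   # keeps every priority strictly positive
    items: list = field(default_factory=list)

    def add(self, transition, td_error, task_score):
        # Assumption: |TD error| represents the model-training view and a
        # non-negative task_score represents the task-scenario view.
        raw = self.mix * abs(td_error) + (1.0 - self.mix) * max(task_score, 0.0)
        if len(self.items) >= self.capacity:
            self.items.pop(0)  # drop the oldest transition
        self.items.append((transition, (raw + self.eps) ** self.alpha))

    def sample(self, batch_size, rng=random):
        # Draw with replacement, with probability proportional to priority.
        weights = [priority for _, priority in self.items]
        picks = rng.choices(self.items, weights=weights, k=batch_size)
        return [transition for transition, _ in picks]
```

Under these assumptions, a transition with a large TD error or task score is drawn far more often than one with both near zero. The importance-sampling correction that standard prioritized replay uses to offset sampling bias is omitted here for brevity.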

Original language: English
Title of host publication: Proceedings of 2023 7th Chinese Conference on Swarm Intelligence and Cooperative Control - Swarm Decision and Planning Technologies
Editors: Xiaoduo Li, Xun Song, Yingjiang Zhou
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 527-538
Number of pages: 12
ISBN (Print): 9789819733354
DOI
Publication status: Published - 2024
Event: 7th Chinese Conference on Swarm Intelligence and Cooperative Control, CCSICC 2023 - Nanjing, China
Duration: 24 Nov 2023 to 27 Nov 2023

Publication series

Name: Lecture Notes in Electrical Engineering
Volume: 1207 LNEE
ISSN (Print): 1876-1100
ISSN (Electronic): 1876-1119

Conference

Conference: 7th Chinese Conference on Swarm Intelligence and Cooperative Control, CCSICC 2023
Country/Territory: China
City: Nanjing
Period: 24/11/23 to 27/11/23
