TY - GEN
T1 - Multi-UAV Path Planning Based on Multi-Agent Deep Reinforcement Learning
AU - Sun, Zeyang
AU - Luo, Xiling
AU - Ji, Xiaohai
AU - Zhao, Wang
AU - Lei, Yue
AU - Bai, Gengyi
N1 - Publisher Copyright: © 2025 The Authors.
PY - 2025/7/17
Y1 - 2025/7/17
N2 - With the widespread adoption of unmanned aerial vehicles (UAVs) in various fields, UAV path planning is a prerequisite for safe and efficient flight. Existing methods rely on gridded airspace and detailed modeling of a specific environment, which limits their portability and their flexibility in responding to diverse tasks and environments. They use discrete action spaces and neglect the dynamic characteristics of UAVs, which leads to relatively coarse paths. Furthermore, current studies give insufficient consideration to the reliance on spatiotemporal data and disregard multi-UAV collisions and unstable training environments, which hinders collaboration among UAVs. In response to these challenges, this paper proposes an Enhanced Multi-Agent Twin Delayed Deep Deterministic Policy Gradient (Enhanced-MATD3) algorithm. We model a state space according to the 3D scene and design a continuous action space. Further, we devise a reward function based on obstacle and collision avoidance principles. By enhancing the actor and critic networks along both temporal and spatial dimensions, we broaden the UAVs' perception range and sharpen their responsiveness to dynamic environmental changes. We use different scenarios during training, and the experimental results demonstrate the advantages of our algorithm over the baseline in terms of average path length, planning time, and convergence speed.
AB - With the widespread adoption of unmanned aerial vehicles (UAVs) in various fields, UAV path planning is a prerequisite for safe and efficient flight. Existing methods rely on gridded airspace and detailed modeling of a specific environment, which limits their portability and their flexibility in responding to diverse tasks and environments. They use discrete action spaces and neglect the dynamic characteristics of UAVs, which leads to relatively coarse paths. Furthermore, current studies give insufficient consideration to the reliance on spatiotemporal data and disregard multi-UAV collisions and unstable training environments, which hinders collaboration among UAVs. In response to these challenges, this paper proposes an Enhanced Multi-Agent Twin Delayed Deep Deterministic Policy Gradient (Enhanced-MATD3) algorithm. We model a state space according to the 3D scene and design a continuous action space. Further, we devise a reward function based on obstacle and collision avoidance principles. By enhancing the actor and critic networks along both temporal and spatial dimensions, we broaden the UAVs' perception range and sharpen their responsiveness to dynamic environmental changes. We use different scenarios during training, and the experimental results demonstrate the advantages of our algorithm over the baseline in terms of average path length, planning time, and convergence speed.
KW - Enhanced-MATD3
KW - multi-agent deep reinforcement learning
KW - low-altitude airspace
KW - path planning
UR - https://www.scopus.com/pages/publications/105016843392
U2 - 10.3233/ATDE250488
DO - 10.3233/ATDE250488
M3 - Conference contribution
AN - SCOPUS:105016843392
T3 - Advances in Transdisciplinary Engineering
SP - 882
EP - 894
BT - Intelligent Transportation Engineering - Proceedings of the 9th International Conference, ICITE 2024
A2 - Mao, Guoqiang
PB - IOS Press BV
T2 - 9th International Conference on Intelligent Transportation Engineering, ICITE 2024
Y2 - 18 October 2024 through 20 October 2024
ER -