TY - JOUR
T1 - Multi-AUV Pursuit-Evasion Game in the Internet of Underwater Things
T2 - An Efficient Training Framework via Offline Reinforcement Learning
AU - Xu, Jingzehua
AU - Zhang, Zekai
AU - Wang, Jingjing
AU - Han, Zhu
AU - Ren, Yong
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2024
Y1 - 2024
AB - In this article, we investigate the pursuit-evasion game of multiple autonomous underwater vehicles (AUVs) in a complex ocean environment. The pursuer AUVs must optimize their trajectories to avoid obstacles and dangerous vortex regions while pursuing the escaper AUV. The pursuers and the escaper can sense each other with limited detection capabilities, which informs further pursuit or escape. Because the underwater pursuit-evasion (UPE) game is a high-dimensional NP-hard problem, we transform it into a finite-horizon Markov game process and propose an efficient training framework, with decentralized training and decentralized execution, based on offline reinforcement learning. For training, we propose a multiagent independent soft actor-critic to facilitate policy improvement and generate the offline data set, and a multiagent independent decision transformer for model training in the UPE game. Extensive simulations demonstrate the scalability and generalization ability of the proposed framework, which achieves excellent performance in UPE games under different conditions and environments with only a few AUVs participating in policy improvement to generate the high-quality offline data set.
KW - Autonomous underwater vehicle (AUV)
KW - decision transformer (DT)
KW - finite-horizon Markov game process (FMGP)
KW - offline reinforcement learning (ORL)
KW - pursuit-evasion game
UR - https://www.scopus.com/pages/publications/85196558251
U2 - 10.1109/JIOT.2024.3416616
DO - 10.1109/JIOT.2024.3416616
M3 - Article
AN - SCOPUS:85196558251
SN - 2327-4662
VL - 11
SP - 31273
EP - 31286
JO - IEEE Internet of Things Journal
JF - IEEE Internet of Things Journal
IS - 19
ER -