TY - JOUR
T1 - Attitude control for hypersonic reentry vehicles
T2 - An efficient deep reinforcement learning method
AU - Liu, Yiheng
AU - Wang, Honglun
AU - Wu, Tiancai
AU - Lun, Yuebin
AU - Fan, Jiaxuan
AU - Wu, Jianfa
N1 - Publisher Copyright: © 2022 Elsevier B.V.
PY - 2022/7
Y1 - 2022/7
N2 - This paper proposes a deep reinforcement learning (DRL) based anti-disturbance control method for the attitude control of hypersonic reentry vehicles (HRVs). First, a compound control framework consisting of a DRL-based auxiliary controller and a fixed-time anti-disturbance controller is proposed to improve control performance while ensuring stability. Then, a novel value function approximation mechanism, named experience-based value expansion (EVE), is proposed to modify the value function update equation based on a two-dimensional replay buffer, which solves the DRL convergence problem caused by the HRV's strong nonlinearities, tight coupling, and large flight envelope. Furthermore, a result-oriented encoder (ROE) is proposed to solve the DRL generalization problem caused by the HRV's high uncertainties and the unavailability of a real training environment. A bottleneck-shaped neural network structure is used for the DRL networks to extract high-dimensional features and prevent overfitting to the training environment. Finally, extensive comparative numerical simulations demonstrate the effectiveness of the proposed efficient DRL algorithms and the DRL-based attitude controller.
AB - This paper proposes a deep reinforcement learning (DRL) based anti-disturbance control method for the attitude control of hypersonic reentry vehicles (HRVs). First, a compound control framework consisting of a DRL-based auxiliary controller and a fixed-time anti-disturbance controller is proposed to improve control performance while ensuring stability. Then, a novel value function approximation mechanism, named experience-based value expansion (EVE), is proposed to modify the value function update equation based on a two-dimensional replay buffer, which solves the DRL convergence problem caused by the HRV's strong nonlinearities, tight coupling, and large flight envelope. Furthermore, a result-oriented encoder (ROE) is proposed to solve the DRL generalization problem caused by the HRV's high uncertainties and the unavailability of a real training environment. A bottleneck-shaped neural network structure is used for the DRL networks to extract high-dimensional features and prevent overfitting to the training environment. Finally, extensive comparative numerical simulations demonstrate the effectiveness of the proposed efficient DRL algorithms and the DRL-based attitude controller.
KW - Anti-disturbance control
KW - Attitude control
KW - Deep reinforcement learning
KW - Hypersonic reentry vehicle
KW - Value function approximation
UR - https://www.scopus.com/pages/publications/85129957420
U2 - 10.1016/j.asoc.2022.108865
DO - 10.1016/j.asoc.2022.108865
M3 - Article
AN - SCOPUS:85129957420
SN - 1568-4946
VL - 123
JO - Applied Soft Computing
JF - Applied Soft Computing
M1 - 108865
ER -