TY - JOUR
T1 - A dynamic multi-task selective execution policy considering stochastic dependence between degradation and random shocks by deep reinforcement learning
AU - Liu, Lujie
AU - Yang, Jun
AU - Zheng, Huiling
AU - Li, Lei
AU - Wang, Ning
N1 - Publisher Copyright: © 2025 Elsevier Ltd
PY - 2025/5
Y1 - 2025/5
AB - To improve the efficiency of UAVs, it is common for a UAV to perform multiple tasks during each departure. However, existing mission abort policies primarily address scenarios in which the system executes a single task and are not suitable for more complex multi-task missions. Moreover, in practice, degradation and random shocks often occur simultaneously, whereas existing studies typically consider only their separate effects on mission abort policies. To address these problems, a multi-task selective execution policy that considers the stochastic dependence between degradation and shocks is proposed to determine either the next task for the system or the timing of mission abort. First, the policy is defined in terms of the health state of the system, its location information, and the completion state of the tasks. Next, to maximize the cumulative reward of the system, the corresponding sequential decision problem is formulated as a Markov decision process. Then, to address the curse of dimensionality arising from the continuous state space, a solution method based on deep reinforcement learning is tailored to the problem, incorporating an action masking technique that prevents repeated selection of already executed tasks. Finally, the effectiveness of the proposed method is verified through a numerical study of a UAV performing multiple reconnaissance tasks.
KW - Deep reinforcement learning
KW - Degradation
KW - Mission abort
KW - Multi-task selective execution policy
KW - Random shocks
KW - Stochastic dependence
UR - https://www.scopus.com/pages/publications/85215829405
U2 - 10.1016/j.ress.2025.110844
DO - 10.1016/j.ress.2025.110844
M3 - Article
AN - SCOPUS:85215829405
SN - 0951-8320
VL - 257
JO - Reliability Engineering and System Safety
JF - Reliability Engineering and System Safety
M1 - 110844
ER -