TY - JOUR
T1 - FMMDP
T2 - failure monitoring approach for DNN-based Markov decision process
AU - Cai, Yi
AU - Lin, Weibin
AU - Jing, Chao
AU - Liu, Zhihao
AU - Zheng, Zheng
N1 - Publisher Copyright:
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025.
PY - 2026/4
Y1 - 2026/4
N2 - Markov Decision Process (MDP) serves as the fundamental mathematical framework for numerous sequential decision-making problems in real-world applications, with extensive implementation in complex Cyber-Physical Systems (CPS) such as autonomous driving and robotic control, where safety assurance is paramount. Building upon DRLFailureMonitor, our previous work on failure monitoring for Deep Reinforcement Learning methods, this paper presents FMMDP, a novel grey-box failure monitoring framework applicable to all MDP-based decision processes. FMMDP models failure evolution processes by capturing state-action sequences and employs multivariate time series classification techniques to learn failure patterns. Our framework introduces a state extraction module, enabling broad applicability across diverse MDP environments including both sensor-based and vision-based systems. Comprehensive evaluation across six distinct environments demonstrates FMMDP’s superior monitoring capabilities, achieving perfect recall (1.0) with an average false positive rate of merely 0.024, while providing an average advance warning of 19 timesteps before failures occur. Comparative analysis against ThirdEye and MC-Dropout reveals FMMDP’s significant advantages in both accuracy and computational efficiency. Parameter sensitivity studies indicate that state-action compositions offer an optimal balance between precision and warning time, with sequence lengths of 20-30 timesteps yielding optimal performance across most environments.
AB - Markov Decision Process (MDP) serves as the fundamental mathematical framework for numerous sequential decision-making problems in real-world applications, with extensive implementation in complex Cyber-Physical Systems (CPS) such as autonomous driving and robotic control, where safety assurance is paramount. Building upon DRLFailureMonitor, our previous work on failure monitoring for Deep Reinforcement Learning methods, this paper presents FMMDP, a novel grey-box failure monitoring framework applicable to all MDP-based decision processes. FMMDP models failure evolution processes by capturing state-action sequences and employs multivariate time series classification techniques to learn failure patterns. Our framework introduces a state extraction module, enabling broad applicability across diverse MDP environments including both sensor-based and vision-based systems. Comprehensive evaluation across six distinct environments demonstrates FMMDP’s superior monitoring capabilities, achieving perfect recall (1.0) with an average false positive rate of merely 0.024, while providing an average advance warning of 19 timesteps before failures occur. Comparative analysis against ThirdEye and MC-Dropout reveals FMMDP’s significant advantages in both accuracy and computational efficiency. Parameter sensitivity studies indicate that state-action compositions offer an optimal balance between precision and warning time, with sequence lengths of 20-30 timesteps yielding optimal performance across most environments.
KW - Failure monitoring
KW - Markov decision process
KW - Multivariate time series classification
UR - https://www.scopus.com/pages/publications/105024063125
U2 - 10.1007/s10664-025-10757-4
DO - 10.1007/s10664-025-10757-4
M3 - Article
AN - SCOPUS:105024063125
SN - 1382-3256
VL - 31
JO - Empirical Software Engineering
JF - Empirical Software Engineering
IS - 2
M1 - 36
ER -