TY - JOUR
T1 - An online inverse optimal control method for learning human behavior in a class of noisy discrete-time linear HiTL systems
AU - Li, Wen Hua
AU - Wu, Huai Ning
N1 - Publisher Copyright: © 2026 Elsevier B.V.
PY - 2026/3/28
Y1 - 2026/3/28
N2 - The main goal of human behavior learning (HBL) is to enable machines to better understand and imitate human behavior, thereby enhancing their adaptability and intelligence. This paper explores the real-time learning of human behavior in noisy, discrete-time, linear human-in-the-loop (HiTL) systems by combining moving horizon estimation (MHE), recursive least squares (RLS) and linear matrix inequality (LMI)-based optimization. We assume that human behavior can be modeled using a discrete-time optimal control (DTOC) framework, where the quadratic cost function includes an unknown weighting matrix that characterizes the human decision-making process. To mitigate the effects of noise, we first employ MHE to estimate the state trajectory by minimizing an objective function over a moving, fixed-size estimation window. Then, using the estimated states and control inputs, we identify the state feedback gain via RLS, for which we provide a convergence proof. Once the state feedback gain is obtained, we recover the cost matrix via an LMI-based optimization. Finally, simulation results on a steering assistance system for intelligent vehicles validate the proposed approach.
KW - Discrete-time optimal control
KW - Human behavior learning
KW - Inverse optimal control
KW - Moving horizon estimation
KW - Recursive least squares
UR - https://www.scopus.com/pages/publications/105027401626
U2 - 10.1016/j.neucom.2026.132647
DO - 10.1016/j.neucom.2026.132647
M3 - Article
AN - SCOPUS:105027401626
SN - 0925-2312
VL - 671
JO - Neurocomputing
JF - Neurocomputing
M1 - 132647
ER -