Online reinforcement learning for a class of partially unknown continuous-time nonlinear systems via value iteration

  • Hanguang Su
  • Huaguang Zhang*
  • Kun Zhang
  • Wenzhong Gao
  • *Corresponding author for this work
  • Northeastern University China
  • University of Denver

Research output: Contribution to journal, Article, peer-reviewed

Abstract

This paper proposes a modified value iteration-based approximate dynamic programming method for a class of affine nonlinear continuous-time systems whose dynamics are partially unknown. The value iteration algorithm is established in an online fashion, and a convergence proof is given. To attenuate the effect of the unknown system dynamics, an integral reinforcement learning scheme is also used. The proposed method uses a single-network structure to estimate both the value functions and the control policies: the iteration is implemented on an actor/critic structure in which only the critic neural network (NN) needs to be identified. A least-squares scheme is then derived for updating the NN weights. Finally, a linear system and a nonlinear system are tested to evaluate the performance of the proposed online value iteration algorithm. Both examples demonstrate the feasibility and effectiveness of the proposed algorithm.
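
The ingredients named in the abstract (value iteration, an integral reinforcement learning target, and a least-squares critic update) can be illustrated with a minimal sketch. This is not the paper's algorithm. It assumes a hypothetical scalar linear plant x' = a x + b u with cost integrand q x^2 + r u^2, a one-term critic V(x) ≈ w x^2, and a fixed reinforcement interval. All names (`A_TRUE`, `rollout`, `value_iteration`) are invented for the example. The drift `A_TRUE` is used only to generate trajectory data; the learner uses the measured state, the measured cost, and the input gain `B`.

```python
import numpy as np

# Hypothetical scalar plant x' = a*x + b*u (an assumption, not from the paper).
# A_TRUE only generates data; the learner never reads it.
A_TRUE, B, Q, R = 1.0, 1.0, 1.0, 1.0


def rollout(x0, gain, horizon, dt=0.01):
    """Integrate the state and the running cost under u = -gain*x with RK4."""
    def deriv(z):
        x = z[0]
        u = -gain * x
        return np.array([A_TRUE * x + B * u, Q * x**2 + R * u**2])

    z = np.array([x0, 0.0])  # [state, accumulated cost]
    for _ in range(int(round(horizon / dt))):
        k1 = deriv(z)
        k2 = deriv(z + 0.5 * dt * k1)
        k3 = deriv(z + 0.5 * dt * k2)
        k4 = deriv(z + dt * k3)
        z = z + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return z[0], z[1]


def value_iteration(horizon=0.2, iters=100, x0s=(-2.0, -1.0, 0.5, 1.0, 2.0)):
    """Integral-RL value iteration with a single quadratic critic weight."""
    w = 0.0  # critic weight, V_i(x) ~ w * x^2, starting from V_0 = 0
    for _ in range(iters):
        # Policy derived from the critic: u = -(1/2) R^-1 b dV/dx = -(b*w/r) x
        gain = B * w / R
        phi, target = [], []
        for x0 in x0s:
            x_end, cost = rollout(x0, gain, horizon)
            phi.append([x0**2])
            # Value-iteration target: measured interval cost + V_i(x(t+T))
            target.append(cost + w * x_end**2)
        # Least-squares fit of the next critic weight to the targets
        sol, *_ = np.linalg.lstsq(np.array(phi), np.array(target), rcond=None)
        w = float(sol[0])
    return w


w_learned = value_iteration()
# Closed-form solution of the scalar Riccati equation 2*a*p - b^2*p^2/r + q = 0
p_riccati = R * (A_TRUE + np.sqrt(A_TRUE**2 + B**2 * Q / R)) / B**2
print(w_learned, p_riccati)
```

Under these assumptions the learned weight can be compared with the Riccati solution as a sanity check. A nonlinear plant would need a richer critic basis than the single x^2 term used here.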

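Original language: English
Pages (from-to): 1011-1028
Number of pages: 18
Journal: Optimal Control Applications and Methods
Volume: 39
Issue number: 2
DOI
Publication status: Published - 1 Mar 2018
Externally published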
