Abstract
This paper proposes a modified value-iteration-based approximate dynamic programming method for a class of affine nonlinear continuous-time systems whose dynamics are partially unknown. The value iteration algorithm is established in an online fashion, and a convergence proof is given. To attenuate the effect of the unknown parts of the system dynamics, an integral reinforcement learning scheme is also used. The proposed method uses a single-network structure to estimate both the value functions and the control policies. That is, the iteration is implemented on an actor/critic structure in which only the critic neural network (NN) needs to be identified. A least-squares scheme is then derived for updating the NN weights. Finally, a linear system and a nonlinear system are tested to evaluate the performance of the proposed online value iteration algorithm. Both examples demonstrate the feasibility and effectiveness of the proposed algorithm.
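To make the ingredients named in the abstract concrete, the following is a minimal sketch of a generic integral-reinforcement-learning value iteration with a critic-only least-squares update. It is not the paper's modified algorithm: the scalar plant, the cost weights, the interval `T`, the single basis function `phi(x) = x^2`, and the initialization `V_0 = 0` are all assumptions chosen for illustration. The learner uses the input gain `b` but never the drift coefficient `a`, which appears only inside the simulated rollout that stands in for measured data.

```python
import math

# Hypothetical scalar plant dx/dt = a*x + b*u; the learner uses b but never a.
A_TRUE, B = -1.0, 1.0
Q, R = 1.0, 1.0          # running cost q*x^2 + r*u^2 (assumed weights)
T, DT = 0.1, 1e-3        # reinforcement interval and Euler step (assumed)


def rollout(x0, w):
    """Run the plant for T seconds under the policy induced by critic weight w.

    Returns the integral cost over [0, T] and the final state.
    """
    x, cost = x0, 0.0
    for _ in range(round(T / DT)):
        u = -(B * w / R) * x              # u = -(1/2) R^-1 b dV/dx with V = w*x^2
        cost += (Q * x * x + R * u * u) * DT
        x += (A_TRUE * x + B * u) * DT    # plant step; stands in for measured data
    return cost, x


def value_iteration(samples, iters=200):
    """Critic-only value iteration: fit w_{i+1} to integral cost + V_i(x(t+T))."""
    w = 0.0                               # assumed initialization V_0 = 0
    for _ in range(iters):
        num = den = 0.0
        for x0 in samples:
            cost, x_end = rollout(x0, w)
            target = cost + w * x_end * x_end
            phi = x0 * x0                 # critic basis phi(x) = x^2
            num += phi * target
            den += phi * phi
        w = num / den                     # closed-form least-squares weight update
    return w


w_star = value_iteration([-2.0, -1.0, 0.5, 1.0, 2.0])
p_riccati = -1.0 + math.sqrt(2.0)         # solves 2*a*p - (b*p)^2/r + q = 0
print(w_star, p_riccati)
```

For this linear-quadratic toy case the learned weight can be compared with the Riccati solution, which requires full model knowledge and is used here only as a check. With one basis function the least-squares step reduces to a ratio of sums; a richer basis would call for a general linear least-squares solver.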
| Original language | English |
|---|---|
| Pages (from-to) | 1011-1028 |
| Number of pages | 18 |
| Journal | Optimal Control Applications and Methods |
| Volume | 39 |
| Issue | 2 |
| DOI | |
| Publication status | Published - 1 Mar 2018 |
| Externally published | Yes |
Fingerprint
Dive into the research topics of 'Online reinforcement learning for a class of partially unknown continuous-time nonlinear systems via value iteration'. Together they form a unique fingerprint.