DOPT: D-Learning with Off-Policy Target toward Sample Efficiency and Fast Convergence Control

  • Zhaolong Shen
  • , Quan Quan*
  • *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

In recent times, Lyapunov theory has been in-corporated into learning-based control methods to provide a stability guarantee. However, merely satisfying the Lyapunov conditions does not fully leverage the capabilities of the Neural Network (NN) controller. Furthermore, training an effective Lyapunov candidate requires substantial data, which inherently results in sample inefficiency. To address these limitations, we propose an off-policy variant of the vanilla D-learning method that uses current and historical data to iteratively enhance the NN controller within the framework of Lyapunov theory. Our method outperforms the Deep Deterministic Policy Gradient (DDPG) and D-learning in terms of stability, sample efficiency, and the quality of the trained controllers and Lyapunov candidates. Link to code: github.com/Shenzhaolong1330/DOPT.

Original languageEnglish
Title of host publication2025 IEEE International Conference on Robotics and Automation, ICRA 2025
EditorsChristian Ott, Henny Admoni, Sven Behnke, Stjepan Bogdan, Aude Bolopion, Youngjin Choi, Fanny Ficuciello, Nicholas Gans, Clement Gosselin, Kensuke Harada, Erdal Kayacan, H. Jin Kim, Stefan Leutenegger, Zhe Liu, Perla Maiolino, Lino Marques, Takamitsu Matsubara, Anastasia Mavromatti, Mark Minor, Jason O'Kane, Hae Won Park, Hae-Won Park, Ioannis Rekleitis, Federico Renda, Elisa Ricci, Laurel D. Riek, Lorenzo Sabattini, Shaojie Shen, Yu Sun, Pierre-Brice Wieber, Katsu Yamane, Jingjin Yu
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages9637-9643
Number of pages7
ISBN (Electronic)9798331541392
DOIs
StatePublished - 2025
Event2025 IEEE International Conference on Robotics and Automation, ICRA 2025 - Atlanta, United States
Duration: 19 May 202523 May 2025

Publication series

NameProceedings - IEEE International Conference on Robotics and Automation
ISSN (Print)1050-4729

Conference

Conference2025 IEEE International Conference on Robotics and Automation, ICRA 2025
Country/TerritoryUnited States
CityAtlanta
Period19/05/2523/05/25

Fingerprint

Dive into the research topics of 'DOPT: D-Learning with Off-Policy Target toward Sample Efficiency and Fast Convergence Control'. Together they form a unique fingerprint.

Cite this