跳到主要导航 跳到搜索 跳到主要内容

From Text to Trajectory: Exploring Complex Constraint Representation and Decomposition in Safe Reinforcement Learning

  • Beihang University
  • Zhongguancun Laboratory

科研成果: 期刊稿件会议文章同行评审

摘要

Safe reinforcement learning (RL) requires the agent to finish a given task while obeying specific constraints. Giving constraints in natural language form has great potential for practical scenarios due to its flexible transfer capability and accessibility. Previous safe RL methods with natural language constraints typically need to design cost functions manually for each constraint, which requires domain expertise and lacks flexibility. In this paper, we harness the dual role of text in this task, using it not only to provide constraint but also as a training signal. We introduce the Trajectory-level Textual Constraints Translator (TTCT) to replace the manually designed cost function. Our empirical results demonstrate that TTCT effectively comprehends textual constraint and trajectory, and the policies trained by TTCT can achieve a lower violation rate than the standard cost function. Extra studies are conducted to demonstrate that the TTCT has zero-shot transfer capability to adapt to constraint-shift environments.

源语言英语
期刊Advances in Neural Information Processing Systems
37
出版状态已出版 - 2024
活动38th Conference on Neural Information Processing Systems, NeurIPS 2024 - Vancouver, 加拿大
期限: 9 12月 202415 12月 2024

指纹

探究 'From Text to Trajectory: Exploring Complex Constraint Representation and Decomposition in Safe Reinforcement Learning' 的科研主题。它们共同构成独一无二的指纹。

引用此