
Chain-of-Imagination for Reliable Instruction Following in Decision Making

  • Enshen Zhou
  • Yiran Qin
  • Zhenfei Yin
  • Zhelun Shi
  • Yuzhou Huang
  • Ruimao Zhang*
  • Lu Sheng*
  • Jing Shao
  • *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Enabling an embodied agent to imagine future states step by step and to approach these situation-aware states sequentially can enhance its capability to make reliable action decisions from textual instructions. In this work, we introduce a simple but effective mechanism called Chain-of-Imagination (CoI), which repeatedly employs a Multimodal Large Language Model (MLLM) equipped with a diffusion model to imagine and act upon a series of intermediate situation-aware visual sub-goals one by one, resulting in more reliable instruction-following capability. Based on the CoI mechanism, we propose an embodied agent, DecisionDreamer, as the low-level controller that can be adapted to different open-world scenarios. Extensive experiments demonstrate that DecisionDreamer can achieve more reliable and accurate decision-making and significantly outperform the state-of-the-art generalist agents in the Minecraft and CALVIN sandbox simulators in instruction-following capability. For more demos, please see https://sites.google.com/view/decisiondreamer.

Original language: English
Title of host publication: IROS 2025 - 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems, Conference Proceedings
Editors: Christian Laugier, Alessandro Renzaglia, Nikolay Atanasov, Stan Birchfield, Grzegorz Cielniak, Leonardo De Mattos, Laura Fiorini, Philippe Giguere, Kenji Hashimoto, Javier Ibanez-Guzman, Tetsushi Kamegawa, Jinoh Lee, Giuseppe Loianno, Kevin Luck, Hisataka Maruyama, Philippe Martinet, Hadi Moradi, Urbano Nunes, Julien Pettre, Alberto Pretto, Tommaso Ranzani, Arne Ronnau, Silvia Rossi, Elliott Rouse, Fabio Ruggiero, Olivier Simonin, Danwei Wang, Ming Yang, Eiichi Yoshida, Huijing Zhao
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 10010-10017
Number of pages: 8
ISBN (Electronic): 9798331543938
DOI
Publication status: Published - 2025
Event: 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2025 - Hangzhou, China
Duration: 19 Oct 2025 to 25 Oct 2025

Publication series

Name: IEEE International Conference on Intelligent Robots and Systems
ISSN (Print): 2153-0858
ISSN (Electronic): 2153-0866

Conference

Conference: 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2025
Country/Territory: China
City: Hangzhou
Period: 19/10/25 to 25/10/25
