
Seek Commonality but Preserve Differences: Dissected Dynamics Modeling for Multi-modal Visual RL

  • Yangru Huang
  • Peixi Peng*
  • Yifan Zhao
  • Guangyao Chen
  • Yonghong Tian*
  • *Corresponding author for this work
  • Peking University
  • Peng Cheng Laboratory

Research output: Contribution to journal, Conference article, Peer-reviewed

Abstract

Accurate environment dynamics modeling is crucial for obtaining effective state representations in visual reinforcement learning (RL) applications. However, when facing multiple input modalities, existing dynamics modeling methods (e.g., DeepMDP) usually stumble in addressing the complex and volatile relationship between different modalities. In this paper, we study the problem of efficient dynamics modeling for multi-modal visual RL. We find that in the presence of modality heterogeneity, modality-correlated and modality-distinct features are equally important but play different roles in reflecting the evolution of environmental dynamics. Motivated by this fact, we propose Dissected Dynamics Modeling (DDM), a novel multi-modal dynamics modeling method for visual RL. Unlike existing methods, DDM explicitly distinguishes consistent and inconsistent information across modalities and treats them separately with a divide-and-conquer strategy. This is done by dispatching the features carrying different information into distinct dynamics modeling pathways, which naturally form a series of implicit regularizations along the learning trajectories. In addition, a reward predictive function is further introduced to filter task-irrelevant information in both modality-consistent and modality-inconsistent features, ensuring information integrity while avoiding potential distractions. Extensive experiments show that DDM consistently achieves competitive performance in challenging multi-modal visual environments. The code is available at https://github.com/Yara-HYR/DDM.

Original language: English
Journal: Advances in Neural Information Processing Systems
Volume: 37
Publication status: Published - 2024
Event: 38th Conference on Neural Information Processing Systems, NeurIPS 2024 - Vancouver, Canada
Duration: 9 Dec 2024 to 15 Dec 2024
