Abstract
Accurate environment dynamics modeling is crucial for obtaining effective state representations in visual reinforcement learning (RL) applications. However, when facing multiple input modalities, existing dynamics modeling methods (e.g., DeepMDP) usually struggle to handle the complex and volatile relationships between modalities. In this paper, we study the problem of efficient dynamics modeling for multi-modal visual RL. We find that, in the presence of modality heterogeneity, modality-correlated and modality-distinct features are equally important but play different roles in reflecting the evolution of environmental dynamics. Motivated by this finding, we propose Dissected Dynamics Modeling (DDM), a novel multi-modal dynamics modeling method for visual RL. Unlike existing methods, DDM explicitly distinguishes consistent and inconsistent information across modalities and treats them separately with a divide-and-conquer strategy. This is done by dispatching the features carrying different information into distinct dynamics modeling pathways, which naturally form a series of implicit regularizations along the learning trajectories. In addition, a reward predictive function is introduced to filter task-irrelevant information in both modality-consistent and modality-inconsistent features, ensuring information integrity while avoiding potential distractions. Extensive experiments show that DDM consistently achieves competitive performance in challenging multi-modal visual environments. The code is available at https://github.com/Yara-HYR/DDM.
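To make the divide-and-conquer idea in the abstract more concrete, the following is a toy sketch only, not the authors' implementation. It assumes two hypothetical modalities (`rgb` and `depth`), a fixed index split of each feature vector into a modality-consistent and a modality-specific part, linear transition models, and a linear reward head. All names, sizes, and loss terms are illustrative assumptions; the actual architecture and objectives are in the paper and the linked repository.

```python
import numpy as np

# Hypothetical sizes: per-modality feature dim, consistent-part dim, action dim.
FEAT, SHARED, ACT = 8, 4, 2


def dissect(z):
    """Assumed split: first SHARED dims are 'consistent', the rest 'specific'."""
    return z[:SHARED], z[SHARED:]


def linear_step(W, feat, action):
    """Toy latent transition: next_feat ~ W @ [feat; action]."""
    return W @ np.concatenate([feat, action])


def mse(a, b):
    return float(np.mean((a - b) ** 2))


def init_params(rng):
    """Random toy parameters for the two pathways and the reward head."""
    spec = FEAT - SHARED
    return {
        "W_shared": rng.normal(size=(SHARED, SHARED + ACT)),  # one model for both modalities
        "W_rgb": rng.normal(size=(spec, spec + ACT)),         # modality-specific model
        "W_depth": rng.normal(size=(spec, spec + ACT)),       # modality-specific model
        "w_reward": rng.normal(size=2 * FEAT),                # linear reward head
    }


def toy_loss(params, z_rgb, z_depth, action, z_rgb_next, z_depth_next, reward):
    c_rgb, s_rgb = dissect(z_rgb)
    c_dep, s_dep = dissect(z_depth)
    c_rgb_n, s_rgb_n = dissect(z_rgb_next)
    c_dep_n, s_dep_n = dissect(z_depth_next)

    # Pathway 1 (assumption): consistent features share a single transition model.
    l_cons = (mse(linear_step(params["W_shared"], c_rgb, action), c_rgb_n)
              + mse(linear_step(params["W_shared"], c_dep, action), c_dep_n))

    # Pathway 2 (assumption): each modality's specific features get their own model.
    l_spec = (mse(linear_step(params["W_rgb"], s_rgb, action), s_rgb_n)
              + mse(linear_step(params["W_depth"], s_dep, action), s_dep_n))

    # Reward prediction from all features, so both parts stay task-relevant.
    feats = np.concatenate([c_rgb, c_dep, s_rgb, s_dep])
    l_rew = float((params["w_reward"] @ feats - reward) ** 2)

    return {"consistent": l_cons, "specific": l_spec, "reward": l_rew,
            "total": l_cons + l_spec + l_rew}
```

In this sketch the shared transition model acts as a soft constraint tying the consistent features of both modalities to the same dynamics, while the separate models leave room for modality-specific behavior; this is one plausible reading of "distinct dynamics modeling pathways", not a description of DDM's actual losses.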
| Original language | English |
|---|---|
| Journal | Advances in Neural Information Processing Systems |
| Volume | 37 |
| Publication status | Published - 2024 |
| Event | 38th Conference on Neural Information Processing Systems, NeurIPS 2024 - Vancouver, Canada. Duration: 9 Dec 2024 → 15 Dec 2024 |