QUEST: Quadruple Multimodal Contrastive Learning with Constraints and Self-Penalization

  • Beihang University
  • Zhongguancun Laboratory

Research output: Contribution to journal, Conference article, peer-reviewed

Abstract

Multimodal contrastive learning (MCL) has recently demonstrated significant success across various tasks. However, existing MCL methods treat all negative samples equally and ignore their potential semantic association with positive samples, which limits the model's ability to achieve fine-grained alignment. In multi-view scenarios, MCL tends to prioritize shared information while neglecting modality-specific unique information across different views, leading to feature suppression and suboptimal performance in downstream tasks. To address these limitations, we propose a novel contrastive framework named QUEST: Quadruple Multimodal Contrastive Learning with Constraints and Self-Penalization. In the QUEST framework, we propose quaternion contrastive objectives and orthogonal constraints to extract sufficient unique information. Meanwhile, a shared information-guided penalization is introduced to ensure that shared information does not excessively influence the optimization of unique information. Our method leverages quaternion vector spaces to simultaneously optimize shared and unique information. Experiments on multiple datasets show that our method achieves superior performance on multimodal contrastive learning benchmarks. On public benchmarks, our approach achieves state-of-the-art performance, and on synthetic shortcut datasets, it outperforms existing baseline methods by an average of 97.95% on the CLIP model.
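
As a rough illustration of two ingredients the abstract names, a contrastive objective and an orthogonal constraint between shared and unique features, the following is a minimal NumPy sketch. It is not the QUEST objective: the abstract does not specify the quaternion formulation or the self-penalization term, so this uses a standard symmetric InfoNCE loss and a generic squared-cosine penalty instead. All function names, the temperature of 0.07, and the split of each embedding into "shared" and "unique" parts are assumptions made for the example.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row of x to unit length."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def info_nce(a, b, temperature=0.07):
    """Symmetric InfoNCE: row i of `a` and row i of `b` are the positive pair.

    Generic contrastive loss, used here as a stand-in; it is NOT the
    quaternion objective described in the abstract.
    """
    logits = l2_normalize(a) @ l2_normalize(b).T / temperature

    def diag_cross_entropy(z):
        z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
        log_prob = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))

    return 0.5 * (diag_cross_entropy(logits) + diag_cross_entropy(logits.T))


def orthogonality_penalty(shared, unique):
    """Mean squared cosine similarity between paired shared/unique embeddings.

    Zero when every sample's unique embedding is orthogonal to its shared one.
    Hypothetical form of the constraint, chosen for illustration only.
    """
    cos = np.sum(l2_normalize(shared) * l2_normalize(unique), axis=1)
    return float(np.mean(cos ** 2))
```

A combined loss could then be formed as `info_nce(img_shared, txt_shared) + w * orthogonality_penalty(img_shared, img_unique)`, where the weight `w` and the variable names are again hypothetical and would need tuning.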

Original language: English
Journal: Advances in Neural Information Processing Systems
Volume: 37
Publication status: Published - 2024
Event: 38th Conference on Neural Information Processing Systems, NeurIPS 2024 - Vancouver, Canada
Duration: 9 Dec 2024 to 15 Dec 2024
