跳到主要导航 跳到搜索 跳到主要内容

DOMR: Establishing Cross-View Segmentation via Dense Object Matching

  • Jitong Liao
  • , Yulu Gao
  • , Shaofei Huang
  • , Jialin Gao
  • , Jie Lei
  • , Ronghua Liang
  • , Si Liu*
  • *此作品的通讯作者

科研成果: 书/报告/会议事项章节会议稿件同行评审

摘要

Cross-view object correspondence involves matching objects between egocentric (first-person) and exocentric (third-person) views. It is a critical yet challenging task for visual understanding. In this work, we propose the Dense Object Matching and Refinement (DOMR) framework to establish dense object correspondences across views. The framework centers around the Dense Object Matcher (DOM) module, which jointly models multiple objects. Unlike methods that directly match individual object masks to image features, DOM leverages both positional and semantic relationships among objects to find correspondences. DOM integrates a proposal generation module with a dense matching module that jointly encodes visual, spatial, and semantic cues, explicitly constructing inter-object relationships to achieve dense matching among objects. Furthermore, we combine DOM with a mask refinement head designed to improve the completeness and accuracy of the predicted masks, forming the complete DOMR framework. Extensive evaluations on the Ego-Exo4D benchmark demonstrate that our approach achieves state-of-the-art performance with a mean IoU of 49.7% on Ego→Exo and 55.2% on Exo→Ego. These results outperform those of previous methods by 5.8% and 4.3%, respectively, validating the effectiveness of our integrated approach for cross-view understanding.

源语言英语
主期刊名MM 2025 - Proceedings of the 33rd ACM International Conference on Multimedia, Co-Located with MM 2025
出版商Association for Computing Machinery, Inc
412-421
页数10
ISBN(电子版)9798400720352
DOI
出版状态已出版 - 27 10月 2025
活动33rd ACM International Conference on Multimedia, MM 2025 - Dublin, 爱尔兰
期限: 27 10月 202531 10月 2025

出版系列

姓名MM 2025 - Proceedings of the 33rd ACM International Conference on Multimedia, Co-Located with MM 2025

会议

会议33rd ACM International Conference on Multimedia, MM 2025
国家/地区爱尔兰
Dublin
时期27/10/2531/10/25

指纹

探究 'DOMR: Establishing Cross-View Segmentation via Dense Object Matching' 的科研主题。它们共同构成独一无二的指纹。

引用此