
Multi-Agent Amodal Completion: Direct Synthesis with Fine-Grained Semantic Guidance

  • Hongxing Fan
  • Lipeng Wang
  • Haohua Chen
  • Zehuan Huang
  • Jiangtao Wu
  • Lu Sheng*
  • *Corresponding author for this work
  • Beihang University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Amodal completion, generating invisible parts of occluded objects, is vital for applications like image editing and AR. Prior methods face challenges with data needs, generalization, or error accumulation in progressive pipelines. We propose a Collaborative Multi-Agent Reasoning Framework based on upfront collaborative reasoning to overcome these issues. Our framework uses multiple agents to collaboratively analyze occlusion relationships and determine necessary boundary expansion, yielding a precise mask for inpainting. Concurrently, an agent generates fine-grained textual descriptions, enabling Fine-Grained Semantic Guidance. This ensures accurate object synthesis and prevents the regeneration of occluders or other unwanted elements, especially within large inpainting areas. Furthermore, our method directly produces layered RGBA outputs guided by visible masks and attention maps from a Diffusion Transformer, eliminating extra segmentation. Extensive evaluations demonstrate our framework achieves state-of-the-art visual quality.
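The abstract says the layered RGBA output is guided by visible masks and attention maps, but it does not say how the two signals are combined. The following is only a minimal sketch under an assumed rule: a pixel is opaque if it lies in the visible mask or if its attention score reaches a threshold. The function names, the threshold value, and the toy data are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]
RGBA = Tuple[int, int, int, int]


def alpha_from_guidance(
    visible_mask: List[List[bool]],
    attention: List[List[float]],
    threshold: float = 0.5,  # assumed cut-off, not a value from the paper
) -> List[List[int]]:
    """Return an 8-bit alpha map: opaque where the object is visible or
    where the attention score reaches `threshold`, transparent elsewhere."""
    return [
        [255 if vis or att >= threshold else 0
         for vis, att in zip(mask_row, att_row)]
        for mask_row, att_row in zip(visible_mask, attention)
    ]


def to_rgba(image: List[List[RGB]], alpha: List[List[int]]) -> List[List[RGBA]]:
    """Attach the alpha map to an RGB image, producing one RGBA layer."""
    return [
        [(r, g, b, a) for (r, g, b), a in zip(img_row, alpha_row)]
        for img_row, alpha_row in zip(image, alpha)
    ]


# Toy 2x3 example: the left column is visible, and the attention map
# (hypothetical values in [0, 1]) also covers one occluded pixel.
visible = [[True, False, False],
           [True, False, False]]
attention = [[0.9, 0.7, 0.1],
             [0.8, 0.4, 0.0]]
image = [[(10, 20, 30)] * 3 for _ in range(2)]

alpha = alpha_from_guidance(visible, attention)
layer = to_rgba(image, alpha)
```

In this toy case the completed object covers the visible column plus the one occluded pixel whose attention score is 0.7. Every other pixel stays transparent, so the layer can be composited without a separate segmentation pass.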

Original language: English
Title of host publication: MM 2025 - Proceedings of the 33rd ACM International Conference on Multimedia, Co-Located with MM 2025
Publisher: Association for Computing Machinery, Inc
Pages: 9911-9919
Number of pages: 9
ISBN (electronic): 9798400720352
DOI
Publication status: Published - 27 Oct 2025
Event: 33rd ACM International Conference on Multimedia, MM 2025 - Dublin, Ireland
Duration: 27 Oct 2025 → 31 Oct 2025

Publication series

Name: MM 2025 - Proceedings of the 33rd ACM International Conference on Multimedia, Co-Located with MM 2025

Conference

Conference: 33rd ACM International Conference on Multimedia, MM 2025
Country/Territory: Ireland
City: Dublin
Period: 27/10/25 → 31/10/25
