DuoNet: Joint optimization of representation learning and prototype classifier for unbiased scene graph generation

  • Zhaodi Wang
  • Biao Leng*
  • Shuo Zhang
  • *Corresponding author for this work
  • Beihang University
  • Beijing Jiaotong University

Research output: Contribution to journal, article, peer-reviewed

Abstract

Unbiased Scene Graph Generation (SGG) aims to parse visual scenes into highly informative graphs under the long-tail challenge. Prototype-based methods have shown promise in unbiased SGG, and they highlight the importance of learning discriminative features that are intra-class compact and inter-class separable. In this paper, we revisit prototype-based methods, analyze the critical roles of representation learning and the prototype classifier in driving unbiased SGG, and accordingly propose a novel framework, DuoNet. To enhance intra-class compactness, we introduce a Bi-Directional Representation Refinement (BiDR2) module that captures relation-sensitive visual variability and within-relation visual consistency of entities. This module adopts relation-to-entity-to-relation refinement by integrating dual-level relation pattern modeling with a relation-specific entity constraint. Furthermore, a Knowledge-Guided Prototype Learning (KGPL) module is devised to strengthen inter-class separability by constructing an equidistributed prototype classifier with maximum inter-class margins. The equidistributed prototype classifier is frozen during SGG training to mitigate long-tail bias; a knowledge-driven triplet loss is therefore developed to strengthen the learning of BiDR2 and enhance relation-prototype matching. Extensive experiments demonstrate the effectiveness of our method, which sets new state-of-the-art performance on the Visual Genome, GQA, and Open Images datasets.
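
The abstract does not spell out how the equidistributed prototypes or the knowledge-driven triplet loss are built, so the following is only a generic sketch of the two ideas it names: a fixed set of maximally separated class prototypes, and a triplet-style loss that pulls a relation feature toward its own prototype. It assumes a simplex construction for the prototypes, cosine distance, a hardest-negative hinge, and a margin of 0.2; the function names are hypothetical and none of this should be read as the DuoNet implementation.

```python
import math


def simplex_prototypes(num_classes):
    """Return num_classes unit vectors whose pairwise cosine is -1/(num_classes - 1).

    This simplex layout is one standard way to spread class prototypes evenly;
    it is an assumption here, not the construction used in the paper.
    """
    if num_classes < 2:
        raise ValueError("need at least two classes")
    scale = math.sqrt(num_classes / (num_classes - 1))
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / num_classes)
         for j in range(num_classes)]
        for i in range(num_classes)
    ]


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def classify(feature, prototypes):
    """Predict the class whose (frozen) prototype is most similar to the feature."""
    sims = [cosine(feature, p) for p in prototypes]
    return max(range(len(sims)), key=sims.__getitem__)


def prototype_triplet_loss(feature, label, prototypes, margin=0.2):
    """Hinge triplet loss: anchor = feature, positive = its class prototype,
    negative = the closest prototype of any other class (assumed mining rule)."""
    dists = [1.0 - cosine(feature, p) for p in prototypes]
    d_pos = dists[label]
    d_neg = min(d for k, d in enumerate(dists) if k != label)
    return max(0.0, d_pos - d_neg + margin)


if __name__ == "__main__":
    protos = simplex_prototypes(4)  # fixed; never updated during training
    feature = protos[2]             # a feature perfectly aligned with class 2
    print(classify(feature, protos))
    print(prototype_triplet_loss(feature, 2, protos))
    print(prototype_triplet_loss(feature, 0, protos))
```

Because the prototypes are never updated, only the feature extractor receives a learning signal from the loss, which is the general motivation for freezing a classifier under long-tailed class frequencies.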

Original language: English
Article number: 113152
Journal: Pattern Recognition
Volume: 176
DOI
Publication status: Published - Aug 2026
