
Explaining the semantics capturing capability of scene graph generation models

  • Jie Luo*
  • Jia Zhao
  • Bin Wen
  • Yuhang Zhang
  • *Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Deep neural networks are an effective approach to scene graph generation tasks. However, they also make scene graph generation models difficult to explain. For instance, the current standard metric cannot explain how capable neural network models are of capturing the semantics of relations. In this paper, we try to understand the semantics capturing capability of scene graph generation models based on three types of metrics: conformance recall, violation recall, and non-violation recall, which measure semantic properties of relations that are reflected by triples in the scene graphs generated by models. Evaluation of these metrics on three representative state-of-the-art scene graph generation models based on deep neural networks on the Visual Genome dataset shows that the proposed metrics can effectively explain the capability of models to capture different semantic properties and identify design problems in models. By extending the Visual Genome dataset with different sets of additional annotations, these metrics can also explain whether the semantics capturing capability of deep neural network models can be improved by data enhancement.

Original language: English
Article number: 107427
Journal: Pattern Recognition
Volume: 110
DOI
Publication status: Published - Feb 2021
