Skip to main navigation Skip to search Skip to main content

Attentive relational networks for mapping images to scene graphs

  • Mengshi Qi
  • , Weijian Li
  • , Zhengyuan Yang
  • , Yunhong Wang*
  • , Jiebo Luo
  • *Corresponding author for this work
  • Beihang University
  • Beijing Advanced Innovation Center for Big Data and Brain Computing
  • University of Rochester

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

Scene graph generation refers to the task of automatically mapping an image into a semantic structural graph, which requires correctly labeling each extracted object and their interaction relationships. Despite the recent success in object detection using deep learning techniques, inferring complex contextual relationships and structured graph representations from visual data remains a challenging topic. In this study, we propose a novel Attentive Relational Network that consists of two key modules with an object detection backbone to approach this problem. The first module is a semantic transformation module utilized to capture semantic embedded relation features, by translating visual features and linguistic features into a common semantic space. The other module is a graph self-attention module introduced to embed a joint graph representation through assigning various importance weights to neighboring nodes. Finally, accurate scene graphs are produced by the relation inference module to recognize all entities and corresponding relations. We evaluate our proposed method on the widely-adopted Visual Genome Dataset, and the results demonstrate the effectiveness and superiority of our model.

Original languageEnglish
Title of host publicationProceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019
PublisherIEEE Computer Society
Pages3952-3961
Number of pages10
ISBN (Electronic)9781728132938
DOIs
StatePublished - Jun 2019
Event32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019 - Long Beach, United States
Duration: 16 Jun 201920 Jun 2019

Publication series

NameProceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Volume2019-June
ISSN (Print)1063-6919

Conference

Conference32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019
Country/TerritoryUnited States
CityLong Beach
Period16/06/1920/06/19

Keywords

  • Scene Analysis and Understanding
  • Visual Reasoning

Fingerprint

Dive into the research topics of 'Attentive relational networks for mapping images to scene graphs'. Together they form a unique fingerprint.

Cite this