
Augmenting image descriptions using structured prediction output

  • Yahong Han*
  • Xingxing Wei
  • Xiaochun Cao
  • Yi Yang
  • Xiaofang Zhou
  • *Corresponding author for this work
  • Tianjin University
  • CAS - Institute of Information Engineering
  • University of Queensland

Research output: Contribution to journal › Article › peer-review

Abstract

The need for richer descriptions of images arises in a wide spectrum of applications, ranging from image understanding to image retrieval. Although Automatic Image Annotation (AIA) has been extensively studied, image descriptions built from its output labels alone lack sufficient information. This paper proposes to augment image descriptions using structured prediction output. We define a hierarchical tree-structured semantic unit to describe images, from which we can obtain not only the class and subclass an image belongs to, but also the attributes it has. After defining a new feature map function for the structured SVM, we decompose the loss function over the nodes of the hierarchical tree-structured semantic unit and then predict the tree-structured semantic unit for test images. In the experiments, we evaluate the proposed method on two open benchmark datasets and compare it with state-of-the-art methods. The experimental results show that the proposed method achieves better prediction performance and demonstrate the strength of augmenting image descriptions.
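The abstract does not give the form of the loss or the feature map, so the following is only a minimal sketch of what a loss decomposed over the nodes of a tree-structured label could look like. The `SemanticUnit` type, the three-level layout (class, subclass, attributes), the 0/1 per-node loss, and the weights are all illustrative assumptions, not the authors' formulation.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class SemanticUnit:
    """Hypothetical tree-structured label: class -> subclass -> attributes."""
    cls: str
    subclass: str
    attributes: FrozenSet[str]


def tree_loss(truth: SemanticUnit, pred: SemanticUnit,
              w_class: float = 1.0, w_subclass: float = 0.5,
              w_attr: float = 0.25) -> float:
    """Sum of per-node 0/1 losses; the weights are assumed, not from the paper."""
    loss = 0.0
    if truth.cls != pred.cls:            # root node: class
        loss += w_class
    if truth.subclass != pred.subclass:  # inner node: subclass
        loss += w_subclass
    # Leaf nodes: each attribute present in exactly one of the two labels.
    loss += w_attr * len(truth.attributes ^ pred.attributes)
    return loss


truth = SemanticUnit("animal", "dog", frozenset({"furry", "four-legged"}))
pred = SemanticUnit("animal", "cat", frozenset({"furry", "striped"}))
print(tree_loss(truth, pred))  # 0.5 (subclass) + 2 * 0.25 (attributes) = 1.0
```

In a structured SVM, a loss of this additive kind would typically enter the margin constraints, so that predictions differing at more nodes are penalized more heavily than near misses.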

Original language: English
Article number: 6810013
Pages (from-to): 1665-1676
Number of pages: 12
Journal: IEEE Transactions on Multimedia
Volume: 16
Issue number: 6
State: Published - 1 Oct 2014
Externally published: Yes

Keywords

  • Image descriptions
  • image annotation
  • structured learning
  • tree-structured semantic unit
