Annotation modification for fine-grained visual recognition

  • Changzhi Luo
  • Zhijun Meng*
  • Jiashi Feng
  • Bingbing Ni
  • Meng Wang
  • *Corresponding author for this work
  • Hefei University of Technology
  • National University of Singapore
  • Shanghai Jiao Tong University

Research output: Contribution to journal › Article › peer-review

Abstract

Query modification is an intensively studied and widely used technique in information retrieval because it helps to better understand users' intentions. In this work, we introduce this idea into fine-grained visual recognition, which is important for handling ambiguous queries in image retrieval tasks. Unlike most existing works, which incorporate information about object bounding boxes or parts to extract discriminative local features, we approach the fine-grained recognition problem from a new viewpoint, namely annotation modification. The proposed approach fully exploits inter-class ambiguity (which is generally regarded as noise) to form active sets of annotations that boost fine-grained visual recognition. Specifically, it first obtains the most confusing classes for each image through an easy-to-evaluate classifier, and then modifies the annotation of each image using the active set of annotations. To handle the modified annotations, a novel ranking-based loss function is further designed to learn effective classification models. We evaluate the proposed approach on three popular fine-grained image datasets (i.e., Oxford-IIIT Pets, Flower-102 and CUB200-2011), and the experimental results clearly demonstrate its effectiveness.
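
The two steps described in the abstract (forming an active set from the most confusing classes, then training with a ranking-based loss) can be illustrated with a minimal sketch. The abstract does not specify how the active set is sized or what form the loss takes, so everything below is an assumption made for illustration: the function names `active_set` and `ranking_loss`, the parameter `k`, the margin, and the three-tier pairwise hinge formulation are hypothetical and should not be read as the authors' method.

```python
from typing import List, Set


def active_set(scores: List[float], true_label: int, k: int) -> Set[int]:
    """Return the true label plus the k highest-scoring other classes.

    `scores` are per-class scores from a cheap classifier (assumed input).
    """
    others = [c for c in range(len(scores)) if c != true_label]
    # Rank the other classes by the cheap classifier's score, highest first.
    others.sort(key=lambda c: scores[c], reverse=True)
    return {true_label, *others[:k]}


def ranking_loss(scores: List[float], true_label: int,
                 active: Set[int], margin: float = 1.0) -> float:
    """Hypothetical pairwise hinge loss over three tiers (an assumption):
    true label > other active-set classes > all remaining classes."""
    confusers = [c for c in active if c != true_label]
    rest = [c for c in range(len(scores)) if c not in active]
    loss = 0.0
    # The true label should outscore every confusing class by the margin.
    for c in confusers:
        loss += max(0.0, margin - (scores[true_label] - scores[c]))
    # Every confusing class should outscore every remaining class by the margin.
    for c in confusers:
        for r in rest:
            loss += max(0.0, margin - (scores[c] - scores[r]))
    return loss


# Illustrative scores for four classes; class 0 is the annotated label.
cheap_scores = [2.0, 1.5, 0.1, 1.8]
modified_annotation = active_set(cheap_scores, true_label=0, k=2)
loss = ranking_loss(cheap_scores, true_label=0, active=modified_annotation)
```

In this toy example the modified annotation is the set of classes 0, 1 and 3, and the loss is nonzero only because the true label does not yet beat the two confusing classes by the full margin. A tiered ranking of this kind is one plausible way to treat confusing classes as partially correct rather than as plain negatives.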

Original language: English
Pages (from-to): 58-65
Number of pages: 8
Journal: Neurocomputing
Volume: 274
State: Published - 24 Jan 2018

Keywords

  • Active set
  • Annotation modification
  • Fine-grained visual recognition
  • Query modification
  • Ranking loss
