EFFICIENT FINE-GRAINED VISUAL-TEXT SEARCH USING ADVERSARIALLY-LEARNED HASH CODES

  • Yongzhi Li
  • Yadong Mu*
  • Nan Zhuang
  • Xianglong Liu

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Cross-modal hashing for efficient visual-text search has attracted considerable research interest in recent years. The main argument of this work is that existing hashing methods mainly exploit a multi-label matching paradigm, ignoring various fine-grained semantics (high-order relationships, object attributes, etc.) in the multi-modal data. This paper explores cross-modal hashing from two rarely explored aspects. First, we propose an efficient two-step hashing scheme that quickly screens out irrelevant samples using global features and then generates fine-grained features, guided by high-order concepts, to rerank the surviving candidates. Second, the robustness of the cross-modal hashing model, particularly under subtle tampering of fine-grained queries, is formally investigated. We propose a rephrase and adversarial training strategy for obtaining better performance and robustness. Comprehensive experiments and ablation studies on two large public datasets (MS-COCO and Flickr30K) demonstrate the proposed method's superiority in terms of both efficiency and accuracy.
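The two-step scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that each item already has a short global hash code and a longer fine-grained hash code (both stored as Python integers), that both steps rank by Hamming distance, and that the codes are random stand-ins rather than learned ones. The names `hamming`, `two_step_search`, `shortlist_size`, and `top_k` are hypothetical.

```python
import random


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hash codes stored as ints."""
    return bin(a ^ b).count("1")


def two_step_search(q_global, q_fine, db_global, db_fine, shortlist_size, top_k):
    """Return indices of the top_k items after coarse screening and reranking."""
    # Step 1 (assumed form): screen the whole database with the global codes
    # and keep only the closest shortlist_size candidates.
    by_global = sorted(
        range(len(db_global)),
        key=lambda i: hamming(q_global, db_global[i]),
    )
    shortlist = by_global[:shortlist_size]

    # Step 2 (assumed form): rerank only the surviving candidates using the
    # fine-grained codes, which are more expensive to compare at scale.
    reranked = sorted(shortlist, key=lambda i: hamming(q_fine, db_fine[i]))
    return reranked[:top_k]


# Random stand-in codes; a real system would produce these with learned encoders.
rng = random.Random(0)
db_global = [rng.getrandbits(32) for _ in range(1000)]
db_fine = [rng.getrandbits(64) for _ in range(1000)]

# Build a query from item 42, flipping one bit of its fine-grained code.
q_global = db_global[42]
q_fine = db_fine[42] ^ 1

results = two_step_search(
    q_global, q_fine, db_global, db_fine, shortlist_size=50, top_k=5
)
print(results)
```

In this sketch the fine-grained comparison touches only `shortlist_size` items instead of the full database, which is where the efficiency of a screen-then-rerank design would come from.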

Original language: English
Title of host publication: 2021 IEEE International Conference on Multimedia and Expo, ICME 2021
Publisher: IEEE Computer Society
ISBN (Electronic): 9781665438643
State: Published - 2021
Event: 2021 IEEE International Conference on Multimedia and Expo, ICME 2021 - Shenzhen, China
Duration: 5 Jul 2021 to 9 Jul 2021

Publication series

Name: Proceedings - IEEE International Conference on Multimedia and Expo
ISSN (Print): 1945-7871
ISSN (Electronic): 1945-788X

Conference

Conference: 2021 IEEE International Conference on Multimedia and Expo, ICME 2021
Country/Territory: China
City: Shenzhen
Period: 5/07/21 to 9/07/21

Keywords

  • Adversarial Learning
  • Cross-modal Retrieval
  • Fine-grained Search
  • Hashing
