Online cross-modal scene retrieval by binary representation and semantic graph

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In recent years, cross-modal scene retrieval has attracted increasing attention. However, most existing approaches neglect the semantic relationships between objects in a scene, together with the embedded spatial layout. Moreover, these methods mostly apply a batch learning strategy, which is not suitable for processing streaming data. To address these problems, we propose a new framework for online cross-modal scene retrieval based on binary representations and a semantic graph. Specifically, we adopt cross-modal hashing based on the quantization loss of different modalities. By introducing the semantic graph, we are able to extract rich semantics and measure their correlation across different modalities. Furthermore, we propose a two-step optimization procedure based on stochastic gradient descent for online updates. Experimental results on four datasets show the superiority of our approach over the state of the art.
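The two-step SGD procedure for online hashing described in the abstract can be illustrated with a minimal sketch. This is an assumption-laden simplification, not the paper's actual model: the dimensions, learning rate, and linear projections are invented for illustration, and the semantic-graph correlation term is omitted, leaving only a per-modality quantization loss updated on a simulated stream of paired features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (hypothetical, not from the paper)
d_img, d_txt, n_bits = 128, 64, 32

# Linear hash projections, one per modality
W_img = rng.normal(scale=0.1, size=(n_bits, d_img))
W_txt = rng.normal(scale=0.1, size=(n_bits, d_txt))

def quantization_loss_grad(W, x):
    """Gradient of 0.5 * ||W x - sign(W x)||^2 w.r.t. W.

    Treating sign(W x) as a fixed binary target b, the gradient is
    (W x - b) x^T, which pushes projections toward binary values.
    """
    z = W @ x
    b = np.sign(z)
    b[b == 0] = 1.0  # avoid zero entries in the binary target
    return np.outer(z - b, x)

def online_step(W_img, W_txt, x_img, x_txt, lr=0.05):
    """One SGD update on a streaming image/text pair.

    Updates the two modality projections in turn, loosely mirroring
    a two-step optimization (the cross-modal coupling term of the
    paper is omitted in this sketch).
    """
    W_img = W_img - lr * quantization_loss_grad(W_img, x_img)
    W_txt = W_txt - lr * quantization_loss_grad(W_txt, x_txt)
    return W_img, W_txt

# Simulated stream of paired features
for _ in range(200):
    x_img = rng.normal(size=d_img)
    x_txt = rng.normal(size=d_txt)
    W_img, W_txt = online_step(W_img, W_txt, x_img, x_txt)

# Hashing a query: the binary code is the sign of the projection
code = np.sign(W_img @ rng.normal(size=d_img)).astype(int)
print(code.shape)
```

Because each update touches only the current sample, the cost per step is independent of the number of items seen so far, which is what makes such a scheme suitable for streaming data.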

Original language: English
Title of host publication: MM 2017 - Proceedings of the 2017 ACM Multimedia Conference
Publisher: Association for Computing Machinery, Inc
Pages: 744-752
Number of pages: 9
ISBN (Electronic): 9781450349062
DOIs
State: Published - 23 Oct 2017
Event: 25th ACM International Conference on Multimedia, MM 2017 - Mountain View, United States
Duration: 23 Oct 2017 – 27 Oct 2017

Publication series

Name: MM 2017 - Proceedings of the 2017 ACM Multimedia Conference

Conference

Conference: 25th ACM International Conference on Multimedia, MM 2017
Country/Territory: United States
City: Mountain View
Period: 23/10/17 – 27/10/17

Keywords

  • Binary representations
  • Cross-modal hashing
  • Online learning
  • Scene retrieval
  • Semantic graph
