Abstract
In this paper, we present a simple and effective topic correlation model (TCM) for cross-modal multimedia retrieval that jointly models the text and image components of multimedia documents. In this model, the image component is represented by a bag-of-features model built on local scale-invariant feature transform (SIFT) features, while the text component is described by a topic distribution learned from a latent topic model. Statistical correlations between these two mid-level representations are investigated by mapping them into a common semantic space. These cross-modality correlations are then used to compute the conditional probability of a result in one modality given a query in the other modality. The model is tested on three cross-modal retrieval benchmarks, including Wikipedia documents in both English and Chinese. Experimental results demonstrate that the TCM achieves the best performance among recent state-of-the-art cross-modal retrieval models on the given benchmarks.
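The retrieval pipeline described above can be sketched in a few lines: both modalities are reduced to mid-level vectors (a bag-of-features histogram for images, a topic distribution for texts), projected into a shared semantic space, and ranked by similarity. The sketch below is purely illustrative and is not the authors' TCM implementation: the projection matrices `W_img` and `W_txt` are random stand-ins for the statistically learned mappings, and cosine similarity stands in for the paper's conditional-probability scoring.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy mid-level features (dimensions are illustrative, not from the paper):
# image bag-of-features histograms over visual words, and text topic
# distributions such as those produced by a latent topic model (e.g. LDA).
n_docs, n_visual_words, n_topics, k = 50, 32, 10, 5

images = rng.random((n_docs, n_visual_words))
images /= images.sum(axis=1, keepdims=True)            # normalized histograms
texts = rng.dirichlet(np.ones(n_topics), size=n_docs)  # topic distributions

# Hypothetical shared semantic space: linear maps into k dimensions.
# The paper learns cross-modal correlations statistically; here the maps
# are random placeholders just to show the retrieval mechanics.
W_img = rng.standard_normal((n_visual_words, k))
W_txt = rng.standard_normal((n_topics, k))

def embed(features, projection):
    """Project features into the shared space and L2-normalize rows."""
    z = features @ projection
    return z / np.linalg.norm(z, axis=1, keepdims=True)

img_emb = embed(images, W_img)
txt_emb = embed(texts, W_txt)

def retrieve(query_emb, gallery_emb, top_k=5):
    """Rank gallery items by cosine similarity to the query embedding."""
    scores = gallery_emb @ query_emb
    return np.argsort(-scores)[:top_k]

# Cross-modal query: a text topic distribution retrieves images.
ranked = retrieve(txt_emb[0], img_emb)
```

In a trained system, the same `retrieve` call works in both directions (text-to-image and image-to-text), since both modalities live in the one semantic space after projection.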
| Original language | English |
|---|---|
| Pages (from-to) | 1007-1022 |
| Number of pages | 16 |
| Journal | Pattern Analysis and Applications |
| Volume | 19 |
| Issue number | 4 |
| DOIs | |
| State | Published - 1 Nov 2016 |
Keywords
- Bag-of-features model
- Cross-modal multimedia retrieval
- Topic correlation model
- Topic models