TY - JOUR
T1 - Triplet-Based Deep Hashing Network for Cross-Modal Retrieval
AU - Deng, Cheng
AU - Chen, Zhaojia
AU - Liu, Xianglong
AU - Gao, Xinbo
AU - Tao, Dacheng
N1 - Publisher Copyright:
© 1992-2012 IEEE.
PY - 2018/8
Y1 - 2018/8
N2 - Given the benefits of its low storage requirements and high retrieval efficiency, hashing has recently received increasing attention. In particular, cross-modal hashing has been widely and successfully used in multimedia similarity search applications. However, almost all existing cross-modal hashing methods fail to obtain powerful hash codes because they ignore the relative similarity between heterogeneous data, which carries richer semantic information, leading to unsatisfactory retrieval performance. In this paper, we propose a triplet-based deep hashing (TDH) network for cross-modal retrieval. First, we utilize triplet labels, which describe the relative relationships among three instances, as supervision in order to capture more general semantic correlations between cross-modal instances. We then establish a loss function from both the inter-modal and the intra-modal views to boost the discriminative ability of the hash codes. Finally, graph regularization is introduced into the proposed TDH method to preserve the original semantic similarity between hash codes in Hamming space. Experimental results show that our proposed method outperforms several state-of-the-art approaches on two popular cross-modal data sets.
AB - Given the benefits of its low storage requirements and high retrieval efficiency, hashing has recently received increasing attention. In particular, cross-modal hashing has been widely and successfully used in multimedia similarity search applications. However, almost all existing cross-modal hashing methods fail to obtain powerful hash codes because they ignore the relative similarity between heterogeneous data, which carries richer semantic information, leading to unsatisfactory retrieval performance. In this paper, we propose a triplet-based deep hashing (TDH) network for cross-modal retrieval. First, we utilize triplet labels, which describe the relative relationships among three instances, as supervision in order to capture more general semantic correlations between cross-modal instances. We then establish a loss function from both the inter-modal and the intra-modal views to boost the discriminative ability of the hash codes. Finally, graph regularization is introduced into the proposed TDH method to preserve the original semantic similarity between hash codes in Hamming space. Experimental results show that our proposed method outperforms several state-of-the-art approaches on two popular cross-modal data sets.
KW - Deep neural network
KW - cross-modal retrieval
KW - graph regularization
KW - hashing
KW - triplet labels
UR - https://www.scopus.com/pages/publications/85046943763
U2 - 10.1109/TIP.2018.2821921
DO - 10.1109/TIP.2018.2821921
M3 - Article
C2 - 29993656
AN - SCOPUS:85046943763
SN - 1057-7149
VL - 27
SP - 3893
EP - 3903
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
IS - 8
ER -