
Duplicate Multi-modal Entities Detection with Graph Contrastive Self-training Network

  • Shuyun Gu
  • Xiao Wang
  • Chuan Shi*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Duplicate multi-modal entity detection aims to find highly similar entities among massive collections of entities with multi-modal information. It is a basic task in many applications and is becoming more important and urgent with the development of the Internet and e-commerce platforms. Traditional methods apply machine learning or deep learning to feature embeddings extracted from multi-modal information, which ignores the correlations among entities and modalities. Inspired by the popular Graph Neural Networks (GNNs), we can analyze the multi-relation graph of entities, constructed from their multi-modal information, with a GNN. However, this solution still faces the challenge of extreme label sparsity, particularly in industrial applications. In this work, we propose a novel graph contrastive self-training network model, named CT-GNN, for duplicate multi-modal entity detection under extreme label sparsity. Given the multi-relation graph of entities constructed from their multi-modal features with KNN, we first learn preliminary node embeddings with an existing GNN, e.g., a GCN. To alleviate the problem of extremely sparse labels, we design a layer contrastive module to effectively exploit implicit label information, as well as a pseudo-label extension module to determine the label boundary. In addition, graph structure learning is introduced to refine the structure of the multi-relation graph. A unified optimization framework is designed to seamlessly integrate these three components. Extensive experiments on real datasets, in comparison with SOTA baselines, demonstrate the effectiveness of our proposed method.
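
The graph-construction step mentioned in the abstract (one KNN graph per modality, together forming a multi-relation graph) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract only states that KNN is used, so the cosine similarity measure, the function names (`cosine`, `knn_edges`, `build_multi_relation_graph`), the modality names and the toy feature vectors are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Set, Tuple

Vector = List[float]
Edge = Tuple[int, int]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def knn_edges(features: List[Vector], k: int) -> Set[Edge]:
    """Undirected edges linking each entity to its k most similar entities."""
    edges: Set[Edge] = set()
    for i, fi in enumerate(features):
        scored = [(cosine(fi, fj), j) for j, fj in enumerate(features) if j != i]
        # Highest similarity first; ties broken by the smaller entity index.
        scored.sort(key=lambda s: (-s[0], s[1]))
        for _, j in scored[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


def build_multi_relation_graph(
    modalities: Dict[str, List[Vector]], k: int
) -> Dict[str, Set[Edge]]:
    """One KNN edge set (relation) per modality, over the same entities."""
    return {name: knn_edges(feats, k) for name, feats in modalities.items()}


# Hypothetical toy data: four entities, two modalities (assumed names).
toy = {
    "image": [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]],
    "text": [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]],
}
graph = build_multi_relation_graph(toy, k=1)
```

In this toy case the two modalities disagree about which entities are neighbors, which is the kind of situation where keeping one relation per modality, rather than a single merged graph, preserves information for a downstream GNN.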

Original language: English
Title of host publication: Machine Learning and Knowledge Discovery in Databases
Subtitle of host publication: Research Track - European Conference, ECML PKDD 2023, Proceedings
Editors: Danai Koutra, Claudia Plant, Manuel Gomez Rodriguez, Elena Baralis, Francesco Bonchi
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 651-665
Number of pages: 15
ISBN (Print): 9783031434143
DOI
Publication status: Published - 2023
Externally published: Yes
Event: 23rd Joint European Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2023 - Turin, Italy
Duration: 18 Sep 2023 to 22 Sep 2023

Publication series

Name: Lecture Notes in Computer Science
Volume: 14170 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 23rd Joint European Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2023
Country/Territory: Italy
City: Turin
Period: 18/09/23 to 22/09/23
