
跨媒体智能关联分析与语义理解理论与技术研究进展

Translated title of the contribution: Advances in Theory and Technology of Cross-Media Intelligent Association Analysis
  • Junqing Yu
  • Xin Wang
  • Kun Kuang
  • Si Liu
  • Xinfeng Zhang
  • Zikai Song
  • Huazhong University of Science and Technology
  • Tsinghua University
  • Zhejiang University
  • University of Chinese Academy of Sciences

Research output: Contribution to journal, Article, peer-review

Abstract

This paper analyzes the latest research trends in the theory and technology of cross-media intelligent correlation analysis and semantic understanding. It covers five topics: unified representation of cross-media information, knowledge-guided data fusion, cross-media correlation analysis, cross-media knowledge graphs, and intelligent multi-modal applications. A unified representation is a precondition for analyzing and reasoning about multi-modal information. The semantic consistency across modalities is used to eliminate redundant information, and cross-modal interconversion yields a unified representation from which more comprehensive features can be learned. Cross-media association analysis focuses on image-language, video-language, and audio-video-language tasks, aiming to bridge the semantic gap between the visual, auditory, and language modalities and to fully establish the semantic associations among them. The paper then reviews cross-media knowledge graph construction, embedding, and inference; a knowledge-graph-based cross-media representation enhances reliability and improves the efficiency and accuracy of subsequent inference tasks. With the rapid development of cross-modal analysis, intelligent multi-modal applications are supported by a growing range of technologies. Based on the domain knowledge they require, the paper selects several cross-modal applications: multi-modal visual question answering, multi-modal video summarization, multi-modal visual pattern mining, multi-modal recommendation, cross-modal intelligent inference, and cross-modal medical image prediction. Their research progress is compared and reviewed in terms of multi-modal fusion and cross-media inference.

Original language: Chinese (Traditional)
Pages (from-to): 1-22
Number of pages: 22
Journal: Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics
Volume: 35
Issue number: 1
State: Published - Jan 2023
