
Miko: Multimodal Intention Knowledge Distillation from Large Language Models for Social-Media Commonsense Discovery

  • Feihong Lu
  • Weiqi Wang
  • Yangyifei Luo
  • Ziqin Zhu
  • Qingyun Sun*
  • Baixuan Xu
  • Haochen Shi
  • Shiqi Gao
  • Qian Li
  • Yangqiu Song
  • Jianxin Li
  • *Corresponding author for this work
  • Beihang University
  • Hong Kong University of Science and Technology
  • Beijing University of Posts and Telecommunications

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Peer-reviewed

Abstract

Social media has become ubiquitous for connecting with others, staying updated with news, expressing opinions, and finding entertainment. However, understanding the intention behind social media posts remains challenging due to the implicit and commonsense nature of these intentions, the need for cross-modality understanding of both text and images, and the presence of noisy information such as hashtags, misspelled words, and complicated abbreviations. To address these challenges, we present MIKO, a Multimodal Intention Knowledge DistillatiOn framework that collaboratively leverages a Large Language Model (LLM) and a Multimodal Large Language Model (MLLM) to uncover users' intentions. Specifically, our approach uses an MLLM to interpret the image, an LLM to extract key information from the text, and another LLM to generate intentions. By applying MIKO to publicly available social media datasets, we construct an intention knowledge base featuring 1,372K intentions rooted in 137,287 posts. Moreover, we conduct a two-stage annotation to verify the quality of the generated knowledge and benchmark the performance of widely used LLMs for intention generation. We further apply MIKO to a sarcasm detection dataset and distill a student model to demonstrate the downstream benefits of applying intention knowledge.
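
The three-stage flow described in the abstract (an MLLM interprets the image, one LLM extracts key information from the text, another LLM generates intentions) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Post` type, the `generate_intentions` function, the prompt wording, and the three model callables are all hypothetical placeholders, and any real MLLM or LLM client would have to be wrapped to match them.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Post:
    """A social media post with text and one image (hypothetical type)."""
    text: str
    image_path: str


# Hypothetical model interfaces: each is any callable from a string to a string.
ImageDescriber = Callable[[str], str]  # image path -> image description (MLLM)
TextModel = Callable[[str], str]       # prompt or raw text -> model output (LLM)


def generate_intentions(
    post: Post,
    describe_image: ImageDescriber,
    extract_key_info: TextModel,
    generate: TextModel,
) -> List[str]:
    """Run the three stages in order and return one intention per output line."""
    # Stage 1: the MLLM turns the image into a textual description.
    image_description = describe_image(post.image_path)
    # Stage 2: an LLM pulls key information out of the noisy post text.
    key_info = extract_key_info(post.text)
    # Stage 3: a second LLM generates intentions from both intermediate results.
    # The prompt wording below is an assumption made for illustration.
    prompt = (
        f"Image description: {image_description}\n"
        f"Key information: {key_info}\n"
        "List the likely intentions behind this post, one per line."
    )
    raw_output = generate(prompt)
    # Keep non-empty lines only, trimmed of surrounding whitespace.
    return [line.strip() for line in raw_output.splitlines() if line.strip()]
```

Passing the models in as plain callables keeps the sketch independent of any particular model provider and lets each stage be replaced with a stub when testing.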

Original language: English
Title of host publication: MM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia
Publisher: Association for Computing Machinery, Inc
Pages: 3303-3312
Number of pages: 10
ISBN (Electronic): 9798400706868
DOI
Publication status: Published - 28 Oct 2024
Event: 32nd ACM International Conference on Multimedia, MM 2024 - Melbourne, Australia
Duration: 28 Oct 2024 to 1 Nov 2024

Publication series

Name: MM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia

Conference

Conference: 32nd ACM International Conference on Multimedia, MM 2024
Country/Territory: Australia
City: Melbourne
Period: 28/10/24 to 1/11/24
