TY - JOUR
T1 - A label-anchored variational framework for air crisis event multi-modal recognition with missing modality
AU - Zhang, Yishan
AU - Yang, Yang
AU - Zhang, Shengjie
AU - Qian, Shengsheng
AU - Xu, Yan
AU - Cai, Kaiquan
N1 - Publisher Copyright:
© 2026 Elsevier Ltd.
PY - 2026/7
Y1 - 2026/7
N2 - Air crisis event recognition from massive multi-modal social media data offers a promising avenue for enhancing emergency response efficiency, reducing crisis management costs, and uncovering critical situational insights. However, incomplete text-image pairs pose significant challenges, as the chaotic and damaged scenes typical of air accidents often hinder the collection of comprehensive information. To address this issue, we propose a label-anchored variational framework that shifts the paradigm from fusion-centric compensation to completion-driven discrimination. Specifically, we propose a general multi-modal recognition scheme for air crisis events with missing modalities that resorts to a Unimodal Knowledge Completion Variational Autoencoder (UKC-VAE) model. First, two separate VAE-based unimodal parallel encoders are presented to generate class-discriminative latent variables through topic-specific label embeddings acting as lightweight class anchors. Moreover, a contrastive learning-based semantic alignment module and a distribution alignment module are proposed to enhance the cross-modal knowledge transfer and ensure consistency across modalities. Extensive experiments demonstrate the superior performance of the proposed UKC-VAE model compared to several state-of-the-art baselines on the AirCrisisMMD and CrisisMMD datasets. The former is a new specialized multi-modal dataset that will be released soon.
AB - Air crisis event recognition from massive multi-modal social media data offers a promising avenue for enhancing emergency response efficiency, reducing crisis management costs, and uncovering critical situational insights. However, incomplete text-image pairs pose significant challenges, as the chaotic and damaged scenes typical of air accidents often hinder the collection of comprehensive information. To address this issue, we propose a label-anchored variational framework that shifts the paradigm from fusion-centric compensation to completion-driven discrimination. Specifically, we propose a general multi-modal recognition scheme for air crisis events with missing modalities that resorts to a Unimodal Knowledge Completion Variational Autoencoder (UKC-VAE) model. First, two separate VAE-based unimodal parallel encoders are presented to generate class-discriminative latent variables through topic-specific label embeddings acting as lightweight class anchors. Moreover, a contrastive learning-based semantic alignment module and a distribution alignment module are proposed to enhance the cross-modal knowledge transfer and ensure consistency across modalities. Extensive experiments demonstrate the superior performance of the proposed UKC-VAE model compared to several state-of-the-art baselines on the AirCrisisMMD and CrisisMMD datasets. The former is a new specialized multi-modal dataset that will be released soon.
KW - Accident investigation
KW - Artificial intelligence
KW - Aviation safety
KW - Incomplete data learning
KW - Multi-modal classification
UR - https://www.scopus.com/pages/publications/105032037968
U2 - 10.1016/j.aei.2026.104564
DO - 10.1016/j.aei.2026.104564
M3 - Article
AN - SCOPUS:105032037968
SN - 1474-0346
VL - 73
JO - Advanced Engineering Informatics
JF - Advanced Engineering Informatics
M1 - 104564
ER -