TY - GEN
T1 - Report-Guided Cross-Modal Representation Learning for Predicting EGFR Mutations by Whole Slide Image
AU - Qiao, Qi
AU - Shi, Jun
AU - Jiang, Zhiguo
AU - Wang, Wei
AU - Wu, Haibo
AU - Zheng, Yushan
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
AB - Traditional PCR/NGS-based multigene panel testing is time-consuming and costly. Predicting EGFR mutations directly from H&E-stained whole slide images (WSIs) can alleviate these limitations. Furthermore, histopathological reports contain valuable textual information that correlates with tissue areas in WSIs. However, recent research mainly analyses EGFR mutation status from a single modality, ignoring the rich information contained in reports. In this paper, we propose a report-guided cross-modal representation learning method for predicting EGFR mutations from WSIs. Specifically, we reconstruct report-level embeddings by exploring the intrinsic relationships between diagnostic words in histopathological reports and tissue areas in WSIs. The reconstructed histopathological report embedding and the aggregated WSI embedding are then fused for the final prediction. More importantly, the molecular testing report is also introduced as prior supervision information at the training stage to enforce semantic consistency between the fused feature and the molecular report embedding. We evaluate our method on the TCGA-EGFR public benchmark dataset and an in-house clinical dataset (USTC-EGFR). Experimental results demonstrate that our method outperforms existing approaches in EGFR mutation prediction, highlighting the benefits of cross-modal learning in enhancing feature representational ability. The code is available at https://github.com/HFUT-miaLab/RCRL.
KW - EGFR mutation prediction
KW - Multi-modal
KW - Weakly supervised learning
KW - Whole slide image
UR - https://www.scopus.com/pages/publications/85217278755
U2 - 10.1109/BIBM62325.2024.10822285
DO - 10.1109/BIBM62325.2024.10822285
M3 - Conference contribution
AN - SCOPUS:85217278755
T3 - Proceedings - 2024 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2024
SP - 3651
EP - 3654
BT - Proceedings - 2024 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2024
A2 - Cannataro, Mario
A2 - Zheng, Huiru
A2 - Gao, Lin
A2 - Cheng, Jianlin
A2 - de Miranda, Joao Luis
A2 - Zumpano, Ester
A2 - Hu, Xiaohua
A2 - Cho, Young-Rae
A2 - Park, Taesung
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2024
Y2 - 3 December 2024 through 6 December 2024
ER -