SSEmb: A Joint Structural and Semantic Embedding Framework for Mathematical Formula Retrieval

  • Beihang University
  • Beijing Advanced Innovation Center for Future Blockchain and Privacy Computing

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Formula retrieval is a core topic in Mathematical Information Retrieval. We propose SSEmb, a novel embedding framework capable of capturing both structural and semantic features of formulas. Structurally, we employ Graph Contrastive Learning to encode formulas represented as Shared-substructure Operator Graphs. To enhance structural diversity while preserving mathematical validity of these formula graphs, we introduce a novel graph data augmentation approach that leverages a substitution strategy. Semantically, we utilize Sentence-BERT to encode the surrounding text of formulas. Finally, for each query and its candidates, structural and semantic similarities are calculated separately and then fused through a weighted scheme. In the ARQMath-3 Formula Retrieval Task, SSEmb outperforms existing embedding-based methods by over 5 percentage points on P@10 and nDCG@10. Furthermore, SSEmb enhances the performance of all runs of other methods and achieves state-of-the-art results when combined with Approach0.
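The weighted fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which similarity measure or weight SSEmb uses, so the cosine similarity, the single weight `alpha`, and the function names `cosine`, `fused_score`, and `rank_candidates` below are all assumptions made for illustration, not the authors' implementation. The embeddings are taken as given; in SSEmb they would come from the graph encoder and from Sentence-BERT.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def fused_score(q_struct: Vector, c_struct: Vector,
                q_sem: Vector, c_sem: Vector,
                alpha: float = 0.5) -> float:
    """Weighted sum of structural and semantic similarity.

    alpha is a hypothetical weight in [0, 1]; the abstract does not give its value.
    """
    structural = cosine(q_struct, c_struct)
    semantic = cosine(q_sem, c_sem)
    return alpha * structural + (1.0 - alpha) * semantic


def rank_candidates(query: Tuple[Vector, Vector],
                    candidates: Dict[str, Tuple[Vector, Vector]],
                    alpha: float = 0.5) -> List[Tuple[str, float]]:
    """Score every candidate against the query and sort by fused score, best first."""
    q_struct, q_sem = query
    scored = [
        (cid, fused_score(q_struct, c_struct, q_sem, c_sem, alpha))
        for cid, (c_struct, c_sem) in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Computing the two similarities separately and combining them only at scoring time means the weight can be tuned on held-out queries without re-encoding any formula or its surrounding text.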

Original language: English
Title of host publication: Advances in Information Retrieval - 48th European Conference on Information Retrieval, ECIR 2026, Proceedings
Editors: Ricardo Campos, Adam Jatowt, Yanyan Lan, Mohammad Aliannejadi, Christine Bauer, Sean MacAvaney, Avishek Anand, Nan Bai, Masoud Mansoury, Zhaochun Ren, Suzan Verberne
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 282-291
Number of pages: 10
ISBN (Print): 9783032212993
DOI
Publication status: Published - 2026
Event: 48th European Conference on Information Retrieval, ECIR 2026 - Delft, Netherlands
Duration: 29 Mar 2026 to 2 Apr 2026

Publication series

Name: Lecture Notes in Computer Science
Volume: 16484 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 48th European Conference on Information Retrieval, ECIR 2026
Country/Territory: Netherlands
City: Delft
Period: 29/03/26 to 2/04/26
