
Qwen2.5-xCoder: Multi-Agent Collaboration for Multilingual Code Instruction Tuning

  • Jian Yang
  • Wei Zhang
  • Jiaxi Yang
  • Yibo Miao
  • Shanghaoran Quan
  • Zhenhe Wu
  • Qiyao Peng*
  • Liqun Yang
  • Tianyu Liu
  • Zeyu Cui
  • Binyuan Hui
  • Junyang Lin
  • *Corresponding author for this work
  • Alibaba Group Holding Ltd.
  • Beihang University
  • Tianjin University

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Recent advances in code understanding and generation demonstrate that code LLMs fine-tuned on a high-quality instruction dataset can gain powerful capabilities for a wide range of code-related tasks. However, most existing methods view each programming language in isolation and ignore knowledge transfer among programming languages. To bridge the gap among programming languages, we introduce a novel multi-agent collaboration framework to enhance multilingual instruction tuning for code LLMs, in which multiple language-specific agents with generation memory work together to transfer knowledge from one language to another efficiently and effectively. Specifically, we first generate language-specific instruction data from code snippets and then provide the generated data as seed data for the language-specific agents. Multiple language-specific agents discuss and collaborate to formulate a new instruction and its corresponding solution (in a new or an existing programming language). To further encourage cross-lingual transfer, each agent stores its generation history as memory and then summarizes its merits and faults. Finally, the resulting high-quality multilingual instruction data is used to train Qwen2.5-xCoder, encouraging knowledge transfer among programming languages. Experimental results on multilingual programming benchmarks demonstrate the superior performance of Qwen2.5-xCoder in sharing common knowledge, highlighting its potential to reduce the cross-lingual gap.
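
The collaboration loop described in the abstract can be pictured with a toy sketch. This is not the authors' implementation: the names `LanguageAgent`, `propose`, `remember`, `summarize`, and `collaborate` are hypothetical, and plain string operations stand in for the LLM calls that would generate, discuss, and critique instructions in a real system.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LanguageAgent:
    """Hypothetical language-specific agent holding seed data and a memory."""
    language: str
    seeds: List[str]                                   # seed instructions for this language
    memory: List[str] = field(default_factory=list)   # generation history

    def propose(self, round_idx: int) -> str:
        # Stand-in for an LLM call: pick a seed instruction to contribute.
        seed = self.seeds[round_idx % len(self.seeds)]
        return f"[{self.language}] {seed}"

    def remember(self, instruction: str, target_language: str) -> None:
        # Store what was generated so later rounds can reflect on it.
        self.memory.append(f"{target_language}: {instruction}")

    def summarize(self) -> str:
        # Stand-in for the merits-and-faults summary mentioned in the abstract.
        return f"{self.language} agent took part in {len(self.memory)} generation(s)"


def collaborate(agents: List[LanguageAgent], target_language: str,
                round_idx: int = 0) -> Dict[str, str]:
    """Combine the agents' proposals into one new instruction (toy version)."""
    proposals = [agent.propose(round_idx) for agent in agents]
    instruction = (f"Write a {target_language} solution that combines: "
                   + "; ".join(proposals))
    for agent in agents:
        agent.remember(instruction, target_language)
    return {"language": target_language, "instruction": instruction}


if __name__ == "__main__":
    agents = [
        LanguageAgent("Python", ["reverse a string"]),
        LanguageAgent("Java", ["sum a list of integers"]),
    ]
    sample = collaborate(agents, target_language="Rust")
    print(sample["instruction"])
    for agent in agents:
        print(agent.summarize())
```

The target language may differ from every agent's own language, which mirrors the abstract's "new or existing programming language" option; how the actual framework selects it is not stated in this record.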

Original language: English
Title of host publication: Long Papers
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Publisher: Association for Computational Linguistics (ACL)
Pages: 13121-13131
Number of pages: 11
ISBN (electronic): 9798891762510
DOI
Publication status: Published - 2025
Event: 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025 - Vienna, Austria
Duration: 27 Jul 2025 to 1 Aug 2025

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
Volume: 1
ISSN (print): 0736-587X

Conference

Conference: 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025
Country/Territory: Austria
City: Vienna
Period: 27/07/25 to 1/08/25
