Abstract
Recent advancements in large language models (LLMs) have broadened their application scope but revealed challenges in balancing capabilities across general knowledge, coding, and mathematics. To address this, we introduce a Collaborative and Semantic Experts (CoE) approach for supervised fine-tuning (SFT), which employs a two-phase training strategy. Initially, expert training fine-tunes the feed-forward network on specialized datasets, developing distinct experts in targeted domains. Subsequently, expert leveraging synthesizes these trained experts into a structured model with semantic guidance to activate specific experts, enhancing performance and interpretability. Evaluations on comprehensive benchmarks across MMLU, HumanEval, GSM8K, MT-Bench, and AlpacaEval confirm CoE’s efficacy, demonstrating improved performance and expert collaboration in diverse tasks, significantly outperforming traditional SFT methods.
| Original language | English |
|---|---|
| Pages (from-to) | 25624-25632 |
| Number of pages | 9 |
| Journal | Proceedings of the AAAI Conference on Artificial Intelligence |
| Volume | 39 |
| Issue number | 24 |
| DOIs | |
| State | Published - 11 Apr 2025 |
| Externally published | Yes |
| Event | 39th Annual AAAI Conference on Artificial Intelligence, AAAI 2025 - Philadelphia, United States Duration: 25 Feb 2025 → 4 Mar 2025 |
Fingerprint
Dive into the research topics of 'Fine-Tuning Language Models with Collaborative and Semantic Experts'. Together they form a unique fingerprint.Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver