
TODO: ENHANCING LLM ALIGNMENT WITH TERNARY PREFERENCES

  • Yuxiang Guo
  • Lu Yin
  • Bo Jiang
  • Jiaqi Zhang*
  • *Corresponding author for this work
  • Meituan
  • Beihang University
  • University of Surrey

Research output: Chapter in book/report/conference proceeding; conference contribution; peer-reviewed

Abstract

Aligning large language models (LLMs) with human intent is critical for enhancing their performance across a variety of tasks. Standard alignment techniques, such as Direct Preference Optimization (DPO), often rely on the binary Bradley-Terry (BT) model, which can struggle to capture the complexities of human preferences, particularly in the presence of noisy or inconsistent labels and frequent ties. To address these limitations, we introduce the Tie-rank Oriented Bradley-Terry model (TOBT), an extension of the BT model that explicitly incorporates ties, enabling more nuanced preference representation. Building on this, we propose Tie-rank Oriented Direct Preference Optimization (TODO), a novel alignment algorithm that leverages TOBT's ternary ranking system to improve preference alignment. In evaluations on Mistral-7B and Llama 3-8B models, TODO consistently outperforms DPO in modeling preferences across both in-distribution and out-of-distribution datasets. Additional assessments using MT Bench and benchmarks such as PIQA, ARC-c, and MMLU further demonstrate TODO's superior alignment performance. Notably, TODO also shows strong results in binary preference alignment, highlighting its versatility and potential for broader integration into LLM alignment. The implementation details and datasets can be found at https://github.com/XXares/TODO.
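The abstract does not state the TOBT formula, so the following is only a minimal sketch of how a Bradley-Terry model can be extended with an explicit tie outcome. It uses the classical Rao-Kupper tie model as a stand-in, which may differ from the paper's formulation. The function names, the reward arguments `r_a` and `r_b`, and the tie-margin parameter `alpha` are all assumptions made for illustration; see the linked repository for the authors' actual implementation.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def ternary_preference_probs(r_a: float, r_b: float, alpha: float = 0.5):
    """Return (P(a wins), P(b wins), P(tie)) under a Rao-Kupper-style model.

    ASSUMPTION: this is the classical Rao-Kupper extension of Bradley-Terry,
    used here as an illustration; it is not taken from the paper's text.
    alpha >= 0 is a hypothetical tie margin; alpha = 0 recovers binary BT.
    """
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    p_a = sigmoid(r_a - r_b - alpha)  # a must beat b by more than the margin
    p_b = sigmoid(r_b - r_a - alpha)  # symmetric case for b
    p_tie = max(0.0, 1.0 - p_a - p_b)  # remaining mass, clamped for rounding
    return p_a, p_b, p_tie


def ternary_nll(r_a: float, r_b: float, label: str, alpha: float = 0.5) -> float:
    """Negative log-likelihood of one labeled pair; label is 'a', 'b', or 'tie'."""
    p_a, p_b, p_tie = ternary_preference_probs(r_a, r_b, alpha)
    p = {"a": p_a, "b": p_b, "tie": p_tie}[label]
    return math.inf if p <= 0.0 else -math.log(p)


if __name__ == "__main__":
    # Two responses with nearly equal rewards: the tie outcome gets real mass.
    print(ternary_preference_probs(1.0, 0.9, alpha=0.5))
    # A clearly better response: the tie probability shrinks.
    print(ternary_preference_probs(3.0, 0.0, alpha=0.5))
```

In this sketch the tie probability is largest when the two rewards are equal and shrinks as they diverge, which is the qualitative behavior a ternary preference model needs. In a DPO-style setup the rewards would typically be replaced by implicit rewards derived from policy and reference log-probabilities, but that step is not shown here.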

Original language: English
Title of host publication: 13th International Conference on Learning Representations, ICLR 2025
Publisher: International Conference on Learning Representations, ICLR
Pages: 86896-86917
Number of pages: 22
ISBN (electronic): 9798331320850
Publication status: Published - 2025
Event: 13th International Conference on Learning Representations, ICLR 2025 - Singapore, Singapore
Duration: 24 Apr 2025 to 28 Apr 2025

Publication series

Name: 13th International Conference on Learning Representations, ICLR 2025

Conference

Conference: 13th International Conference on Learning Representations, ICLR 2025
Country/Territory: Singapore
City: Singapore
Period: 24/04/25 to 28/04/25
