
Aligning Large Language Models with Implicit Preferences from User-Generated Content

  • Zhaoxuan Tan*
  • Zheng Li
  • Tianyi Liu
  • Haodong Wang
  • Hyokun Yun
  • Ming Zeng
  • Pei Chen
  • Zhihan Zhang
  • Yifan Gao
  • Ruijie Wang
  • Priyanka Nigam
  • Bing Yin
  • Meng Jiang
  • *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Peer-reviewed

Abstract

Learning from preference feedback is essential for aligning large language models (LLMs) with human values and improving the quality of generated responses. However, existing preference learning methods rely heavily on curated data from humans or advanced LLMs, which is costly and difficult to scale. In this work, we present PUGC, a novel framework that leverages implicit human Preferences in unlabeled User-Generated Content (UGC) to generate preference data. Although UGC is not explicitly created to guide LLMs in generating human-preferred responses, it often reflects valuable insights and implicit preferences from its creators that have the potential to address readers' questions. PUGC transforms UGC into user queries and generates responses from the policy model. The UGC is then leveraged as a reference text for response scoring, aligning the model with these implicit preferences. This approach improves the quality of preference data while enabling scalable, domain-specific alignment. Experimental results on Alpaca Eval 2 show that models trained with DPO and PUGC achieve a 9.37% performance improvement over traditional methods, setting a 35.93% state-of-the-art length-controlled win rate using Mistral-7B-Instruct. Further studies highlight gains in reward quality, domain-specific alignment effectiveness, robustness against UGC quality, and theory of mind capabilities. Our code and dataset are available at https://zhaoxuan.info/PUGC.github.io/.
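
The pipeline described in the abstract (UGC to query, policy-model responses, reference-based scoring, preference data) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Every name here (`PreferencePair`, `build_preference_pair`, `toy_overlap_score`) is hypothetical, the query generator and policy model are passed in as plain callables, the token-overlap scorer is only a stand-in for whatever reference-aware reward model is actually used, and pairing the highest-scored response with the lowest-scored one is an assumption rather than something the abstract states.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class PreferencePair:
    """One training example in the (query, chosen, rejected) shape used by DPO-style training."""
    query: str
    chosen: str
    rejected: str


def build_preference_pair(
    ugc: str,
    to_query: Callable[[str], str],
    sample_responses: Callable[[str], List[str]],
    score: Callable[[str, str, str], float],
) -> Optional[PreferencePair]:
    """Turn one piece of UGC into a preference pair, or None if no usable pair exists."""
    query = to_query(ugc)                # UGC -> user query
    responses = sample_responses(query)  # responses sampled from the policy model
    if len(responses) < 2:
        return None
    # Score each response with the UGC as reference text.
    scored = [(score(query, r, ugc), r) for r in responses]
    best = max(scored, key=lambda s: s[0])
    worst = min(scored, key=lambda s: s[0])
    if best[0] == worst[0]:
        return None  # no preference signal when all scores tie
    # Assumption: pair the highest-scored response against the lowest-scored one.
    return PreferencePair(query=query, chosen=best[1], rejected=worst[1])


def toy_overlap_score(query: str, response: str, reference: str) -> float:
    """Toy stand-in for a reward model: fraction of response tokens found in the reference."""
    ref_tokens = set(reference.lower().split())
    resp_tokens = set(response.lower().split())
    return len(ref_tokens & resp_tokens) / max(len(resp_tokens), 1)


# Usage with hard-coded stand-ins for the query generator and the policy model.
ugc_text = "Descale the kettle with vinegar once a month to remove limescale."
pair = build_preference_pair(
    ugc_text,
    to_query=lambda u: "How do I remove limescale from a kettle?",
    sample_responses=lambda q: [
        "Buy a new kettle.",
        "Descale the kettle with vinegar once a month.",
    ],
    score=toy_overlap_score,
)
print(pair)
```

In a real setting the three callables would wrap model calls, and the resulting pairs would be passed to a preference-optimization trainer such as DPO.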

Original language: English
Title of host publication: Long Papers
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Publisher: Association for Computational Linguistics (ACL)
Pages: 7792-7820
Number of pages: 29
ISBN (Electronic): 9798891762510
Publication status: Published - 2025
Externally published
Event: 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025 - Vienna, Austria
Duration: 27 Jul 2025 to 1 Aug 2025

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
Volume: 1
ISSN (Print): 0736-587X

Conference

Conference: 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025
Country/Territory: Austria
City: Vienna
Period: 27/07/25 to 1/08/25
