RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models

Zekun Moore Wang, Zhongyuan Peng, Haoran Que, Jiaheng Liu*, Wangchunshu Zhou, Yuhan Wu, Hongcheng Guo, Ruitong Gan, Zehao Ni, Jian Yang, Man Zhang, Zhaoxiang Zhang*, Wanli Ouyang, Ke Xu, Stephen W. Huang, Jie Fu, Junran Peng

*Corresponding author for this work

Affiliations:
  • Beihang University
  • Hong Kong University of Science and Technology
  • University of Chinese Academy of Sciences
  • Swiss Federal Institute of Technology Zurich
  • Beijing University of Posts and Telecommunications
  • Hong Kong Polytechnic University
  • CAS - Institute of Automation
  • Shanghai Artificial Intelligence Laboratory
  • Harmony.ai

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The advent of Large Language Models (LLMs) has paved the way for complex tasks such as role-playing, which enhances user interactions by enabling models to imitate various characters. However, the closed-source nature of state-of-the-art LLMs and their general-purpose training limit role-playing optimization. In this paper, we introduce RoleLLM, a framework to benchmark, elicit, and enhance role-playing abilities in LLMs. RoleLLM comprises four stages: (1) Role Profile Construction for 100 roles; (2) Context-Based Instruction Generation (Context-Instruct) for role-specific knowledge extraction; (3) Role Prompting using GPT (RoleGPT) for speaking style imitation; and (4) Role-Conditioned Instruction Tuning (RoCIT) for fine-tuning open-source models along with role customization. Using Context-Instruct and RoleGPT, we create RoleBench, the first systematic and fine-grained character-level benchmark dataset for role-playing, with 168,093 samples. Moreover, applying RoCIT on RoleBench yields RoleLLaMA (English) and RoleGLM (Chinese), significantly enhancing role-playing abilities and even achieving results comparable to RoleGPT (using GPT-4).

Original language: English
Title of host publication: The 62nd Annual Meeting of the Association for Computational Linguistics
Subtitle of host publication: Findings of the Association for Computational Linguistics, ACL 2024
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Publisher: Association for Computational Linguistics (ACL)
Pages: 14743-14777
Number of pages: 35
ISBN (Electronic): 9798891760998
State: Published - 2024
Event: Findings of the 62nd Annual Meeting of the Association for Computational Linguistics, ACL 2024 - Hybrid, Bangkok, Thailand
Duration: 11 Aug 2024 to 16 Aug 2024

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (Print): 0736-587X

Conference

Conference: Findings of the 62nd Annual Meeting of the Association for Computational Linguistics, ACL 2024
Country/Territory: Thailand
City: Hybrid, Bangkok
Period: 11/08/24 to 16/08/24
