Abstract
Conditional variational auto-encoders (CVAEs) are a powerful deep generative framework that uses latent variables (explicitly modeled hidden states) to capture underlying factors and govern the generation process accordingly. However, this idea remains underexplored in the era of large language models (LLMs), owing to structural differences between decoder-only LLMs and traditional encoder-decoder CVAEs, as well as the risk of posterior collapse (latent variables degenerating into uninformative, homogeneous representations). In this work, we present the first attempt to extend decoder-only LLMs into encoder-decoder CVAEs, aiming to enhance existing LLMs with flexible control via low-dimensional latent vectors. To this end, we introduce a novel optimization objective for effective latent variable modeling and propose a gradient-only skip (G-Skip) connection, which together enhance generation controllability while preserving generation quality. Through experiments on AGNews, Yelp, and DailyDialog, we validate the effectiveness of our method in latent modeling and latent-guided language generation on the basis of Llama3-8B. In particular, we establish new state-of-the-art performance in dialogue generation on the DailyDialog dataset, achieving a BERTScore of 88.30 and a FED score of 5.49.
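The abstract's proposed optimization objective is not reproduced on this page, but the standard CVAE evidence lower bound (ELBO) it builds on can be sketched as follows. This is a minimal illustration of the textbook objective, not the paper's novel variant: the posterior collapse mentioned above corresponds to the KL term being driven to zero, i.e. the approximate posterior q(z|x, c) matching the prior N(0, I) regardless of the input. The function name and variable names here are illustrative, not from the paper.

```python
import numpy as np

def cvae_elbo_terms(mu, logvar, recon_log_prob):
    """Standard (textbook) CVAE ELBO terms, not the paper's objective.

    mu, logvar : parameters of the diagonal-Gaussian posterior q(z|x, c)
    recon_log_prob : decoder log-likelihood log p(x | z, c)
    """
    # Closed-form KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over dims
    kl = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)
    # ELBO = reconstruction term minus KL regularizer
    elbo = recon_log_prob - kl
    return elbo, kl

# A fully collapsed posterior (mu = 0, logvar = 0) yields KL exactly 0:
mu = np.zeros(16)
logvar = np.zeros(16)
elbo, kl = cvae_elbo_terms(mu, logvar, recon_log_prob=-42.0)
```

When KL stays near zero during training, the decoder is ignoring z, which is exactly the homogeneous-latent failure mode the abstract's objective is designed to avoid.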
| Original language | English |
|---|---|
| Pages (from-to) | 791-805 |
| Number of pages | 15 |
| Journal | IEEE Transactions on Artificial Intelligence |
| Volume | 7 |
| Issue number | 2 |
| State | Published - 2026 |
Keywords
- Conditional variational auto-encoders (CVAEs)
- controllable language generation
- large language models (LLMs)
Article title: Latent Variable Modeling for Controllable and Diverse Generation From Large Language Models