
LaDiffGAN: Training GANs with Diffusion Supervision in Latent Spaces

  • Xuhui Liu
  • Bohan Zeng
  • Sicheng Gao
  • Shanglin Li
  • Yutang Feng
  • Hong Li
  • Boyu Liu
  • Jianzhuang Liu
  • Baochang Zhang*
  • *Corresponding author for this work
  • Beihang University
  • Shenzhen Institute of Advanced Technology
  • Zhongguancun Laboratory
  • Nanchang Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Diffusion models have recently become increasingly popular in a number of computer vision tasks, but they fail to achieve satisfactory results for unsupervised image-to-image translation, since they require massive training data and rely heavily on extra guidance. In this scenario, GANs can alleviate these issues existing in diffusion models, albeit with suboptimal quality. In this paper, we leverage the advantages of both GANs and diffusion models by training GANs with diffusion supervision in latent spaces (LaDiffGAN) to solve the unsupervised image-to-image translation task. Firstly, to promote style transfer quality, we encode the data in specific latent spaces with styles of the target and source domains. Secondly, we introduce the diffusion process with different amounts of Gaussian noise to enhance the modeling capability of GANs on the complex data distribution. We accordingly design a latent diffusion GAN loss to align the latent features between generated and training images. Lastly, we introduce a heterogeneous conditional denoising loss that incorporates image-level supervision to further improve the quality of generated results. Our LaDiffGAN significantly alleviates the drawbacks associated with diffusion models, such as data leakage, high inference cost, and high dependence on large training data sets. Extensive experiments show that LaDiffGAN outperforms previous GAN models and delivers comparable or even better performance than diffusion models.
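
The abstract mentions applying a diffusion process with different amounts of Gaussian noise to latent features. As a rough illustration only, the sketch below uses the standard DDPM-style forward noising step, z_t = sqrt(ᾱ_t)·z_0 + sqrt(1 − ᾱ_t)·ε, on a toy latent vector. The linear beta schedule, the step count, the function names `linear_alpha_bar` and `diffuse_latent`, and the latent values are all assumptions made for this example; the record does not state the schedule or the exact formulation used in LaDiffGAN.

```python
import math
import random


def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bar, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def diffuse_latent(z0, t, alpha_bar, rng):
    """Sample z_t = sqrt(a_bar_t) * z0 + sqrt(1 - a_bar_t) * eps, with eps ~ N(0, I)."""
    a = alpha_bar[t]
    return [math.sqrt(a) * z + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0) for z in z0]


rng = random.Random(0)                     # fixed seed for reproducibility
alpha_bar = linear_alpha_bar(num_steps=1000)
z0 = [0.5, -1.2, 0.3, 2.0]                 # toy latent vector (hypothetical values)
z_small = diffuse_latent(z0, t=10, alpha_bar=alpha_bar, rng=rng)    # light noise
z_large = diffuse_latent(z0, t=999, alpha_bar=alpha_bar, rng=rng)   # heavy noise
```

At a small timestep the noised latent stays close to `z0`, while at the last timestep it is almost pure Gaussian noise. In a setup like the one the abstract describes, both generated and real latents would presumably be noised this way before being compared, but that wiring is not specified here.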

Original language: English
Title of host publication: Proceedings - 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024
Publisher: IEEE Computer Society
Pages: 1115-1125
Number of pages: 11
ISBN (Electronic): 9798350365474
DOI
Publication status: Published - 2024
Event: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024 - Seattle, United States
Duration: 16 Jun 2024 to 22 Jun 2024

Publication series

Name: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
ISSN (Print): 2160-7508
ISSN (Electronic): 2160-7516

Conference

Conference: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024
Country/Territory: United States
City: Seattle
Period: 16/06/24 to 22/06/24
