U-Net conditional GANs for photo-realistic and identity-preserving facial expression synthesis

Research output: Contribution to journal › Article › peer-review

Abstract

Facial expression synthesis (FES) is a challenging task since the expression changes are highly non-linear and depend on the facial appearance. Person identity should also be well preserved in the synthesized face. In this article, we present a novel U-Net Conditional Generative Adversarial Network for FES. U-Net helps retain the property of the input face, including the identity information and facial details. Category condition is added to the U-Net model so that one-to-many expression synthesis can be achieved simultaneously. We also design constraints for identity preservation during FES to further guarantee that the identity of the input face can be well preserved in the generated face image. Specifically, we pair the generated output with condition image of other identities for the discriminator, so as to encourage it to learn the distinctions between the synthesized and natural images, as well as between input and other identities, which can help improve its discriminating ability. Additionally, we utilize the triplet loss to maintain the generated face images closer to the same identity person by imposing a margin between the positive pairs and negative pairs in feature space. Both qualitative and quantitative evaluations are conducted on the Oulu-CASIA NIR&VIS facial expression database, the Radboud Faces Database, and the Karolinska Directed Emotional Faces database, and the experimental results show that our method can generate faces with natural and realistic expressions while preserving identity information.
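
The triplet-loss constraint mentioned in the abstract can be illustrated with a minimal sketch. This is the commonly used hinge formulation over squared Euclidean distances, not necessarily the exact loss used in the article. The function names, the margin value of 0.5, the choice of the generated face as the anchor, and the 2-D feature vectors are all assumptions made for illustration.

```python
from typing import Sequence


def squared_l2(u: Sequence[float], v: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 0.5,  # assumed value, not taken from the article
) -> float:
    """Hinge-style triplet loss.

    The loss is zero once the negative is at least `margin` farther
    from the anchor (in squared distance) than the positive is.
    """
    gap = squared_l2(anchor, positive) - squared_l2(anchor, negative)
    return max(0.0, gap + margin)


# Hypothetical 2-D identity features, for illustration only.
generated = [0.0, 0.0]     # anchor: feature of the synthesized face
same_person = [0.0, 1.0]   # positive: real face of the input identity
other_person = [3.0, 0.0]  # negative: real face of a different identity

# Identity preserved: the positive is much closer than the negative.
print(triplet_loss(generated, same_person, other_person))  # 0.0

# Roles swapped: the "positive" is farther away, so the loss is large.
print(triplet_loss(generated, other_person, same_person))  # 8.5
```

In the first call the squared distances are 1 and 9, so the margin is already satisfied and the loss is zero. In the second call the roles are swapped and the loss is 9 - 1 + 0.5 = 8.5, which is the kind of penalty that would push a generator toward features of the input identity.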

Original language: English
Article number: A88
Journal: ACM Transactions on Multimedia Computing, Communications and Applications
Volume: 15
Issue number: 3s
DOIs
State: Published - Oct 2019

Keywords

  • Facial expression synthesis
  • Generative adversarial networks (GANs)
  • Identity preserving
