TY - JOUR
T1 - Patch Inverter
T2 - A Novel Block-Wise GAN Inversion Method for Arbitrary Image Resolutions
AU - Li, Yifei
AU - Xu, Mai
AU - Li, Shengxi
AU - Zhang, Jialu
AU - Guan, Zhenyu
N1 - Publisher Copyright: © 1994-2012 IEEE.
PY - 2025
Y1 - 2025
AB - Generative adversarial networks (GANs) have achieved remarkable progress in generating realistic images from merely small dimensions, which essentially establishes the latent generating space by rich semantics. GAN inversion thus aims at mapping real-world images back into the latent space, allowing for the access of semantics from images. However, existing GAN inversion methods can only invert images with fixed resolutions; this significantly restricts the representation capability in real-world scenarios. To address this issue, we propose to invert images by patches, thus named as patch inverter, which is the first attempt in terms of block-wise inversion for arbitrary resolutions. More specifically, we develop the padding-free operation to ensure the continuity across patches, and analyse the intrinsic mismatch within the inversion procedure. To relieve the mismatch, we propose a shifted convolution operation, which retains the continuity across image patches and simultaneously enlarges the receptive field for each convolution layer. We further propose the reciprocal loss to regularize the inverted latent codes to reside on the original latent generating space, such that the rich semantics can be maximally preserved. Experimental results have demonstrated that our patch inverter is able to accurately invert images with arbitrary resolutions, whilst representing precise and rich image semantics in real-world scenarios.
KW - Block-wise synthesis
KW - GAN inversion
KW - representation learning
UR - https://www.scopus.com/pages/publications/85211004122
U2 - 10.1109/LSP.2024.3506859
DO - 10.1109/LSP.2024.3506859
M3 - Article
AN - SCOPUS:85211004122
SN - 1070-9908
VL - 32
SP - 171
EP - 175
JO - IEEE Signal Processing Letters
JF - IEEE Signal Processing Letters
ER -