pi-GAN | CVPR 2021

A new GAN architecture and training procedure for 3D-aware image synthesis using periodic implicit neural representations.

ABSTRACT

We have witnessed rapid progress on 3D-aware image synthesis, leveraging recent advances in generative visual models and neural rendering. Existing approaches however fall short in two ways: first, they may lack an underlying 3D representation or rely on view-inconsistent rendering, hence synthesizing images that are not multi-view consistent; second, they often depend upon representation network architectures that are not expressive enough, and their results thus lack in image quality. We propose a novel generative model, named Periodic Implicit Generative Adversarial Networks (π-GAN or pi-GAN), for high-quality 3D-aware image synthesis. π-GAN leverages neural representations with periodic activation functions and volumetric rendering to represent scenes as view-consistent 3D representations with fine detail. The proposed approach obtains state-of-the-art results for 3D-aware image synthesis with multiple real and synthetic datasets.
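As a rough illustration of the generator idea named in the abstract (an implicit representation with periodic activations, conditioned on a latent code and rendered volumetrically), the PyTorch sketch below shows a FiLM-style sine layer and a toy conditional radiance field. The class names, layer widths, and frequency ranges are ours for illustration and do not reflect the paper's actual configuration.

import torch
import torch.nn as nn

class FiLMSirenLayer(nn.Module):
    """Sine-activated layer whose frequencies and phases are supplied
    externally (FiLM-style conditioning on a latent code)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, freq, phase):
        return torch.sin(freq * self.linear(x) + phase)

class TinyRadianceField(nn.Module):
    """Toy conditional radiance field mapping 3D points to (density, RGB)."""
    def __init__(self, hidden=64, n_layers=3):
        super().__init__()
        dims = [3] + [hidden] * n_layers
        self.layers = nn.ModuleList(
            [FiLMSirenLayer(dims[i], dims[i + 1]) for i in range(n_layers)]
        )
        self.head = nn.Linear(hidden, 4)  # 1 density + 3 color channels

    def forward(self, points, freqs, phases):
        # points: (N, 3); freqs, phases: (n_layers, hidden) from a mapping network.
        h = points
        for i, layer in enumerate(self.layers):
            h = layer(h, freqs[i], phases[i])
        return self.head(h)

# Query the toy field at random points with random conditioning parameters.
field = TinyRadianceField()
points = torch.rand(1024, 3) * 2 - 1
freqs = torch.randn(3, 64) * 15 + 30   # broad frequency range, illustrative only
phases = torch.randn(3, 64)
sigma_rgb = field(points, freqs, phases)   # (1024, 4)

In the full model, the per-layer frequencies and phase shifts would come from a mapping network applied to the latent code, and the resulting densities and colors would be integrated along camera rays by a volume renderer.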

CITATION

E. Chan*, M. Monteiro*, P. Kellnhofer, J. Wu, G. Wetzstein, pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis, CVPR 2021 (oral)

@inproceedings{Chan2021pigan,
  author    = {Eric Chan and Marco Monteiro and Petr Kellnhofer and Jiajun Wu and Gordon Wetzstein},
  title     = {pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis},
  booktitle = {CVPR},
  year      = {2021}
}

Selected examples synthesized by π-GAN on the CelebA and Cats datasets.
π-GAN leverages recent advances in generative visual models and neural rendering to produce high-quality, multi-view-consistent images.
π-GAN relies on an underlying multi-view-consistent 3D representation. We can extract this representation as a mesh using the marching cubes algorithm, as sketched below.
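A minimal sketch of such an extraction is shown here: it samples a scalar density field on a regular grid and runs marching cubes from scikit-image. The function name `density_fn`, the grid resolution, and the iso-level are placeholders, not the paper's settings.

import torch
from skimage.measure import marching_cubes

@torch.no_grad()
def extract_mesh(density_fn, resolution=128, bound=1.0, level=10.0):
    """Sample a density field on a regular grid and extract an iso-surface.
    `density_fn` maps (N, 3) points to (N,) densities; `level` is a
    hand-picked iso-value."""
    xs = torch.linspace(-bound, bound, resolution)
    grid = torch.stack(torch.meshgrid(xs, xs, xs, indexing="ij"), dim=-1)  # (R, R, R, 3)
    sigma = density_fn(grid.reshape(-1, 3)).reshape(resolution, resolution, resolution)
    verts, faces, _, _ = marching_cubes(
        sigma.cpu().numpy(),
        level=level,
        spacing=(2 * bound / resolution,) * 3,
    )
    return verts - bound, faces  # shift vertices back to the [-bound, bound] cube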
The underlying 3D structural representation makes π-GAN more capable of rendering views outside the training distribution of camera poses than previous methods, which lacked a 3D representation or relied on black-box neural rendering. π-GAN offers explicit control over position, rotation, focal length, and other camera parameters. Despite being trained only on closely cropped images of cats, π-GAN can render images at much higher or much lower magnification.
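To make the camera-control claim concrete, here is a minimal NeRF-style pinhole ray generator in PyTorch. Rays like these would feed the volume renderer, and scaling the focal length changes the magnification. The function name and conventions (camera looking down the negative z axis) are assumptions for this sketch, not π-GAN's actual interface.

import torch

def camera_rays(height, width, focal, cam2world):
    """Generate per-pixel ray origins and directions for a pinhole camera.
    `focal` is in pixels; `cam2world` is a 4x4 camera-to-world matrix.
    Increasing `focal` narrows the field of view (higher magnification)."""
    i, j = torch.meshgrid(
        torch.arange(width, dtype=torch.float32),
        torch.arange(height, dtype=torch.float32),
        indexing="xy",
    )
    dirs = torch.stack(
        [(i - width / 2) / focal, -(j - height / 2) / focal, -torch.ones_like(i)],
        dim=-1,
    )  # camera-space directions, camera looks down -z
    rays_d = dirs @ cam2world[:3, :3].T             # rotate into world space
    rays_o = cam2world[:3, 3].expand(rays_d.shape)  # all rays share the camera origin
    return rays_o, rays_d

# Doubling `focal` halves the field of view, i.e. renders a zoomed-in crop of the scene.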
Using a trained π-GAN generator, we can perform single-view reconstruction and novel-view synthesis. After freezing the parameters of our implicit representation, we optimize for the conditioning parameters that produce a radiance field which, when rendered, best matches the target image.
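A minimal sketch of this inversion procedure follows, assuming a hypothetical generator(z, cam_pose) rendering interface and an illustrative 256-dimensional latent. The paper optimizes the conditioning parameters of the implicit representation; for brevity, this sketch optimizes a single latent vector with the generator weights frozen.

import torch

def invert_image(generator, target, cam_pose, n_steps=500, lr=1e-2):
    """Fit a latent code to a single target image with a frozen generator.
    `generator(z, cam_pose)` is a placeholder for whatever rendering
    interface is available; its signature is an assumption."""
    for p in generator.parameters():
        p.requires_grad_(False)                    # freeze the implicit representation

    z = torch.zeros(1, 256, requires_grad=True)    # latent dimensionality is illustrative
    opt = torch.optim.Adam([z], lr=lr)

    for _ in range(n_steps):
        opt.zero_grad()
        rendered = generator(z, cam_pose)
        loss = torch.nn.functional.mse_loss(rendered, target)
        loss.backward()
        opt.step()
    return z.detach()

Once fitted, the same latent code can be rendered from new camera poses to obtain novel views of the reconstructed subject.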

Related Projects

You may also be interested in related projects focusing on neural scene representations and rendering:

  • Kellnhofer et al. Neural Lumigraph Rendering. CVPR 2021 (link)
  • Lindell et al. Automatic Integration for Fast Neural Rendering. CVPR 2021 (link)
  • Sitzmann et al. Implicit Neural Representations with Periodic Activation Functions. NeurIPS 2020 (link)
  • Sitzmann et al. MetaSDF. NeurIPS 2020 (link)
  • Sitzmann et al. Scene Representation Networks. NeurIPS 2019 (link)
  • Sitzmann et al. DeepVoxels. CVPR 2019 (link)