GANformer introduces a transformer architecture tailored for high-resolution image synthesis. Its bipartite structure supports efficient long-range interactions across the image, enabling the generation of complex scenes with compositional representations. The model iteratively propagates information between latent variables and visual features, promoting the emergence of detailed and diverse images. GANformer extends the StyleGAN framework, generalizing its global style modulation into region-based modulation applied through multiplicative integration.
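At a high level, each bipartite layer lets a small set of latent components attend over the grid of image features and modulate them region by region. The PyTorch sketch below illustrates that idea only; class names, argument names, and dimensions are assumptions for illustration and do not mirror the repository's actual API.

```python
# Minimal sketch of bipartite attention with region-based multiplicative
# modulation: k latent components attend over an H*W grid of image features,
# and the resulting soft region assignments modulate the features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BipartiteModulation(nn.Module):
    def __init__(self, latent_dim, feature_dim, attn_dim=64):
        super().__init__()
        self.to_q = nn.Linear(feature_dim, attn_dim)        # queries from image features
        self.to_k = nn.Linear(latent_dim, attn_dim)         # keys from latent components
        self.to_scale = nn.Linear(latent_dim, feature_dim)  # per-latent style scales
        self.to_shift = nn.Linear(latent_dim, feature_dim)  # per-latent style shifts

    def forward(self, latents, features):
        # latents:  [batch, k, latent_dim]   -- k latent components
        # features: [batch, n, feature_dim]  -- n = H*W flattened grid positions
        q = self.to_q(features)                              # [batch, n, attn_dim]
        k = self.to_k(latents)                               # [batch, k, attn_dim]
        attn = torch.einsum('bnd,bkd->bnk', q, k)            # feature-to-latent affinities
        attn = F.softmax(attn / q.shape[-1] ** 0.5, dim=-1)  # soft region assignment

        # Aggregate style parameters per grid position from the attended latents,
        # then modulate the normalized features multiplicatively (region-based, not global).
        scale = torch.einsum('bnk,bkf->bnf', attn, self.to_scale(latents))
        shift = torch.einsum('bnk,bkf->bnf', attn, self.to_shift(latents))
        normalized = F.layer_norm(features, features.shape[-1:])
        return normalized * (1 + scale) + shift, attn

# Example: 16 latent components modulating an 8x8 feature grid.
layer = BipartiteModulation(latent_dim=32, feature_dim=128)
latents = torch.randn(4, 16, 32)
features = torch.randn(4, 8 * 8, 128)
out, attn = layer(latents, features)
print(out.shape, attn.shape)  # torch.Size([4, 64, 128]) torch.Size([4, 64, 16])
```

Because every grid position attends to only k latent components rather than to every other position, the cost grows linearly in the number of positions, which is what makes high-resolution synthesis tractable.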
Key Features
Dual Framework Support: Implementations available in both TensorFlow and PyTorch.
High-Resolution Generation: Capable of generating images up to 1024×2048 pixels.
Pretrained Models: Access to pretrained models for datasets like FFHQ, LSUN-Bedrooms, CLEVR, and Cityscapes.
Training & Evaluation Tools: Includes scripts for training, evaluation, and dataset preparation.
Visualization Support: Tools for generating attention maps and latent-space visualizations (see the sketch after this list).
Baseline Comparisons: Options to compare GANformer with models like StyleGAN2, k-GAN, and SAGAN.
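As a taste of the visualization tooling, the snippet below shows one way per-component attention maps could be rendered. It assumes an attention tensor of shape [k, H, W] (one soft assignment map per latent component), uses random placeholder data in place of a real model's output, and none of the names come from the repository.

```python
# Hypothetical rendering of per-component attention maps as a row of heatmaps.
import numpy as np
import matplotlib.pyplot as plt

k, H, W = 8, 64, 64
# Placeholder attention: for each grid cell, a distribution over the k components.
attention = np.random.dirichlet(np.ones(k), size=(H, W)).transpose(2, 0, 1)  # [k, H, W]

fig, axes = plt.subplots(1, k, figsize=(2 * k, 2))
for i, ax in enumerate(axes):
    ax.imshow(attention[i], cmap='viridis', vmin=0, vmax=1)
    ax.set_title(f'component {i}')
    ax.axis('off')
plt.tight_layout()
plt.savefig('attention_maps.png')
```

In practice the maps would be upsampled to the output resolution and overlaid on the generated image, showing which latent component governs which region of the scene.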