This is a re-implementation of TransGAN: Two Pure Transformers Can Make One Strong GAN (CVPR 2021) in PyTorch.

Last update: Jan 05, 2023

Overview

TransGAN: Two Transformers Can Make One Strong GAN [YouTube Video]

Paper Authors: Yifan Jiang, Shiyu Chang, Zhangyang Wang

CVPR 2021

This is a re-implementation of TransGAN: Two Transformers Can Make One Strong GAN, and That Can Scale Up, CVPR 2021 in PyTorch.

This implementation builds a Generative Adversarial Network (GAN) completely free of convolutions, using the Transformer architectures that have become popular since the Vision Transformer (ViT). The CIFAR-10 dataset is used.

Generated samples at 0, 40, 100, and 200 epochs (images not shown).

Related Work - Vision Transformers (ViT)

In this implementation, a Vision Transformer (ViT) block is used as the discriminator. For more information about ViT, see the original paper, An Image is Worth 16x16 Words (listed under Citation).

Credits for illustration of ViT: @lucidrains

Installation

Before running train.py, make sure the libraries listed in requirements.txt are installed. Also create a ./fid_stat folder and download the fid_stats_cifar10_train.npz file into it. To save your model during training, create a ./checkpoint folder using mkdir checkpoint.
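The folder setup described above can also be scripted. The following is a minimal sketch, not part of the repository: the helper name prepare_workspace is hypothetical, and it only assumes the folder and file names mentioned in this section. It creates both folders and reports whether the FID statistics file is already in place.

```python
from pathlib import Path


def prepare_workspace(root="."):
    """Create the folders train.py expects and check for the FID stats file.

    Hypothetical helper (not part of the repository). Returns True if
    fid_stats_cifar10_train.npz is already present in ./fid_stat.
    """
    root = Path(root)
    # Folder names are taken from the installation notes above.
    for name in ("fid_stat", "checkpoint"):
        (root / name).mkdir(parents=True, exist_ok=True)
    stats_file = root / "fid_stat" / "fid_stats_cifar10_train.npz"
    return stats_file.is_file()
```

Calling prepare_workspace() from the repository root before python train.py should leave both folders in place; a False return value means the statistics file still has to be downloaded.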

Training

python train.py

Pretrained Model

A pretrained model is available here. You can download it using:

wget https://drive.google.com/file/d/134GJRMxXFEaZA0dF-aPpDS84YjjeXPdE/view

curl gdrive.sh | bash -s https://drive.google.com/file/d/134GJRMxXFEaZA0dF-aPpDS84YjjeXPdE/view
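Note that a Google Drive link ending in /view usually points to the file viewer page rather than to the file itself, so fetching it directly may save an HTML page. A minimal sketch of one workaround is shown below, assuming the commonly used uc?export=download URL form; the function names are hypothetical, and large files may still require an extra confirmation step that this sketch does not handle.

```python
import re


def drive_file_id(share_url):
    """Extract the file ID from a Google Drive sharing URL (hypothetical helper)."""
    match = re.search(r"/file/d/([A-Za-z0-9_-]+)", share_url)
    if match is None:
        raise ValueError(f"no Drive file ID found in: {share_url}")
    return match.group(1)


def drive_direct_url(share_url):
    """Build a direct-download URL from a sharing URL.

    Assumes the uc?export=download endpoint; Google may change this
    behaviour or ask for confirmation on large files.
    """
    return "https://drive.google.com/uc?export=download&id=" + drive_file_id(share_url)
```

The resulting URL can then be passed to wget or a similar tool in place of the /view link.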

License

MIT

Citation

@article{jiang2021transgan,
  title={TransGAN: Two Transformers Can Make One Strong GAN},
  author={Jiang, Yifan and Chang, Shiyu and Wang, Zhangyang},
  journal={arXiv preprint arXiv:2102.07074},
  year={2021}
}

@article{dosovitskiy2020,
  title={An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale},
  author={Dosovitskiy, Alexey and Beyer, Lucas and Kolesnikov, Alexander and Weissenborn, Dirk and Zhai, Xiaohua and Unterthiner, Thomas and  Dehghani, Mostafa and Minderer, Matthias and Heigold, Georg and Gelly, Sylvain and Uszkoreit, Jakob and Houlsby, Neil},
  journal={arXiv preprint arXiv:2010.11929},
  year={2020}
}

@inproceedings{zhao2020diffaugment,
  title={Differentiable Augmentation for Data-Efficient GAN Training},
  author={Zhao, Shengyu and Liu, Zhijian and Lin, Ji and Zhu, Jun-Yan and Han, Song},
  booktitle={Conference on Neural Information Processing Systems (NeurIPS)},
  year={2020}
}

Comments

GPU memory, Modifying batch size
Hello,

I saw your comment in VITA-Group's implementation of TransGAN and started looking at your implementation here.

Running "python train.py" without modifying anything results in CUDA out of memory; I believe the GPU I'm using cannot handle the model size/training images that you've specified. I tried editing the batch size on lines 35 and 36 of train.py (--gener_batch_size, changing default from 64 to 32, etc.), but I get a RuntimeError of:

Output 0 of UnbindBackward is a view and is being modified inplace. This view is the output of a function that returns multiple views. Such functions do not allow the output views to be modified inplace. You should replace the inplace operation by an out-of-place one.

My two questions are:

How would you suggest modifying the training parameters to deal with GPU running out of memory? and,

Is there a better way to edit the batch size, and what else do I need to change in order for the code to not break when the batch size is changed?

Thanks!
opened by Andrew-X-Wang 10
Create your own FID stats file

Hello and thanks for the implementation. I'm trying to train this model on a different dataset, but to do so I need a custom fid_stats file for my dataset. How can I create it?

opened by IlyasMoutawwakil 2
FID score: nan

Thank you for your contribution. During training, the FID score is NaN. I would like to know whether this is expected. Should I make some change to solve this problem?

opened by Jamie-Cheung 1
TransGAN fid problem

Hello, I would like to ask what the difference is between TransGAN-main and TransGAN-master. Can TransGAN-main reproduce results similar to those of the original paper? The results I obtained on CIFAR with TransGAN-main are quite different from those in the paper, and the WGAN-EP loss oscillates, so I wanted to ask.

opened by Stephenlove 1
How do you test on your own dataset with the checkpoint.pth generated?

I want to use the checkpoint saved to generate my own results from a testing dataset and use those images later to calculate my own evaluation metrics. Please help

opened by meh-naz 0

Releases(v2.0)

v2.0(Jul 6, 2021)

Higher-quality generated images with TransGAN on the CIFAR-10 dataset.
v1.0(May 31, 2021)

In this version of re-implementation, MNIST and CIFAR-10 datasets were used for TransGAN-S.

Owner

Ahmet Sarigun

Yet, another human being!

GitHub Repository https://arxiv.org/abs/2102.07074
