GANsformer: Generative Adversarial Transformers

Overview

Drew A. Hudson* & C. Lawrence Zitnick

*I wish to thank Christopher D. Manning for the fruitful discussions and constructive feedback in developing the Bipartite Transformer, especially when explored within the language representation area, as well as for the kind financial support that allowed this work to happen!

This is an implementation of the GANsformer model, a novel and efficient type of transformer, explored for the task of image generation. The network employs a bipartite structure that enables long-range interactions across the image while maintaining linear computational efficiency, and can readily scale to high-resolution synthesis. The model iteratively propagates information from a set of latent variables to the evolving visual features and vice versa, to support the refinement of each in light of the other and to encourage the emergence of compositional representations of objects and scenes. In contrast to the classic transformer architecture, it utilizes multiplicative integration that allows flexible region-based modulation, and can thus be seen as a generalization of the successful StyleGAN network.

Instructions for model training and data preparation, as well as pretrained models, will be available soon.
Note that the code is still going through some refactoring and clean-up, and will be ready to run by March 3. Stay tuned!
(Code clean-up by March 3, all instructions by March 7, pretrained networks by March 20.)

Bibtex

@article{hudson2021gansformer,
  title={Generative Adversarial Transformers},
  author={Hudson, Drew A and Zitnick, C. Lawrence},
  journal={arXiv preprint},
  year={2021}
}

Architecture overview

The GANsformer consists of two networks:

  • Generator: produces the images (x) given randomly sampled latents (z). The latent z has shape [batch_size, component_num, latent_dim], where component_num = 1 by default (Vanilla GAN, StyleGAN) but is > 1 for the GANsformer model. The latent components z_1,...,z_k are obtained by splitting z along the second dimension. The generator consists of two parts:

    • Mapping network: converts latents sampled from a normal distribution (z) into the intermediate space (w) through a series of feed-forward layers. The k latent components are either mapped independently from the z space to the w space, or interact with each other through self-attention (optional flag).
    • Synthesis network: the intermediate latents w are used to guide the generation of new images. Image features start from a small 4x4 constant/sampled grid and then go through multiple layers of convolution and up-sampling until reaching the desired resolution (e.g. 256x256). After each convolution, the image features are modulated (meaning that their variance and bias are controlled) by the intermediate latent vectors w. While in the StyleGAN model there is one global w vector that controls all the features equally, the GANsformer uses attention so that the k latent components specialize to control different regions in the image and create it cooperatively, and therefore performs better especially in generating images depicting multi-object scenes.
    • Attention can be used in several ways (see the illustrative sketch after this list):
      • Simplex Attention: attention is applied in one direction only, from the latents to the image features (top-down).
      • Duplex Attention: attention is applied in both directions: latents to image features (top-down) and then image features back to latents (bottom-up), so that each representation informs the other iteratively.
      • Self-Attention between latents: can also be used to enable direct interactions between the latents.
      • Self-Attention between image features (SAGAN model): prior approaches applied attention directly between the image features, but this method does not scale well due to its quadratic cost in the number of features, which becomes very high at high resolutions.
  • Discriminator: receives an image and has to predict whether it is real or fake, i.e. whether it originates from the dataset or the generator. The model performs multiple layers of convolution and downsampling on the image, gradually reducing the representation's resolution until making a final prediction. Optionally, attention can be incorporated into the discriminator as well: it then has multiple (k) aggregator variables that use attention to adaptively collect information from the image while it is being processed. We observe small improvements in model performance when attention is used in the discriminator, although based on our observations most of the gain from using attention arises in the generator.
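
To make the attention variants above concrete, here is a minimal illustrative sketch in PyTorch of bipartite attention between the k latent components and a flattened feature grid. It is not the repository's implementation; the class, tensor sizes, and residual-style updates below are assumptions made purely for illustration.

    # Illustrative sketch only (hypothetical names and sizes), not the repo's code.
    import torch
    import torch.nn as nn

    class BipartiteAttention(nn.Module):
        # Cross-attention: information flows from `context` into `queries`.
        def __init__(self, dim):
            super().__init__()
            self.to_q = nn.Linear(dim, dim)
            self.to_k = nn.Linear(dim, dim)
            self.to_v = nn.Linear(dim, dim)

        def forward(self, queries, context):
            q, k, v = self.to_q(queries), self.to_k(context), self.to_v(context)
            attn = torch.softmax(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)
            return attn @ v

    batch, num_latents, dim = 4, 16, 64
    latents = torch.randn(batch, num_latents, dim)   # k latent components
    features = torch.randn(batch, 32 * 32, dim)      # flattened 32x32 feature grid

    top_down = BipartiteAttention(dim)               # latents -> features
    bottom_up = BipartiteAttention(dim)              # features -> latents (duplex only)

    # Simplex attention: the image features attend over the latents (top-down only).
    features = features + top_down(features, latents)
    # Duplex attention: the latents are also updated from the features (bottom-up),
    # so that each representation informs the other iteratively.
    latents = latents + bottom_up(latents, features)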

Codebase

This codebase builds on top of and extends the great StyleGAN2 repository by Karras et al.
The GANsformer model can also be seen as a generalization of StyleGAN: while StyleGAN has one global latent vector that controls the style of all image features globally, the GANsformer has k latent vectors that cooperate through attention to control regions within the image, and thereby better models images of multi-object and compositional scenes.
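
As a rough illustration of this contrast, the following hedged sketch computes a per-pixel style as an attention-weighted mixture of the k latents and uses it to scale and shift the features. The to_scale/to_bias layers and all sizes are hypothetical; this is not the codebase's actual modulation code.

    # Hypothetical sketch of region-based modulation: each pixel's style is an
    # attention-weighted mixture of the k latent vectors, rather than one global
    # style vector as in StyleGAN.
    import torch
    import torch.nn as nn

    batch, num_latents, dim = 4, 16, 64
    features = torch.randn(batch, 32 * 32, dim)       # flattened feature grid
    latents = torch.randn(batch, num_latents, dim)    # k intermediate (w-space) latents
    to_scale = nn.Linear(dim, dim)                    # illustrative learned affine maps
    to_bias = nn.Linear(dim, dim)

    # Attention map: how strongly each pixel listens to each latent component.
    attn = torch.softmax(features @ latents.transpose(-2, -1) / dim ** 0.5, dim=-1)
    styles = attn @ latents                           # per-pixel style mixture

    # StyleGAN-like scale-and-shift modulation, applied per region instead of globally.
    features = features * (1 + to_scale(styles)) + to_bias(styles)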

More documentation and instructions will be coming soon!

Comments
  • Do you have any plans to export a pytorch version?

    Hi, I am not too familiar with tensorflow... If there are no such plans currently, do you have quick pointers to:

    1. the GANsformer model, especially where and how you deal with the latents (based on your paper, you split the latents?)
    2. what kind of optimizers are you using? And how did you implement it? Is it similar to what we do in NLP (warmup, etc.)?
    3. did you ever try using the standard feedforward after your duplex attention layer instead of 3x3? Did it still work?

    Thanks again for your kind attention! Best,

    opened by MultiPath 12
  • Some Errors On Training

    Thank you for your great work. I appreciate it a lot.

    I just tried to train a model with your code; however, there are lots of undefined variables used. For example:

    https://github.com/dorarad/gansformer/blob/148f72964219f8ead2621204bc5cfa89200b6879/training/network.py#L795

    It throws an undefined variable error for 'maps_in'. When I fix that with a constant, I get another error from

    https://github.com/dorarad/gansformer/blob/148f72964219f8ead2621204bc5cfa89200b6879/training/network.py#L811

    again, gen_mod and gen_cond are not defined. When I fix those with constants as well, I get another error which says:

    gansformer-main/gansformer-main/training/network.py", line 1127, in G_synthesis grid_poses = get_positional_embeddings(resolution_log2, pos_dim or dlatent_size, pos_type, pos_directions_num, init = pos_init, **_kwargs) TypeError: get_positional_embeddings() got an unexpected keyword argument 'label_size'

    Am I missing something, or is there a problem?

    opened by yilmazkorkmz 10
  • CLEVR pretrained model gives FID 22

    Hi, kudos for great work!

    I've just noticed that with the recommended preprocessing and evaluation, the metrics on gdrive:cityscapes work as expected (FID ~5.2), while for CLEVR exactly the same two lines:

    python prepare_data.py --clevr --max-images 100000
    python run_network.py --eval --gpus 0 --expname clevr-exp --dataset clevr --pretrained-pkl gdrive:clevr-snapshot.pkl
    

    give ~22 FID, not 9.2. Can you please double-check if the provided snapshot is correct? Or am I missing smth here?

    Thanks in advance!

    opened by JanRocketMan 8
  • kernel error in generate.py

    With Python 3.7, tensorflow-gpu 1.15.0, CUDA 10.0, and cuDNN 7.5, I get the error below in generate.py (which appeared to require cuDNN 7.6.5, which brings a different error; see second part). Any advice?

    ... Could not load dynamic library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file

    ........... Total 35894608

    Generate images... 0%| | 0/8 [00:01<?, ?image (1 batches of 8 images)/s] Traceback (most recent call last): File "/vulcanscratch/yaser/miniconda3/envs/yygentransformer/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call return fn(*args) File "/vulcanscratch/yaser/miniconda3/envs/yygentransformer/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn target_list, run_metadata) File "/vulcanscratch/yaser/miniconda3/envs/yygentransformer/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun run_metadata) tensorflow.python.framework.errors_impl.InvalidArgumentError: No OpKernel was registered to support Op 'FusedBiasAct' used by {{node Gs/_Run/Gs/G_mapping/AttLayer_0/FusedBiasAct}}with these attrs: [gain=1, T=DT_FLOAT, axis=1, alpha=0, grad=0, act=1] Registered devices: [CPU, XLA_CPU, XLA_GPU] Registered kernels: device='GPU'; T in [DT_HALF] device='GPU'; T in [DT_FLOAT]

         [[Gs/_Run/Gs/G_mapping/AttLayer_0/FusedBiasAct]]
    

    CUDNN7.6.5 error .... Total 35894608

    Generate images... 0%| | 0/8 [00:01<?, ?image (1 batches of 8 images)/s] Traceback (most recent call last): File "/vulcanscratch/yaser/miniconda3/envs/yygentransformer/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call return fn(*args) File "/vulcanscratch/yaser/miniconda3/envs/yygentransformer/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn target_list, run_metadata) File "/vulcanscratch/yaser/miniconda3/envs/yygentransformer/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun run_metadata) tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found. (0) Internal: cudaErrorNoKernelImageForDevice [[{{node Gs/_Run/Gs/G_mapping/global/Dense0_0/FusedBiasAct}}]] [[Gs/_Run/Gs/maps_out/_3151]] (1) Internal: cudaErrorNoKernelImageForDevice [[{{node Gs/_Run/Gs/G_mapping/global/Dense0_0/FusedBiasAct}}]] 0 successful operations. 0 derived errors ignored.

    opened by yaseryacoob 8
  • About the Duplex attention

    Hi, Thanks for sharing the code!

    I have a few questions about Section 3.1.2. Duplex attention.

    1. I am confused by the notation in this section. For example, it says "Y=(K^{P\times d}, V^{P\times d}), where the values store the content of the Y variables (e.g. the randomly sampled latents for the case of GAN)". Does it mean that V^{P\times d} is sampled from the original variable Y? How do you set the number P in your code?

    2. "keys track the centroids of the attention-based assignments from X to Y, which can be computed as K=a_b(Y, X)", does it mean K is calculated by using the self-attention module but with (Y, X) as input? If so, how to understand “the keys track the centroid of the attention-based assignments from X to Y”? BTW, how to get the centroids?

    3. For the update rule in duplex attention, what does the a() function mean? Does it denote a self-attention module like a_b() in Section 3.1.1, with X as queries, K as keys, and V as values? If so, since K is calculated from another self-attention module as mentioned in question 2, the update rule contains two self-attention operations. Is that right? Is that why it is called 'duplex' attention?

    4. But finally I think I may be wrong, given the last paragraph in this section: "to support bidirectional interaction between elements, we can chain two reciprocal simplex attentions from X to Y and from Y to X, obtaining the duplex attention". So does it mean that we first calculate Y using a simplex attention module u^a(Y, X), and then use this Y as input to u^d(X, Y) to update X? Does it mean the duplex attention module contains three self-attention operations?

    Thanks a lot! :)

    opened by AndrewChiyz 7
  • FID VQ-GAN

    Thank you for open-sourcing your code :)

    I was wondering about the generally very high FID values for the VQGAN. In the VQGAN paper, they report on, e.g., FFHQ 256x256 an FID of 11.4, whereas you report 63.1... Any idea why they are so different?

    Thanks!

    opened by xl-sr 7
  • PyTorch implementation generates same image samples

    Hi, I'm getting the same output image samples (see below) when I train the PyTorch implementation on FFHQ from scratch. The only changes I made (due to some memory issues mentioned in #33) were adding --batch-gpu 1 and removing the attention-map saving functionality (commenting out pytorch_version/training/visualize.py lines 167-206).

    python run_network.py --train --gpus 0 --batch-gpu 1 --ganformer-default --expname ffhq-scratch --dataset ffhq

    opened by kwhuang88228 6
  • Metrics PR Error

    Dear authors,

    Thank you for your wonderful contribution!!!

    When I tried to get precision and recall values during training by adding the option --metric pr, I got the following error:


    \precision_recall.py", line 179, in _evaluate feats = self._gen_feats(Gs, inception, minibatch_size, num_gpus, Gs_kwargs) NameError: name 'inception' is not defined

    So I have changed the lines in precision_recall.py. After the modification, it seems to work. I would greatly appreciate it if you could kindly review my modification.


    def _evaluate(self, Gs, Gs_kwargs, num_gpus, num_imgs, paths = None, **kwargs):

           if paths is not None: 
               # Extract features for local sample image files (paths)
    ----->  eval_features = self._paths_to_feats(paths, feat_func, minibatch_size, num_gpus, num_imgs)
           else:
               # Extract features for newly generated fake imgs
    ----->  eval_features = self._gen_feats(Gs, feature_net, minibatch_size, num_imgs, num_gpus, Gs_kwargs)
    
           # Compute precision and recall
           state = knn_precision_recall_features(ref_features = ref_features, eval_features = eval_features,
               feature_net = feature_net, nhood_sizes = [self.nhood_size], row_batch_size = self.row_batch_size,
    ----->  col_batch_size = self.row_batch_size, num_gpus = num_gpus, num_imgs = num_imgs)
           self._report_result(state.knn_precision[0], suffix = "_precision")
           self._report_result(state.knn_recall[0], suffix = "_recall")
    
    -------------------------------------------------------------------------
    
    opened by bwhwang 6
  • Memory issue when training 1024 resolution

    I'm trying to train on a 1024x1024 dataset on a V100 GPU. I tried both the tensorflow version and the pytorch version. Despite setting batch-gpu to 1, the tensorflow version always runs out of system RAM (after the first tick; total system RAM is 51 GB), and the pytorch version always runs out of CUDA memory (before the first tick).

    Here are my training settings:

    python run_network.py --train --metrics 'none' --gpus 0 --batch-gpu 1 --resolution 1024 \
     --ganformer-default --expname art1 --dataset 1024art
    

    Also, I always encounter the warning: tcmalloc: large alloc

    opened by BlueberryGin 5
  • Issues with docker

    Hi,

    I'm trying to dockerize using this image - tensorflow/tensorflow:1.14.0-gpu-py3.

    FROM tensorflow/tensorflow:1.14.0-gpu-py3
    
    ARG USER="test"
    ARG WORK_DIR="/home/$USER"
    
    WORKDIR $WORK_DIR
    
    RUN apt-get update && apt-get install build-essential
    
    RUN apt-get install ffmpeg libsm6 libxext6  -y
    
    RUN pip install --upgrade pip setuptools wheel
    
    COPY . ./
    
    RUN pip install -r requirements.txt
    
    RUN python generate.py --gpus 0 --model gdrive:bedrooms-snapshot.pkl --output-dir images --images-num 4
    

    However, I am getting this error:

    Downloading https://drive.google.com/uc?id=1-2L3iCBpP_cf6T2onf3zEQJFAAzxsQne .... done
    
    2021-04-06 08:32:44 UTC -- Setting up TensorFlow plugin 'upfirdn_2d.cu': Preprocessing... Compiling... Loading... bin_file:  /home/test/dnnlib/tflib/_cudacache/upfirdn_2d_1.14_.so
    
    2021-04-06 08:32:44 UTC -- Failed!
    
    2021-04-06 08:32:44 UTC -- Traceback (most recent call last):
    
    2021-04-06 08:32:44 UTC --   File "generate.py", line 49, in <module>
    
    2021-04-06 08:32:44 UTC --     main()
    
    2021-04-06 08:32:44 UTC --   File "generate.py", line 46, in main
    
    2021-04-06 08:32:44 UTC --     run(**vars(args))
    
    2021-04-06 08:32:44 UTC --   File "generate.py", line 22, in run
    
    2021-04-06 08:32:44 UTC --     G, D, Gs = load_networks(model)                             # Load pre-trained network
    
    2021-04-06 08:32:44 UTC --   File "/home/test/pretrained_networks.py", line 30, in load_networks
    
    2021-04-06 08:32:44 UTC --     G, D, Gs = pickle.load(stream, encoding = "latin1")[:3]
    
    2021-04-06 08:32:44 UTC --   File "/home/test/dnnlib/tflib/network.py", line 306, in __setstate__
    
    2021-04-06 08:32:44 UTC --     self._init_graph()
    
    2021-04-06 08:32:44 UTC --   File "/home/test/dnnlib/tflib/network.py", line 159, in _init_graph
    
    2021-04-06 08:32:44 UTC --     out_expr = self._build_func(*self.input_templates, **build_kwargs)
    
    2021-04-06 08:32:44 UTC --   File "<string>", line 2371, in G_synthesis_stylegan2
    
    2021-04-06 08:32:44 UTC --   File "/home/test/dnnlib/tflib/ops/upfirdn_2d.py", line 229, in downsample_2d
    
    2021-04-06 08:32:44 UTC --     return _simple_upfirdn_2d(x, k, down=factor, pad0=(p+1)//2, pad1=p//2, data_format=data_format, impl=impl)
    
    2021-04-06 08:32:44 UTC --   File "/home/test/dnnlib/tflib/ops/upfirdn_2d.py", line 358, in _simple_upfirdn_2d
    
    2021-04-06 08:32:44 UTC --     y = upfirdn_2d(y, k, upx=up, upy=up, downx=down, downy=down, padx0=pad0, padx1=pad1, pady0=pad0, pady1=pad1, impl=impl)
    
    2021-04-06 08:32:44 UTC --   File "/home/test/dnnlib/tflib/ops/upfirdn_2d.py", line 61, in upfirdn_2d
    
    2021-04-06 08:32:44 UTC --     return impl_dict[impl](x=x, k=k, upx=upx, upy=upy, downx=downx, downy=downy, padx0=padx0, padx1=padx1, pady0=pady0, pady1=pady1)
    
    2021-04-06 08:32:44 UTC --   File "/home/test/dnnlib/tflib/ops/upfirdn_2d.py", line 139, in _upfirdn_2d_cuda
    
    2021-04-06 08:32:44 UTC --     return func(x)
    
    2021-04-06 08:32:44 UTC --   File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/custom_gradient.py", line 162, in decorated
    
    2021-04-06 08:32:44 UTC --     return _graph_mode_decorator(f, *args, **kwargs)
    
    2021-04-06 08:32:44 UTC --   File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/custom_gradient.py", line 183, in _graph_mode_decorator
    
    2021-04-06 08:32:44 UTC --     result, grad_fn = f(*args)
    
    2021-04-06 08:32:44 UTC --   File "/home/test/dnnlib/tflib/ops/upfirdn_2d.py", line 131, in func
    
    2021-04-06 08:32:44 UTC --     y = _get_plugin().up_fir_dn2d(x=x, k=kc, upx=upx, upy=upy, downx=downx, downy=downy, padx0=padx0, padx1=padx1, pady0=pady0, pady1=pady1)
    
    2021-04-06 08:32:44 UTC --   File "/home/test/dnnlib/tflib/ops/upfirdn_2d.py", line 14, in _get_plugin
    
    2021-04-06 08:32:44 UTC --     return custom_ops.get_plugin(os.path.splitext(__file__)[0] + '.cu')
    
    2021-04-06 08:32:44 UTC --   File "/home/test/dnnlib/tflib/custom_ops.py", line 162, in get_plugin
    
    2021-04-06 08:32:44 UTC --     plugin = tf.load_op_library(bin_file)
    
    2021-04-06 08:32:44 UTC --   File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/load_library.py", line 61, in load_op_library
    
    2021-04-06 08:32:44 UTC --     lib_handle = py_tf.TF_LoadLibrary(library_filename)
    
    2021-04-06 08:32:44 UTC -- tensorflow.python.framework.errors_impl.NotFoundError: /home/test/dnnlib/tflib/_cudacache/upfirdn_2d_1.14_.so: undefined symbol: _ZN10tensorflow12OpDefBuilder4AttrENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
    
    2021-04-06 08:32:44 UTC -- error building image: error building stage: failed to execute command: waiting for process to exit: exit status 1
    

    Please help to check and advise. Thanks!

    opened by arsyad-ah 5
  • Cannot utilize multiple CPU cores

    Hi-

    Thank you for making such a fascinating project available here!

    I'm trying to run ganformer within a conda environment, but am having problems getting ganformer to utilize multiple CPU cores.

    Using Ubuntu 20.04. Here is the setup for the conda environment used:

    conda create --name cuda10 python=3.7
    conda activate cuda10
    conda install tensorflow-gpu=1.14
    conda install pillow h5py requests tqdm termcolor seaborn
    pip install opencv-python lmdb gdown easydict
    

    To run it

    python gansformer/run_network.py --train --pretrained-pkl None --gpus 0,1 --ganformer-default --expname myDS_256 --dataset myDS --data-dir /data/myDS_256_tf --keep-samples --metrics none --result-dir training_runs/256_c1/ --num-threads 24 --minibatch-size 16
    

    Everything seems to be running correctly; there are no errors or crashes. The only problems are slow training initialization and low GPU utilization during training. System Monitor shows that only one CPU core is used at a time, so I'm guessing this is the cause of both issues. Do you have any ideas of what might be causing the restriction to a single CPU core?

    I always try to avoid raising an issue when something obvious might be wrong on my end, but this is my first time using conda so it might be that I'm simply using it incorrectly, or that I'm using your program incorrectly. I appreciate your patience if that is the case.

    Thank you for your attention to this issue!

    opened by abstractdonut 4
  • question on duplex attention (k means) code

    First, thank you for this amazing work!

    I suspect that an indentation is missing at the following position in the code:

    https://github.com/dorarad/gansformer/blob/3a9efa4545be25604b70560b7f491ec3633c14a3/pytorch_version/training/networks.py#L784

    The reason it raises my suspicion is that, if the code is executed as it is, it seems like the actual key values (to_tensor) are never involved in the computation of the attention scores when k-means is enabled. If I am mistaken, would you mind explaining why line 787 replaces the original attention scores with the values computed here (where the embedding "to_centroids" seems to be initialized as a mapping of the queries)?

    opened by nintendops 0
  • Training won't work, needs tensorflow.contrib which was removed in TF 2.x

    When running python3 run_network.py --train --ganformer-default --expname test --dataset plant --eval-images-num 10000, the following error appears:

    I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F AVX512_VNNI FMA To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags. 2022-10-11 14:56:30.661744: I tensorflow/core/util/util.cc:169] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0. 2022-10-11 14:56:30.690985: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered 2022-10-11 14:56:31.202500: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.7/lib64 2022-10-11 14:56:31.202557: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.7/lib64 2022-10-11 14:56:31.202565: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly. Traceback (most recent call last): File "/home/ali/gansformer/run_network.py", line 15, in import pretrained_networks File "/home/ali/gansformer/pretrained_networks.py", line 4, in import dnnlib.tflib as tflib File "/home/ali/gansformer/dnnlib/tflib/init.py", line 1, in from . import autosummary File "/home/ali/gansformer/dnnlib/tflib/autosummary.py", line 23, in from . import tfutil File "/home/ali/gansformer/dnnlib/tflib/tfutil.py", line 9, in import tensorflow.contrib # requires TensorFlow 1.x! ModuleNotFoundError: No module named 'tensorflow.contrib'

    opened by AliMezher18 0
  • Hosting models on Hugging Face

    Hello! Thank you for open-sourcing this work, this is amazing 😊 I was wondering if you'd be interested in mirroring the pretrained model weights over on the Hugging Face model hub. I'm sure our community would love to see your work, and (among other things) hosting checkpoints on the Hub helps a lot with discoverability. We've got a guide here on how to upload models, but I'm also happy to help out with it if you'd like!

    opened by NimaBoscarino 0
  • Ganformer2

    Thanks for your brilliant work on ganformer and ganformer2! May I ask if there is a rough timeline for when the ganformer2 model will be released? Thanks for your time!

    opened by yangkang98 0
Releases (v1.5.2)
  • v1.5.2 (Feb 2, 2022)

    Official implementation of the Generative Adversarial Transformers paper, in both PyTorch and TensorFlow, for image and compositional scene generation. The codebase supports training, evaluation, image sampling, and a variety of visualizations.

    Updates for version 1.5.2 (Feb 22, 2022): We updated the weight initialization of the PyTorch version to the intended scale, leading to a substantial improvement in the model's learning speed.

    Source code(tar.gz)
    Source code(zip)
  • v1.0 (Mar 17, 2021)

    Official implementation of the Generative Adversarial Transformers paper for image and compositional scene generation. The codebase supports training, evaluation, image sampling, and a variety of visualizations.

    Source code(tar.gz)
    Source code(zip)
Owner
Drew Arad Hudson