Generative Adversarial Text-to-Image Synthesis

Last update: Dec 31, 2022


###Generative Adversarial Text-to-Image Synthesis

Scott Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, Honglak Lee

This is the code for our ICML 2016 paper on text-to-image synthesis using conditional GANs. You can use it to train and sample from text-to-image models. The code is adapted from the excellent dcgan.torch.

####Setup Instructions

You will need to install Torch, CuDNN, and the display package.

####How to train a text-to-image model:

1. Download the birds, flowers, and COCO caption data in Torch format.
2. Download the birds, flowers, and COCO image data.
3. Download the text encoders for the birds, flowers, and COCO descriptions.
4. Modify the CONFIG file to point to your data and text encoder paths.
5. Run one of the training scripts, e.g. ./scripts/train_cub.sh
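
Before launching a long training run, it can help to confirm that the paths you put in CONFIG actually exist. The following is a minimal sketch of such a pre-flight check, not part of the released code: `check_paths` is a hypothetical helper, and the directories in the usage comment are placeholders for whatever locations you chose.

```shell
# Pre-flight check (hypothetical helper, not part of this repository).
# Prints every path that does not exist and returns 1 if any is missing.
check_paths() {
  missing=0
  for p in "$@"; do
    if [ ! -e "$p" ]; then
      echo "missing: $p" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example usage from the repository root. The data paths are placeholders;
# substitute the ones you set in CONFIG.
# check_paths "$HOME/data/captions" "$HOME/data/images" && ./scripts/train_cub.sh
```

Chaining the check with `&&` means the training script only starts when every listed path is present.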

####How to generate samples:

- For flowers: ./scripts/demo_flowers.sh. Add text descriptions to scripts/flowers_queries.txt.
- For birds: ./scripts/demo_cub.sh.
- For COCO (more general images): ./scripts/demo_coco.sh.

An HTML file will be generated with the results.
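
As a rough illustration of the flowers workflow, the sketch below appends a description to a queries file and then runs the demo. `add_query` is a hypothetical helper, and the one-description-per-line layout of scripts/flowers_queries.txt is an assumption; check the existing file before relying on it.

```shell
# Append one text description to a queries file.
# Hypothetical helper; assumes the file holds one description per line.
add_query() {
  printf '%s\n' "$2" >> "$1"
}

# Example usage from the repository root:
# add_query scripts/flowers_queries.txt "a flower with bright yellow petals"
# ./scripts/demo_flowers.sh
```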

####Pretrained models:

####How to train a text encoder from scratch:

You may want to do this if you have your own dataset of text descriptions.

- For flowers and birds: follow the instructions here.
- For MS-COCO: ./scripts/train_coco_txt.sh.

####Citation

If you find this useful, please cite our work as follows:

@inproceedings{reed2016generative,
  title={Generative Adversarial Text-to-Image Synthesis},
  author={Scott Reed and Zeynep Akata and Xinchen Yan and Lajanugen Logeswaran and Bernt Schiele and Honglak Lee},
  booktitle={Proceedings of The 33rd International Conference on Machine Learning},
  year={2016}
}
