Zero-shot Synthesis with Group-Supervised Learning (ICLR 2021 paper)

Overview

GSL - Zero-shot Synthesis with Group-Supervised Learning

Figure: Zero-shot synthesis performance of our method on different datasets (iLab-20M, RaFD, and Fonts). Bottom: training images (attributes are known). Top: test images (attributes form the query).

Zero-shot Synthesis with Group-Supervised Learning
Yunhao Ge, Sami Abu-El-Haija, Gan Xin, Laurent Itti
International Conference on Learning Representations (ICLR), 2021

[Paper] [Project Page] [Fonts dataset]

To aid neural networks in envisioning objects with different attributes, we propose a family of objective functions, expressed on groups of examples, as a novel learning framework that we term Group-Supervised Learning (GSL). GSL allows us to decompose inputs into a disentangled representation with swappable components that can be recombined to synthesize new samples (e.g., images of red boats and blue cars can be decomposed and recombined to synthesize novel images of red cars).
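The core mechanic can be sketched in a few lines of PyTorch. The class below is an illustrative toy, not the paper's actual architecture: it only shows how a latent code split into per-attribute chunks supports swapping one chunk between two images before decoding.

import torch
import torch.nn as nn

class SwappableAutoEncoder(nn.Module):
    """Toy auto-encoder whose latent code is split into per-attribute chunks."""
    def __init__(self, latent_dim=300, n_attrs=3):
        super().__init__()
        self.chunk = latent_dim // n_attrs  # size of one attribute chunk
        self.encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 128 * 128, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 3 * 128 * 128), nn.Sigmoid())

    def swap_synthesize(self, x_a, x_b, attr_idx):
        """Take attribute `attr_idx` from x_b and everything else from x_a."""
        z_a, z_b = self.encoder(x_a), self.encoder(x_b)
        s, e = attr_idx * self.chunk, (attr_idx + 1) * self.chunk
        z_new = z_a.clone()
        z_new[:, s:e] = z_b[:, s:e]  # swap exactly one attribute chunk
        return self.decoder(z_new).view(-1, 3, 128, 128)

# e.g. x_a = red boat, x_b = blue car: swapping the identity chunk yields a red car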

[We are actively updating the code]

Getting Started

Installation

  • Dependencies
python 3.6.4
pytorch 0.3.1.post2
visdom
tqdm

  • Clone this repo:
git clone https://github.com/gyhandy/Group-Supervised-Learning
cd Group-Supervised-Learning

Datasets

  • iLab-20M is an attributed dataset containing images of toy vehicles placed on a turntable and photographed by 11 cameras at different viewpoints. There are 3 attribute classes: vehicle identity (15 categories, each having 25-160 instances), pose, and background (over 14 backgrounds per identity, projecting vehicles into relevant contexts). You can download the subset of iLab-20M used in our paper here: iLab-6pose [http://ilab.usc.edu/datasets/iLab_6pose.zip]

  • Fonts is a computer-generated RGB image dataset. Each image, of size 128 x 128 pixels, contains an alphabet letter rendered using 5 independent generating attributes: letter identity, size, font color, background color, and font (see the rendering sketch after this list). You can download the Fonts dataset here: Fonts [http://ilab.usc.edu/datasets/fonts].

  • RaFD contains pictures of 67 models displaying 8 emotional expressions, taken from 5 different camera angles simultaneously. There are 3 attributes: identity, camera position (pose), and expression. To download the RaFD dataset, you must request access from the Radboud Faces Database website.

  • dSprites is a dataset of 2D shapes procedurally generated from 6 independent ground-truth latent factors: color, shape, scale, rotation, and the x and y positions of a sprite. You can download the dSprites dataset here: dSprites [https://github.com/deepmind/dsprites-dataset]
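To make the Fonts attributes concrete, here is a hedged Pillow sketch of how a Fonts-style sample is fully determined by its five generating attributes. The .ttf path and default values are placeholders for illustration, not the dataset's actual generation code.

from PIL import Image, ImageDraw, ImageFont

def render_letter(letter="A", size=80, font_color=(255, 0, 0),
                  back_color=(0, 0, 255),
                  font_path="/usr/share/fonts/DejaVuSans.ttf"):  # placeholder path
    img = Image.new("RGB", (128, 128), back_color)  # background-color attribute
    font = ImageFont.truetype(font_path, size)      # font and size attributes
    ImageDraw.Draw(img).text((64, 64), letter, fill=font_color,  # letter + font color
                             font=font, anchor="mm")
    return img

render_letter().save("sample.png")  # one point in the 5-attribute product space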

Dataset Preprocessing

To efficiently access the datasets in the manner required by Group-Supervised Learning, some datasets need preprocessing.

  • For the iLab-20M dataset, after downloading the iLab-6pose subset, please run python3 ./utils/ilab_data_preprocess.py
  • For the RaFD dataset, after downloading, please run python3 ./utils/rafd_data_preprocess.py
  • For the dSprites dataset, after downloading, please run python3 ./utils/desprites_data_preprocess.py
  • For the Fonts dataset, no preprocessing is needed.

After preprocessing, please update the dataset path in the '--dataset_path' parameter.
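As a hedged illustration of why preprocessing matters: Group-Supervised Learning needs to quickly sample pairs of images that agree on one attribute value, which is easy once an index from attribute values to file paths exists. The filename layout below is assumed for illustration, not the exact output of the preprocessing scripts.

import os
from collections import defaultdict

def build_attribute_index(root):
    """Map (attribute index, value) -> image paths, assuming filenames
    encode attribute values joined by '-', e.g. car-pose3-bg7.jpg."""
    index = defaultdict(list)
    for name in os.listdir(root):
        tokens = os.path.splitext(name)[0].split("-")
        for a, value in enumerate(tokens):
            index[(a, value)].append(os.path.join(root, name))
    return index  # index[(0, "car")] lists every image with identity "car"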

Synthesis with pretrained model

You can download the pretrained models for iLab-20M, Fonts, RaFD, and dSprites here: pretrained models (http://ilab.usc.edu/datasets/GSL_pretrained_models.zip). Put them in ./checkpoints/pretrained_models. The sample test images are in ./checkpoints/test_imgs. You can use the following sample commands to synthesize zero-shot images with our pretrained models:

  • For Fonts
python3 main.py --train False --dataset Fonts --pretrain_model_path YOUR_LOCAL_PATH_OF_PRETRAINED_MODEL --test_img_path './checkpoints/test_imgs/fonts' --viz_name fonts
  • For iLab-20M
python3 main.py --train False --dataset ilab-20M --pretrain_model_path YOUR_LOCAL_PATH_OF_PRETRAINED_MODEL --test_img_path './checkpoints/test_imgs/ilab_20M' --viz_name ilab-20m
  • For RaFD
python3 main.py --train False --dataset RaFD --pretrain_model_path YOUR_LOCAL_PATH_OF_PRETRAINED_MODEL --test_img_path './checkpoints/test_imgs/rafd' --viz_name rafd
  • For dsprites
python3 main.py --train False --dataset dsprites --pretrain_model_path YOUR_LOCAL_PATH_OF_PRETRAINED_MODEL --test_img_path './checkpoints/test_imgs/dsprites' --viz_name dsprites
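Conceptually, each synthesis command above recombines latent chunks from several test images, one per attribute, and decodes the result. The following hedged sketch reuses the illustrative SwappableAutoEncoder from the overview and is not the repo's exact inference API:

import torch

model = SwappableAutoEncoder().eval()  # illustrative class from the overview above
# dummy stand-ins for test images such as those under ./checkpoints/test_imgs
x_identity = torch.rand(1, 3, 128, 128)    # supplies the identity chunk
x_pose = torch.rand(1, 3, 128, 128)        # supplies the pose chunk
x_background = torch.rand(1, 3, 128, 128)  # supplies the background chunk

with torch.no_grad():
    chunks = [model.encoder(x)[:, i * model.chunk:(i + 1) * model.chunk]
              for i, x in enumerate([x_identity, x_pose, x_background])]
    z_query = torch.cat(chunks, dim=1)  # one latent chunk per attribute
    synthesized = model.decoder(z_query).view(-1, 3, 128, 128)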

Train GZS-Net on datasets used in paper

Group-Supervised Zero-shot Synthesis Network (GZS-Net) is an implementation of Group-Supervised Learning with only a reconstruction loss. If you want to train GZS-Net on the 4 datasets used in the paper (Fonts, iLab-20M, RaFD, dSprites), please use 'train.py' with the dataset name, dataset path, and visualization panel name in Visdom. Note: you can also set hyperparameters such as the learning rate, batch size, and backbone structure in train.py. Here are some examples:

  • For Fonts
python3 main.py --train True --dataset Fonts --dataset_path YOUR_LOCAL_PATH_OF_FONTS --viz_name fonts
  • For iLab-20M
python3 main.py --train True --dataset ilab-20M --dataset_path YOUR_LOCAL_PATH_OF_ILAB --viz_name ilab-20m
  • For RaFD
python3 main.py --train True --dataset RaFD --dataset_path YOUR_LOCAL_PATH_OF_RaFD --viz_name rafd
  • For dsprites
python3 main.py --train True --dataset dsprites --dataset_path YOUR_LOCAL_PATH_OF_DSPRITES --viz_name dsprites
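The training signal behind these commands can be sketched as plain reconstruction plus swap reconstruction: if x1 and x2 share the value of attribute a, swapping chunk a between their latent codes must still reconstruct each original image. The snippet below is a minimal sketch built on the illustrative SwappableAutoEncoder above, not the repo's exact losses or hyperparameters:

import torch
import torch.nn.functional as F

model = SwappableAutoEncoder()  # illustrative class from the overview above
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(x1, x2, a):
    """x1, x2: image batches that share the value of attribute `a`."""
    z1, z2 = model.encoder(x1), model.encoder(x2)
    s, e = a * model.chunk, (a + 1) * model.chunk
    z1_swap, z2_swap = z1.clone(), z2.clone()
    z1_swap[:, s:e], z2_swap[:, s:e] = z2[:, s:e], z1[:, s:e]  # one-overlap swap
    recon = lambda z: model.decoder(z).view_as(x1)
    loss = (F.mse_loss(recon(z1), x1) + F.mse_loss(recon(z2), x2)               # self
            + F.mse_loss(recon(z1_swap), x1) + F.mse_loss(recon(z2_swap), x2))  # swap
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()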

Train GZS-Net on your own dataset

To use our GZS-Net on your own dataset, before training, please refer to the admissible dataset description in our paper. Note: the high-level training strategy for the 4 datasets used in the paper (Fonts, iLab-20M, RaFD, dSprites) is shown in Figure 3 of our paper. However, to make our method more general and compatible with more datasets, we propose an easier way to train GZS-Net, which we call the 'sample edge strategy', to achieve 'One-Overlap Attribute Swap': in each training step, we sample n different edges (each edge corresponding to a specific attribute), and we relax the two requirements of edge sampling (see the sketch below):

(1) The two samples connected by an edge with attribute A should have the same attribute-A value, but do not need to have different values for the other attributes (e.g., their attribute-B and attribute-C values can be the same).
(2) We do not require a center image x to appear in all edges, which means the images connected by different edges can be entirely different.
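A hedged sketch of this relaxed edge sampling, where sample_pair_sharing is a hypothetical helper (not part of the repo's documented API) that returns two samples agreeing only on attribute a's value:

import random

def sample_edges(dataset, n_attrs, n_edges):
    """Draw n_edges independent attribute-sharing pairs; no center image needed."""
    edges = []
    for _ in range(n_edges):
        a = random.randrange(n_attrs)            # which attribute this edge swaps
        x1, x2 = dataset.sample_pair_sharing(a)  # same value of attribute a only
        edges.append((x1, x2, a))
    return edges

# Each (x1, x2, a) edge then feeds a swap-reconstruction step like the one above.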

We train iLab-20M with this new training strategy, and you can adapt our ilab_20M_custom example code to your own dataset.

  • Take the ilab_20M_custom dataset as an example:
python3 train.py  --dataset ilab_20M_custom --dataset_path YOUR_LOCAL_PATH_OF_CUSTOM_DATASET --viz_name ilab_20M_custom

Citation

If you use this code for your research, please cite our paper.

@inproceedings{ge2021zeroshot,
  title={Zero-shot Synthesis with Group-Supervised Learning},
  author={Yunhao Ge and Sami Abu-El-Haija and Gan Xin and Laurent Itti},
  booktitle={International Conference on Learning Representations},
  year={2021},
  url={https://openreview.net/forum?id=8wqCDnBmnrT}
}

Acknowledgments

Our code is inspired by Beta-VAE.
