3D Generative Adversarial Network

Overview

Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling

This repository contains pre-trained models and sampling code for the 3D Generative Adversarial Network (3D-GAN) presented at NIPS 2016.

http://3dgan.csail.mit.edu

Prerequisites

Torch

We use Torch 7 (http://torch.ch) for our implementation with these additional packages:

Visualization

  • Basic visualization: MATLAB (tested on R2016b)
  • Advanced visualization: Python 2.7 with the packages numpy, matplotlib, scipy, and vtk (version 5.10.1)

Note: advanced visualization requires vtk version 5.10.1 exactly, not a later version. It is available in the package lists of common Python distributions such as Anaconda.

Installation

Our current release has been tested on Ubuntu 14.04.

Cloning the repository

git clone git@github.com:zck119/3dgan-release.git
cd 3dgan-release

Downloading pretrained models

For CPU (947 MB):

./download_models_cpu.sh

For GPU (618 MB):

./download_models_gpu.sh

Downloading latent vector inputs for demo

./download_demo_inputs.sh

Guide

Synthesizing shapes (main.lua)

We show how to synthesize shapes with our pre-trained models. The file (main.lua) has the following options.

  • -gpu ID: GPU ID (starting from 1). Set to 0 to use CPU only.
  • -class CLASSNAME: synthesize shapes for the class CLASSNAME. We currently support five classes (car, chair, desk, gun, and sofa). Use all to generate shapes for each class.
  • -sample: if set, sample input latent vectors from an i.i.d. uniform distribution; otherwise, generate shapes from the demo vectors loaded from ./demo_inputs/CLASSNAME.mat
  • -bs BATCH_SIZE: use batch size of BATCH_SIZE during network forwarding
  • -ss SAMPLE_SIZE: set the number of generated shapes to SAMPLE_SIZE. This option is only available in -sample mode.

Example usages:

  • Synthesize chairs with pre-sampled demo inputs and a CPU
th main.lua -gpu 0 -class chair 
  • Randomly sample 150 desks with GPU 1 and a batch size of 50
th main.lua -gpu 1 -class desk -bs 50 -sample -ss 150 
  • Randomly sample 150 shapes of each category with GPU 1 and a batch size of 50
th main.lua -gpu 1 -class all -bs 50 -sample -ss 150 
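
To make the -sample, -ss, and -bs options concrete, here is a minimal Python sketch of drawing i.i.d. uniform latent vectors and splitting them into batches. It is an illustration, not the code in main.lua: the latent dimension of 200, the [0, 1) range, and the helper names sample_latents and iter_batches are assumptions.

```python
import numpy as np

LATENT_DIM = 200  # assumption: check the vectors in ./demo_inputs for the real size


def sample_latents(sample_size, latent_dim=LATENT_DIM, seed=None):
    """Draw sample_size latent vectors with i.i.d. uniform entries in [0, 1)."""
    rng = np.random.default_rng(seed)
    return rng.uniform(0.0, 1.0, size=(sample_size, latent_dim))


def iter_batches(latents, batch_size):
    """Yield consecutive slices of at most batch_size rows (the last may be shorter)."""
    for start in range(0, len(latents), batch_size):
        yield latents[start:start + batch_size]


# Mirrors "-sample -ss 150 -bs 50": 150 vectors forwarded in batches of 50.
z = sample_latents(150, seed=0)
batch_shapes = [batch.shape for batch in iter_batches(z, 50)]
print(batch_shapes)
```

A sample size that is not a multiple of the batch size simply produces a shorter final batch in this sketch.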

The output is saved under the folder ./output, as class_name_demo.mat for shapes generated from predetermined demo inputs (z in our paper) and class_name_sample.mat for randomly sampled 3D shapes. The variable inputs in the .mat file corresponds to the input latent representation, and the variable voxels corresponds to the 3D shapes generated by our network.
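
For readers who want to inspect the output outside MATLAB, the following Python sketch reads the two variables with scipy.io.loadmat. It assumes the file is in a format that loadmat can read (MATLAB v7.3 files are HDF5 and need a different reader). To stay runnable without the real output, it first writes a stand-in file; the array shapes used there (4 shapes, 200-dimensional inputs, one 64 x 64 x 64 grid per shape) and the helper name load_shapes are assumptions.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat


def load_shapes(path):
    """Return the (inputs, voxels) arrays stored in an output .mat file."""
    data = loadmat(path)
    return data["inputs"], data["voxels"]


# Stand-in for ./output/chair_demo.mat so the sketch runs on its own.
# The shapes below are assumptions, not a statement about the real files.
tmp_path = os.path.join(tempfile.mkdtemp(), "chair_demo.mat")
savemat(tmp_path, {
    "inputs": np.zeros((4, 200), dtype=np.float32),
    "voxels": np.zeros((4, 64, 64, 64), dtype=np.float32),
})

inputs, voxels = load_shapes(tmp_path)
print(inputs.shape, voxels.shape)
```

With the real files, replace tmp_path with a path such as ./output/chair_demo.mat and check the printed shapes before indexing into the arrays.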

Visualization

We offer two ways of visualizing results, one in MATLAB and the other in Python. We used the Python visualization in our paper. The MATLAB visualization is easier to install and run, but its output is of lower quality than the Python output.

MATLAB: Please use the function visualization/matlab/visualize.m for visualization. The MATLAB code allows users to either display rendered objects or save them as images. The script also supports downsampling and thresholding for faster rendering. The color of voxels represents the confidence value.

Options include

  • inputfile: the .mat file that saves the voxel matrices
  • indices: the indices of objects in the inputfile that should be rendered. The default value is 0, which stands for rendering all objects.
  • step (s): downsample objects via a max pooling of step s for efficiency. The default value is 4 (64 x 64 x 64 -> 16 x 16 x 16).
  • threshold: voxels with confidence lower than the threshold are not displayed
  • outputprefix:
    • when not specified, MATLAB shows figures directly.
    • when specified, MATLAB stores rendered images of objects at outputprefix_%i.bmp, where %i is the index of the object
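
The step and threshold options can be illustrated with a short Python sketch. It is not the code in visualize.m: it only shows, under the assumption of a cubic grid whose side is divisible by the step, how max pooling with step 4 turns a 64 x 64 x 64 grid into 16 x 16 x 16 and how a threshold hides low-confidence voxels. The function names are hypothetical.

```python
import numpy as np


def max_pool_3d(voxels, step):
    """Downsample a 3D grid by taking the max over step x step x step blocks."""
    d, h, w = voxels.shape
    if d % step or h % step or w % step:
        raise ValueError("each dimension must be divisible by step")
    blocks = voxels.reshape(d // step, step, h // step, step, w // step, step)
    return blocks.max(axis=(1, 3, 5))


def visible_mask(voxels, threshold):
    """Mark voxels to display: confidence lower than threshold is hidden."""
    return voxels >= threshold


grid = np.zeros((64, 64, 64), dtype=np.float32)
grid[10, 20, 30] = 0.8   # one confident voxel
grid[40, 40, 40] = 0.05  # one voxel below the threshold

small = max_pool_3d(grid, step=4)
mask = visible_mask(small, threshold=0.1)
print(small.shape, int(mask.sum()))
```

Max pooling keeps a block visible if any voxel inside it is confident, so thin structures survive downsampling better than they would under averaging.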

Usage (after running th main.lua -gpu 0 -class chair, in MATLAB, in folder visualization/matlab):

visualize('../../output/chair_demo.mat', 0, 2, 0.1, 'chair')

The visualization might take a while. With the output prefix chair, the renderings are stored as images such as chair_1.bmp, chair_3.bmp, chair_4.bmp, and chair_5.bmp.

Python: Options for the Python visualization include

  • -t THRESHOLD: voxels with confidence lower than the threshold are not displayed. The default value is 0.1.
  • -i ID: the index of the object in the inputfile that should be rendered (one-based). The default value is 1.
  • -df STEPSIZE: downsample objects via a max pooling of step STEPSIZE for efficiency. Currently supporting STEPSIZE 1, 2, and 4. The default value is 1 (i.e. no downsampling).
  • -dm METHOD: downsample method, where mean stands for average pooling and max for max pooling. The default is max pooling.
  • -u BLOCK_SIZE: set the size of the voxels to BLOCK_SIZE. The default value is 0.9.
  • -cm: if set, use a colormap to represent voxel occupancy; otherwise, use a uniform color
  • -mc DISTANCE: whether to keep only the maximal connected component, where voxels of distance no larger than DISTANCE are considered connected. Set to 0 to disable this function. The default value is 3.

Usage:

python visualize.py chair_demo.mat -u 0.9 -t 0.1 -i 1 -mc 2
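
The idea behind the -mc option can be sketched in plain Python. This is an illustration rather than the code in visualize.py: it assumes that "distance" means the largest per-axis offset between two voxels (Chebyshev distance), takes the already-thresholded voxels as integer coordinates, and leaves the "0 disables the filter" case to the caller. The function name largest_component is hypothetical.

```python
from collections import deque
from itertools import product


def largest_component(occupied, distance):
    """Return the largest set of voxels linked by hops of at most `distance`.

    occupied: iterable of (x, y, z) integer coordinates of displayed voxels.
    distance: maximum per-axis offset between connected voxels (an assumption).
    """
    remaining = set(occupied)
    offsets = [o for o in product(range(-distance, distance + 1), repeat=3)
               if o != (0, 0, 0)]
    best = set()
    while remaining:
        start = remaining.pop()
        component = {start}
        queue = deque([start])
        while queue:  # breadth-first search from the starting voxel
            x, y, z = queue.popleft()
            for dx, dy, dz in offsets:
                neighbor = (x + dx, y + dy, z + dz)
                if neighbor in remaining:
                    remaining.remove(neighbor)
                    component.add(neighbor)
                    queue.append(neighbor)
        if len(component) > len(best):
            best = component
    return best


voxels = [(0, 0, 0), (0, 0, 2), (0, 2, 2), (10, 10, 10)]
kept = largest_component(voxels, distance=2)
print(sorted(kept))
```

In this example the first three voxels are chained by hops of length 2 and are kept, while the isolated voxel at (10, 10, 10) is dropped.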

Reference

@inproceedings{3dgan,
  title={{Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling}},
  author={Wu, Jiajun and Zhang, Chengkai and Xue, Tianfan and Freeman, William T and Tenenbaum, Joshua B},
  booktitle={Advances In Neural Information Processing Systems},
  pages={82--90},
  year={2016}
}

For any questions, please contact Jiajun Wu and Chengkai Zhang.
