Official code for "End-to-End Optimization of Scene Layout" -- including VAE, Diff Render, SPADE for colorization (CVPR 2020 Oral)

Overview

End-to-End Optimization of Scene Layout

Teaser Image Code release for: End-to-End Optimization of Scene Layout CVPR 2020 (Oral)

Project site, Bibtex

For help contact afluo [a.t] andrew.cmu.edu or open an issue

  • Requirements

    • Pytorch 1.2 (for everything)
    • Neural 3D Mesh Renderer - daniilidis version (for scene refinement only) For numerical stability, please modify projection.py to remove the multiplication by 0. After the change L33, L34 looks like:
    x__ = x_
    y__ = y_ 
    
    • Blender 2.79 (for 3D rendering of rooms only)
      • Please install numpy in Blender
    • matplotlib
    • numpy
    • skimage (for SPADE based shading)
    • imageio (for SPADE based shading)
    • shapely (eval only)
    • PyWavefront (for scene refinement only, loading of 3d meshes)
    • PyMesh (for scene refnement only, remeshing of SUNCG objects)
    • 1 Nvidia GPU

Download checkpoints here, download metadata here

Project structure
|-3d_SLN
  |-data
    |-suncg_dataset.py
      # Actual definition for the dataset object, makes batches of scene graphs
  |-metadata
    # SUNCG meta data goes here
    |-30_size_info_many.json
      # data about object size/volume, for 30/70 cutoff
    |-data_rot_train.json
      # Normalized object positions & rotations for training
    |-data_rot_val.json
      # For testing
    |-size_info_many.json
      # data about object size/volume, different cutoff
    |-valid_types.json
      # What object types we should use for making the scene graph
      # Caution when editing this, quite a bit is hard coded elsewhere
  |-models
    |-diff_render.py
      # Uses the Neural Mesh Renderer (Pytorch Version) to refine object positions
    |-graph.py
      # Graph network building blocks
    |-misc.py
      # Misc helper functions for the diff renderer
    |-Sg2ScVAE_model.py
      # Code to construct the VAE-graph network
    |-SPADE_related.py
      # Tools to construct SPADE VAE GAN (inference only)
  |-options
    # Global options
  |-render
    # Contains various "profiles" for Blender rendering
  |-testing
    # You must call batch_gen in test.py at least once
    # It will call into get_layouts_from_network in test_VAE.py
    # this will compute the posterior mean & std and cache it
    |-test_acc_mean_std.py
      # Contains helper functions to measure acc/l1/std 
    |-test_heatmap.py
      # Contains the functions *produce_heatmap* and *plot_heatmap*
      # The first function takes as input a verbally defined scene graph
        # If not provided, it uses a default scene graph with 5 objects
        # It will load weights for a VAE-graph network
        # Then load the computed posterior mean & std
        # And repeatedly sample from the given scene graph
        # Saves the results to a .pkl file
      # The second function will load a .pkl and plot them as heatmaps
    |-test_plot2d.py
      # Contains a function that uses matplotlib
      # Does NOT require SUNCG
      # Plots the objects using colors provided by ScanNet
    |-test_plot3d.py
      # Calls into the blender code in the ../render folder
      # Requires the SUNCG meshes
      # Requires Blender 2.79
      # Either uses the CPU (Blender renderer)
      # Or uses the GPU (Cycles renderer)
      # Loads a HDR texture (from HDRI Haven) for background
    |-test_SPADE_shade.py
      # Loads semantic maps & depth map, and produces RGB images using SPADE
    |-test_utils.py
      # Contains helper functions for testing
        # Of interest is the *get_sg_from_words* function
    |-test_VAE.py
  |-build_dataset_model.py
     # Constructs dataset & dataloader objects
     # Also constructs the VAE-graph network
  |-test.py
     # Provides functions which performs the following:
       # generation of layouts from scene graphs under the *batch_gen* argument
       # measure the accuracy of l1 loss, accuracy, std under the *measure_acc_l1_std* argument
       # draw the topdown heatmaps of layouts with a single scene graph under the *heat_map* argument
       # plot the topdown boxes of layouts with under the *draw_2d* argument
       # plot the viewer centric layouts using suncg meshes under the *draw_3d* argument
       # perform SPADE based shading of semantic+depth maps under the *gan_shade* argument
  |-train.py
     # Contains the training loop for the VAE-graph network
  |-utils.py
     # Contains various helper functions for:
       # managing network losses
       # make scene graphs from bounding boxes
       # load/write jsons
       # misc other stuff
  • Training the VAE-graph network (limited to 1 GPU):
    python train.py

  • Testing the VAE-graph network:
    First run python test.py --batch_gen at least once. This computes and caches a posterior for future sampling using the training set. It also generates a bunch of layouts using the test set.

  • To generate a heatmap:
    python test.py --heat_map
    You can either define your own scene graph (see the produce_heatmap function in testing/test_heatmap.py), if you do not provide one it will use the default one. The function will convert scene graphs defined using words into a format usable by the network.

  • To compute STD/L1/Acc:
    python test.py --measure_acc_l1_std

  • To plot the scene from a top down view with ScanNet colors (doesn't requrie SUNCG):
    python test.py --draw_2d
    Please provide a (O+1 x 6) tensor of bounding boxes, and a (O+1,) tensor of rotations. The last object should be the bounding box of the room

  • To plot 3D
    python test.py --draw_3d
    This calls into test_plot3d.py, which in turn launched Blender, and executes render_caller.py, you can put in specific rooms by editing this file. The full rendering function is located in render_room_color.py.

  • To use a neural renderer to refine a room
    python test.py --fine_tune Please select the indexes of the room in test.py. This will call into test_render_refine.py which uses the differentiable renderer located in diff_render.py. Learning rate, and loss types/weightings can be set in test_render_refine.py.
    We set a manual seed for demonstration purposes, in practice please remove this.

  • To use SPADE to generate texture/shading/lighting for a room from semantic + depth
    python test.py --gan_shade This will first call into semantic_depth_caller.py to produce the semantic and depth maps, then use SPADE to generate RGB images.

Citation

If you find this repo useful for your research, please consider citing the paper

@inproceedings{luo2020end,
  title={End-to-End Optimization of Scene Layout},
  author={Luo, Andrew and Zhang, Zhoutong and Wu, Jiajun and Tenenbaum, Joshua B},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={3754--3763},
  year={2020}
}
Owner
Andrew Luo
PhD student @ CMU
Andrew Luo
The code release of paper 'Domain Generalization for Medical Imaging Classification with Linear-Dependency Regularization' NIPS 2020.

Domain Generalization for Medical Imaging Classification with Linear Dependency Regularization The code release of paper 'Domain Generalization for Me

Yufei Wang 56 Dec 28, 2022
Implementation of the Chamfer Distance as a module for pyTorch

Chamfer Distance for pyTorch This is an implementation of the Chamfer Distance as a module for pyTorch. It is written as a custom C++/CUDA extension.

Christian Diller 205 Jan 05, 2023
DuBE: Duple-balanced Ensemble Learning from Skewed Data

DuBE: Duple-balanced Ensemble Learning from Skewed Data "Towards Inter-class and Intra-class Imbalance in Class-imbalanced Learning" (IEEE ICDE 2022 S

6 Nov 12, 2022
A simple Rock-Paper-Scissors game using CV in python

ML18_Rock-Paper-Scissors-using-CV A simple Rock-Paper-Scissors game using CV in python For IITISOC-21 Rules and procedure to play the interactive game

Anirudha Bhagwat 3 Aug 08, 2021
[Open Source]. The improved version of AnimeGAN. Landscape photos/videos to anime

[Open Source]. The improved version of AnimeGAN. Landscape photos/videos to anime

CC 4.4k Dec 27, 2022
Codes and pretrained weights for winning submission of 2021 Brain Tumor Segmentation (BraTS) Challenge

Winning submission to the 2021 Brain Tumor Segmentation Challenge This repo contains the codes and pretrained weights for the winning submission to th

94 Dec 28, 2022
Dataset and Code for the paper "DepthTrack: Unveiling the Power of RGBD Tracking" (ICCV2021), and "Depth-only Object Tracking" (BMVC2021)

DeT and DOT Code and datasets for "DepthTrack: Unveiling the Power of RGBD Tracking" (ICCV2021) "Depth-only Object Tracking" (BMVC2021) @InProceedings

Yan Song 55 Dec 15, 2022
Official repository for "Deep Recurrent Neural Network with Multi-scale Bi-directional Propagation for Video Deblurring".

RNN-MBP Deep Recurrent Neural Network with Multi-scale Bi-directional Propagation for Video Deblurring (AAAI-2022) by Chao Zhu, Hang Dong, Jinshan Pan

SIV-LAB 22 Aug 31, 2022
Quantized tflite models for ailia TFLite Runtime

ailia-models-tflite Quantized tflite models for ailia TFLite Runtime About ailia TFLite Runtime ailia TF Lite Runtime is a TensorFlow Lite compatible

ax Inc. 13 Dec 23, 2022
TiP-Adapter: Training-free CLIP-Adapter for Better Vision-Language Modeling

TiP-Adapter: Training-free CLIP-Adapter for Better Vision-Language Modeling This is the official code release for the paper 'TiP-Adapter: Training-fre

peng gao 189 Jan 04, 2023
The author's officially unofficial PyTorch BigGAN implementation.

BigGAN-PyTorch The author's officially unofficial PyTorch BigGAN implementation. This repo contains code for 4-8 GPU training of BigGANs from Large Sc

Andy Brock 2.6k Jan 02, 2023
A PyTorch implementation of "Graph Wavelet Neural Network" (ICLR 2019)

Graph Wavelet Neural Network ⠀⠀ A PyTorch implementation of Graph Wavelet Neural Network (ICLR 2019). Abstract We present graph wavelet neural network

Benedek Rozemberczki 490 Dec 16, 2022
PyTorch wrappers for using your model in audacity!

audacitorch This package contains utilities for prepping PyTorch audio models for use in Audacity. More specifically, it provides abstract classes for

Hugo Flores García 130 Dec 14, 2022
Code and model benchmarks for "SEVIR : A Storm Event Imagery Dataset for Deep Learning Applications in Radar and Satellite Meteorology"

NeurIPS 2020 SEVIR Code for paper: SEVIR : A Storm Event Imagery Dataset for Deep Learning Applications in Radar and Satellite Meteorology Requirement

USAF - MIT Artificial Intelligence Accelerator 46 Dec 15, 2022
A short and easy PyTorch implementation of E(n) Equivariant Graph Neural Networks

Simple implementation of Equivariant GNN A short implementation of E(n) Equivariant Graph Neural Networks for HOMO energy prediction. Just 50 lines of

Arsenii Senya Ashukha 97 Dec 23, 2022
[CVPR 2022] Thin-Plate Spline Motion Model for Image Animation.

[CVPR2022] Thin-Plate Spline Motion Model for Image Animation Source code of the CVPR'2022 paper "Thin-Plate Spline Motion Model for Image Animation"

yoyo-nb 1.4k Dec 30, 2022
Unofficial Implementation of Oboe (SIGCOMM'18').

Oboe-Reproduce This is the unofficial implementation of the paper "Oboe: Auto-tuning video ABR algorithms to network conditions, Zahaib Akhtar, Yun Se

Tianchi Huang 13 Nov 04, 2022
Pytorch code for ICRA'21 paper: "Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation"

Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation This repository is the pytorch implementation of our paper: Hierarchical Cr

43 Nov 21, 2022
we propose a novel deep network, named feature aggregation and refinement network (FARNet), for the automatic detection of anatomical landmarks.

Feature Aggregation and Refinement Network for 2D Anatomical Landmark Detection Overview Localization of anatomical landmarks is essential for clinica

aoyueyuan 0 Aug 28, 2022
Real-time Joint Semantic Reasoning for Autonomous Driving

MultiNet MultiNet is able to jointly perform road segmentation, car detection and street classification. The model achieves real-time speed and state-

Marvin Teichmann 518 Dec 12, 2022