Pop-Out Motion: 3D-Aware Image Deformation via Learning the Shape Laplacian (CVPR 2022)

Overview

Pop-Out Motion

Pop-Out Motion: 3D-Aware Image Deformation via Learning the Shape Laplacian (CVPR 2022)

Jihyun Lee*, Minhyuk Sung*, Hyunjin Kim, Tae-Kyun (T-K) Kim (*: equal contributions)

[Project Page] [Paper] [Video]

animated

We present a framework that can deform an object in a 2D image as it exists in 3D space. While our method leverages 2D-to-3D reconstruction, we argue that reconstruction is not sufficient for realistic deformations due to the vulnerability to topological errors. Thus, we propose to take a supervised learning-based approach to predict the shape Laplacian of the underlying volume of a 3D reconstruction represented as a point cloud. Given the deformation energy calculated using the predicted shape Laplacian and user-defined deformation handles (e.g., keypoints), we obtain bounded biharmonic weights to model plausible handle-based image deformation.

 

Environment Setup

Clone this repository and install the dependencies specified in requirements.txt.

 git clone https://github.com/jyunlee/Pop-Out-Motion.git
 mv Pop-Out-Motion
 pip install -r requirements.txt 

 

Data Pre-Processing

Training Data

  1. Build executables from the c++ files in data_preprocessing directory. After running the commands below, you should have normalize_bin and calc_l_minv_bin executables.
 cd data_preprocessing
 mkdir build
 cd build
 cmake ..
 make
 cd ..
  1. Clone and build Manifold repository to obtain manifold executable.

  2. Clone and build fTetWild repository to obtain FloatTetwild_bin executable.

  3. Run preprocess_train_data.py to prepare your training data. This should perform (1) shape normalization into a unit bounding sphere, (2) volume mesh conversion, and (3) cotangent Laplacian and inverse mass calculation.

 python preprocess_train_data.py 

Test Data

  1. Build executables from the c++ files in data_preprocessing directory. After running the commands below, you should have normalize_bin executable.
 cd data_preprocessing
 mkdir build
 cd build
 cmake ..
 make
 cd ..
  1. Run preprocess_test_data.py to prepare your test data. This should perform (1) shape normalization into a unit bounding sphere and (2) pre-computation of KNN-Based Point Pair Sampling (KPS).
 python preprocess_test_data.py 

 

Network Training

Run network/train.py to train your own Laplacian Learning Network.

 cd network
 python train.py 

The pre-trained model on DFAUST dataset is also available here.

 

Network Inference

Deformation Energy Inference

  1. Given an input image, generate its 3D reconstruction via running PIFu. It is also possible to directly use point cloud data obtained from other sources.

  2. Pre-process the data obtained from Step 1 -- please refer to this section.

  3. Run network/a_inference.py to predict the deformation energy matrix.

 cd network
 python a_inference.py 

Handle-Based Deformation Weight Calculation

  1. Build an executable from the c++ file in bbw_calculation directory. After running the commands below, you should have calc_bbw_bin executable.
 cd bbw_calculation
 mkdir build
 cd build
 cmake ..
 make
 cd ..
  1. (Optional) Run sample_pt_handles.py to obtain deformation control handles sampled by farthest point sampling.

  2. Run calc_bbw_bin to calculate handle-based deformation weights using the predicted deformation energy.

./build/calc_bbw_bin <shape_path> <handle_path> <deformation_energy_path> <output_weight_path>

 

Citation

If you find this work useful, please consider citing our paper.

@InProceedings{lee2022popoutmotion,
    author = {Lee, Jihyun and Sung, Minhyuk and Kim, Hyunjin and Kim, Tae-Kyun},
    title = {Pop-Out Motion: 3D-Aware Image Deformation via Learning the Shape Laplacian},
    booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year = {2022}
}

 

Acknowledgements

Owner
Jihyun Lee
Jihyun Lee
Rotated Box Is Back : Accurate Box Proposal Network for Scene Text Detection

Rotated Box Is Back : Accurate Box Proposal Network for Scene Text Detection This material is supplementray code for paper accepted in ICDAR 2021 We h

NCSOFT 30 Dec 21, 2022
a dnn ai project to classify which food people are eating on audio recordings

Deep Learning - EAT Challenge About This project is part of an AI challenge of the DeepLearning course 2021 at the University of Augsburg. The objecti

Marco Tröster 1 Oct 24, 2021
Datasets, tools, and benchmarks for representation learning of code.

The CodeSearchNet challenge has been concluded We would like to thank all participants for their submissions and we hope that this challenge provided

GitHub 1.8k Dec 25, 2022
This is a collection of our NAS and Vision Transformer work.

AutoML - Neural Architecture Search This is a collection of our AutoML-NAS work iRPE (NEW): Rethinking and Improving Relative Position Encoding for Vi

Microsoft 832 Jan 08, 2023
CFC-Net: A Critical Feature Capturing Network for Arbitrary-Oriented Object Detection in Remote Sensing Images

CFC-Net This project hosts the official implementation for the paper: CFC-Net: A Critical Feature Capturing Network for Arbitrary-Oriented Object Dete

ming71 55 Dec 12, 2022
LegoDNN: a block-grained scaling tool for mobile vision systems

Table of contents 1 Introduction 1.1 Major features 1.2 Architecture 2 Code and Installation 2.1 Code 2.2 Installation 3 Repository of DNNs in vision

41 Dec 24, 2022
A PyTorch Library for Accelerating 3D Deep Learning Research

Kaolin: A Pytorch Library for Accelerating 3D Deep Learning Research Overview NVIDIA Kaolin library provides a PyTorch API for working with a variety

NVIDIA GameWorks 3.5k Jan 07, 2023
PyTorch implementation of MulMON

MulMON This repository contains a PyTorch implementation of the paper: Learning Object-Centric Representations of Multi-object Scenes from Multiple Vi

NanboLi 16 Nov 03, 2022
Code I use to automatically update my videos' metadata on YouTube

mCodingYouTube This repository contains the code I use to automatically update my videos' metadata on YouTube, including: titles, descriptions, tags,

James Murphy 19 Oct 07, 2022
[ECCV'20] Convolutional Occupancy Networks

Convolutional Occupancy Networks Paper | Supplementary | Video | Teaser Video | Project Page | Blog Post This repository contains the implementation o

622 Dec 30, 2022
[CVPR 2022 Oral] MixFormer: End-to-End Tracking with Iterative Mixed Attention

MixFormer The official implementation of the CVPR 2022 paper MixFormer: End-to-End Tracking with Iterative Mixed Attention [Models and Raw results] (G

Multimedia Computing Group, Nanjing University 235 Jan 03, 2023
A curated list of Machine Learning and Deep Learning tutorials in Jupyter Notebook format ready to run in Google Colaboratory

Awesome Machine Learning Jupyter Notebooks for Google Colaboratory A curated list of Machine Learning and Deep Learning tutorials in Jupyter Notebook

Carlos Toxtli 245 Jan 01, 2023
Toolchain to build Yoshi's Island from source code

Project-Y Toolchain to build Yoshi's Island (J) V1.0 from source code, by MrL314 Last updated: September 17, 2021 Setup To begin, download this toolch

MrL314 19 Apr 18, 2022
Code for "Intra-hour Photovoltaic Generation Forecasting based on Multi-source Data and Deep Learning Methods."

pv_predict_unet-lstm Code for "Intra-hour Photovoltaic Generation Forecasting based on Multi-source Data and Deep Learning Methods." IEEE Transactions

FolkScientistInDL 8 Oct 08, 2022
Implementation of PersonaGPT Dialog Model

PersonaGPT An open-domain conversational agent with many personalities PersonaGPT is an open-domain conversational agent cpable of decoding personaliz

ILLIDAN Lab 42 Jan 01, 2023
unet-family: Ultimate version

unet-family: Ultimate version 基于之前my-unet代码,我整理出来了这一份终极版本unet-family,方便其他人阅读。 相比于之前的my-unet代码,代码分类更加规范,有条理 对于clone下来的代码不需要修改各种复杂繁琐的路径问题,直接就可以运行。 并且代码有

2 Sep 19, 2022
XViT - Space-time Mixing Attention for Video Transformer

XViT - Space-time Mixing Attention for Video Transformer This is the official implementation of the XViT paper: @inproceedings{bulat2021space, title

Adrian Bulat 33 Dec 23, 2022
FFCV: Fast Forward Computer Vision (and other ML workloads!)

Fast Forward Computer Vision: train models at a fraction of the cost with accele

FFCV 2.3k Jan 03, 2023
Large-Scale Pre-training for Person Re-identification with Noisy Labels (LUPerson-NL)

LUPerson-NL Large-Scale Pre-training for Person Re-identification with Noisy Labels (LUPerson-NL) The repository is for our CVPR2022 paper Large-Scale

43 Dec 26, 2022
Implementation of SegNet: A Deep Convolutional Encoder-Decoder Architecture for Semantic Pixel-Wise Labelling

Caffe SegNet This is a modified version of Caffe which supports the SegNet architecture As described in SegNet: A Deep Convolutional Encoder-Decoder A

Alex Kendall 1.1k Jan 02, 2023