This is an official implementation for "PlaneRecNet".

Overview

PlaneRecNet

This is an official implementation for PlaneRecNet: A multi-task convolutional neural network provides instance segmentation for piece-wise planes and monocular depth estimation, and focus on the cross-task consistency between two branches. Network Architecture

Changing Logs

22th. Oct. 2021: Initial update, some trained models and data annotation will be uploaded very soon.

29th. Oct. 2021: Upload ResNet-50 based model.

3rd. Nov. 2021: Nice to know that "prn" or "PRN" is a forbiden name in Windows.

4th. Nov. 2021: For inference, input image will be resized to max(H, W) == cfg.max_size, and reserve the aspect ratio. Update enviroment.yml, so that newest GPU can run it as well.

Installation

Install environment:

  • Clone this repository and enter it:
git clone https://github.com/EryiXie/PlaneRecNet.git
cd PlaneRecNet
  • Set up the environment using one of the following methods:
    • Using Anaconda
      • Run conda env create -f environment.yml
    • Using Docker
      • dockerfile will come later...

Download trained model:

Here are our models (released on Oct 22th, 2021), which can reproduce the results in the paper:

Quantitative Results

All models below are trained with batch_size=8 and a single RTX3090 or a single RTXA6000 on the plane annotation for ScanNet dataset:

Image Size Backbone FPS Weights
480x640 Resnet50-DCN 19.1 PlaneRecNet_50
480x640 Resnet101-DCN 14.4 PlaneRecNet_101

Simple Inference

Inference with an single image(*.jpg or *.png format):

python3 simple_inference.py --config=PlaneRecNet_101_config --trained_model=weights/PlaneRecNet_101_9_125000.pth  --image=data/example_nyu.jpg

Inference with images in a folder:

python3 simple_inference.py --config=PlaneRecNet_101_config --trained_model=weights/PlaneRecNet_101_9_125000.pth --images=input_folder:output_folder

Inference with .mat files from iBims-1 Dataset:

python3 simple_inference.py --config=PlaneRecNet_101_config --trained_model=weights/PlaneRecNet_101_9_125000.pth --ibims1=input_folder:output_folder

Then you will get segmentation and depth estimation results like these:

Qualititative Results

Training

PlaneRecNet is trained on ScanNet with 100k samples on one single RTX 3090 with batch_size=8, it takes approximate 37 hours. Here are the data annotations(about 1.0 GB) for training of ScanNet datasets, which is based on the annotation given by PlaneRCNN and converted into json file. Please not that, our training sample is not same as PlaneRCNN, because we don't have their training split at hand.

Please notice, the pathing and naming rules in our data/dataset.py, is not compatable with the raw data extracted with the ScanNetv2 original code. Please refer to this issue for fixing tips, thanks uyoung-jeong for that. I will add the data preprocessing script to fix this, once I have time.

Of course, please download ScanNet too for rgb image, depth image and camera intrinsic etc.. The annotation file we provide only contains paths for images and camera intrinsic and the ground truth of piece-wise plane instance and its plane parameters.

  • To train, grab an imagenet-pretrained model and put it in ./weights.
    • For Resnet101, download resnet101_reducedfc.pth from here.
    • For Resnet50, download resnet50-19c8e357.pth from here.
  • Run one of the training commands below.
    • Press ctrl+c while training and it will save an *_interrupt.pth file at the current iteration.
    • All weights are saved in the ./weights directory by default with the file name <config>_<epoch>_<iter>.pth.

Trains PlaneRecNet_101_config with a batch_size of 8.

python3 train.py --config=PlaneRecNet_101_config --batch_size=8

Trains PlaneRecNet, without writing any logs to tensorboard.

python3 train.py --config=PlaneRecNet_101_config --batch_size=8 --no_tensorboard

Run Tensorboard on local dir "./logs" to check the visualization. So far we provide loss recording and image sample visualization, may consider to add more (22.Oct.2021).

tenosrborad --logdir /log/folder/

Resume training PlaneRecNet with a specific weight file and start from the iteration specified in the weight file's name.

python3 train.py --config=PlaneRecNet_101_config --resume=weights/PlaneRecNet_101_X_XXXXX.pth

Use the help option to see a description of all available command line arguments.

python3 train.py --help

Multi-GPU Support

We adapted the Multi-GPU support from YOLACT, as well as the introduction of how to use it as follow:

  • Put CUDA_VISIBLE_DEVICES=[gpus] on the beginning of the training command.
    • Where you should replace [gpus] with a comma separated list of the index of each GPU you want to use (e.g., 0,1,2,3).
    • You should still do this if only using 1 GPU.
    • You can check the indices of your GPUs with nvidia-smi.
  • Then, simply set the batch size to 8*num_gpus with the training commands above. The training script will automatically scale the hyperparameters to the right values.
    • If you have memory to spare you can increase the batch size further, but keep it a multiple of the number of GPUs you're using.
    • If you want to allocate the images per GPU specific for different GPUs, you can use --batch_alloc=[alloc] where [alloc] is a comma seprated list containing the number of images on each GPU. This must sum to batch_size.

Known Issues

  1. Userwarning of torch.max_pool2d. This has no real affect. It appears when using PyTorch 1.9. And it is claimed "fixed" for the nightly version of PyTorch.
UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at  /pytorch/c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
  1. Userwarning of leaking Caffe2 while training. This issues related to dataloader in PyTorch1.9, to avoid showing this warning, set pin_memory=False for dataloader. But you don't necessarily need to do this.
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)

Citation

If you use PlaneRecNet or this code base in your work, please cite

@misc{xie2021planerecnet,
      title={PlaneRecNet: Multi-Task Learning with Cross-Task Consistency for Piece-Wise Plane Detection and Reconstruction from a Single RGB Image}, 
      author={Yaxu Xie and Fangwen Shu and Jason Rambach and Alain Pagani and Didier Stricker},
      year={2021},
      eprint={2110.11219},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Contact

For questions about our paper or code, please contact Yaxu Xie, or take a good use at the Issues section of this repository.

Owner
yaxu
Oh, hamburgers!
yaxu
Style transfer, deep learning, feature transform

FastPhotoStyle License Copyright (C) 2018 NVIDIA Corporation. All rights reserved. Licensed under the CC BY-NC-SA 4.0 license (https://creativecommons

NVIDIA Corporation 10.9k Jan 02, 2023
A DNN inference latency prediction toolkit for accurately modeling and predicting the latency on diverse edge devices.

Note: This is an alpha (preview) version which is still under refining. nn-Meter is a novel and efficient system to accurately predict the inference l

Microsoft 244 Jan 06, 2023
Supplementary code for the AISTATS 2021 paper "Matern Gaussian Processes on Graphs".

Matern Gaussian Processes on Graphs This repo provides an extension for gpflow with Matérn kernels, inducing variables and trainable models implemente

41 Dec 17, 2022
MoCoGAN: Decomposing Motion and Content for Video Generation

MoCoGAN: Decomposing Motion and Content for Video Generation This repository contains an implementation and further details of MoCoGAN: Decomposing Mo

Sergey Tulyakov 514 Dec 18, 2022
A PyTorch Implementation of FaceBoxes

FaceBoxes in PyTorch By Zisian Wong, Shifeng Zhang A PyTorch implementation of FaceBoxes: A CPU Real-time Face Detector with High Accuracy. The offici

Zi Sian Wong 797 Dec 17, 2022
An attempt at the implementation of Glom, Geoffrey Hinton's new idea that integrates neural fields, predictive coding, top-down-bottom-up, and attention (consensus between columns)

GLOM - Pytorch (wip) An attempt at the implementation of Glom, Geoffrey Hinton's new idea that integrates neural fields, predictive coding,

Phil Wang 173 Dec 14, 2022
Styled Augmented Translation

SAT Style Augmented Translation Introduction By collecting high-quality data, we were able to train a model that outperforms Google Translate on 6 dif

139 Dec 29, 2022
Python utility to generate filesystem content for Obsidian.

Security Vault Generator Quickly parse, format, and output common frameworks/content for Obsidian.md. There is a strong focus on MITRE ATT&CK because

Justin Angel 73 Dec 02, 2022
StarGAN v2 - Official PyTorch Implementation (CVPR 2020)

StarGAN v2 - Official PyTorch Implementation StarGAN v2: Diverse Image Synthesis for Multiple Domains Yunjey Choi*, Youngjung Uh*, Jaejun Yoo*, Jung-W

Clova AI Research 3.1k Jan 09, 2023
MQBench Quantization Aware Training with PyTorch

MQBench Quantization Aware Training with PyTorch I am using MQBench(Model Quantization Benchmark)(http://mqbench.tech/) to quantize the model for depl

Ling Zhang 29 Nov 18, 2022
Answering Open-Domain Questions of Varying Reasoning Steps from Text

This repository contains the authors' implementation of the Iterative Retriever, Reader, and Reranker (IRRR) model in the EMNLP 2021 paper "Answering Open-Domain Questions of Varying Reasoning Steps

26 Dec 22, 2022
Transfer Learning for Pose Estimation of Illustrated Characters

bizarre-pose-estimator Transfer Learning for Pose Estimation of Illustrated Characters Shuhong Chen *, Matthias Zwicker * WACV2022 [arxiv] [video] [po

Shuhong Chen 142 Dec 28, 2022
Data and analysis code for an MS on SK VOC genomes phenotyping/neutralisation assays

Description Summary of phylogenomic methods and analyses used in "Immunogenicity of convalescent and vaccinated sera against clinical isolates of ance

Finlay Maguire 1 Jan 06, 2022
Deal or No Deal? End-to-End Learning for Negotiation Dialogues

Introduction This is a PyTorch implementation of the following research papers: (1) Hierarchical Text Generation and Planning for Strategic Dialogue (

Facebook Research 1.4k Dec 29, 2022
Anomaly Detection Based on Hierarchical Clustering of Mobile Robot Data

We proposed a new approach to detect anomalies of mobile robot data. We investigate each data seperately with two clustering method hierarchical and k-means. There are two sub-method that we used for

Zekeriyya Demirci 1 Jan 09, 2022
Dataset and Code for the paper "DepthTrack: Unveiling the Power of RGBD Tracking" (ICCV2021), and "Depth-only Object Tracking" (BMVC2021)

DeT and DOT Code and datasets for "DepthTrack: Unveiling the Power of RGBD Tracking" (ICCV2021) "Depth-only Object Tracking" (BMVC2021) @InProceedings

Yan Song 55 Dec 15, 2022
Text completion with Hugging Face and TensorFlow.js running on Node.js

Katana ML Text Completion 🤗 Description Runs with with Hugging Face DistilBERT and TensorFlow.js on Node.js distilbert-model - converter from Hugging

Katana ML 2 Nov 04, 2022
Stochastic Downsampling for Cost-Adjustable Inference and Improved Regularization in Convolutional Networks

Stochastic Downsampling for Cost-Adjustable Inference and Improved Regularization in Convolutional Networks (SDPoint) This repository contains the cod

Jason Kuen 17 Jul 04, 2022
Satellite labelling tool for manual labelling of storm top features such as overshooting tops, above-anvil plumes, cold U/Vs, rings etc.

Satellite labelling tool About this app A tool for manual labelling of storm top features such as overshooting tops, above-anvil plumes, cold U/Vs, ri

Czech Hydrometeorological Institute - Satellite Department 10 Sep 14, 2022
The undersampled DWI image using Slice-Interleaved Diffusion Encoding (SIDE) method can be reconstructed by the UNet network.

UNet-SIDE The undersampled DWI image using Slice-Interleaved Diffusion Encoding (SIDE) method can be reconstructed by the UNet network. For Super Reso

TIANTIAN XU 1 Jan 13, 2022