Symmetry and Uncertainty-Aware Object SLAM for 6DoF Object Pose Estimation

SUO-SLAM

This repository hosts the code for our CVPR 2022 paper "Symmetry and Uncertainty-Aware Object SLAM for 6DoF Object Pose Estimation". ArXiv link.

Citation

If you use any part of this repository in an academic work, please cite our paper as:

@inproceedings{Merrill2022CVPR,
  Title      = {Symmetry and Uncertainty-Aware Object SLAM for 6DoF Object Pose Estimation},
  Author     = {Nathaniel Merrill and Yuliang Guo and Xingxing Zuo and Xinyu Huang and Stefan Leutenegger and Xi Peng and Liu Ren and Guoquan Huang},
  Booktitle  = {2022 Conference on Computer Vision and Pattern Recognition (CVPR)},
  Year       = {2022},
  Address    = {New Orleans, USA},
  Month      = jun,
}

Installation

This codebase was tested on Ubuntu 18.04. To use the BOP rendering (i.e. for keypoint labeling), install

$ sudo apt install libfreetype6-dev libglfw3

You will also need a Python environment that contains the required packages. To see which packages we used, check the list in requirements.txt. They can be installed via

$ pip install -r requirements.txt

Preparing Data

Datasets

To be able to run the training and testing (i.e. single view or with SLAM), first decide on a place to download the data to. The disk will need a few hundred GB of space for all the data (at least 150GB for download and more to extract). All of our code expects the data to be in a local directory ./data, but you can of course symlink this to another location (perhaps with more disk space). So, first of all, in the root of this repo run

$ mkdir data

or to symlink to an external location

$ ln -s /path/to/drive/with/space/ ./data

You can pick and choose which data to download (for example, just T-LESS or YCBV). Note that all YCBV and T-LESS downloads have our keypoint labels packaged along with the data. Download the following Google Drive links into ./data and extract them.

When all is said and done, the tree should look like this

$ cd ./data && tree --filelimit 3
.
├── bop_datasets
│   ├── tless 
│   └── ycbv 
├── saved_detections
└── VOCdevkit
    └── VOC2012
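
As a quick sanity check before training or evaluation, the layout above can be verified with a few lines of Python. This is only a sketch and not part of the repository: the helper name missing_data_dirs is hypothetical, and the directory names are copied from the tree listing. If you downloaded only one dataset, the other one showing up as missing is expected.

```python
from pathlib import Path

# Directory names taken from the tree listing above.
EXPECTED_DIRS = [
    "bop_datasets/tless",
    "bop_datasets/ycbv",
    "saved_detections",
    "VOCdevkit/VOC2012",
]


def missing_data_dirs(data_root="./data", expected=EXPECTED_DIRS):
    """Return the expected sub-directories that are absent under data_root.

    Hypothetical helper for illustration; is_dir() follows symlinks, so a
    symlinked ./data works as well.
    """
    root = Path(data_root)
    return [rel for rel in expected if not (root / rel).is_dir()]


if __name__ == "__main__":
    missing = missing_data_dirs()
    if missing:
        print("Missing under ./data:", ", ".join(missing))
    else:
        print("Data layout looks complete.")
```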

Pre-trained models

You can download the pretrained models anywhere, but I like to keep them in the results directory that is written to during training.

Training

First set the default arguments in ./lib/args.py for your username if desired, then execute

$ ./train.py

with the appropriate arguments for your filesystem. You can also run

$ ./train.py -h

for a full list of arguments and their meanings. One important argument is batch_size, the number of images loaded for each training batch. Note that each image may contain a variable number of objects, and the objects are all stacked together into one big batch to run the network, so the actual batch size being run might be several times batch_size. To keep batch_size reasonably large, we provide another argument called truncate_obj, which, as the help says, truncates the object batch to this number if it exceeds it. We recommend that you start with a large batch size so that you can find the maximum truncate_obj for your GPUs, then reduce the batch size until there are few or no warnings about too many objects being truncated.
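
The interaction between batch_size and truncate_obj can be illustrated with a small sketch. This is not the repository's implementation: the function stack_and_truncate is hypothetical, and keeping the first truncate_obj objects is an assumption, since the text above does not say which objects are dropped.

```python
def stack_and_truncate(objects_per_image, truncate_obj):
    """Flatten per-image object lists into one batch and cap its length.

    Returns the kept objects and the number that were dropped.
    Assumption: the first truncate_obj objects are the ones kept.
    """
    stacked = [obj for image_objs in objects_per_image for obj in image_objs]
    kept = stacked[:truncate_obj]
    return kept, len(stacked) - len(kept)


# batch_size = 3 images containing 4, 1 and 5 objects respectively
batch = [["a0", "a1", "a2", "a3"], ["b0"], ["c0", "c1", "c2", "c3", "c4"]]
kept, dropped = stack_and_truncate(batch, truncate_obj=8)
print(len(kept), dropped)  # 8 2
```

Here three images produce an object batch of ten, so a truncate_obj of 8 drops two objects. Lowering batch_size reduces how often this happens.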

Evaluation

Before you can evaluate in a single-view or SLAM fashion, you will need to build the thirdparty libraries for PnP and graph optimization. First make sure that you have Ceres Solver installed. Then run

$ ./build_thirdparty.sh

Reproducing Results

To reproduce the results of the paper with the pretrained models, check out the scripts under the scripts directory:

eval_all_tless.sh  eval_all_ycbv.sh  make_video.sh

These will reproduce most of the results in the paper as well as any video clips you want. You may have to change the first few lines of each script. Note that these examples also show the proper arguments if you want to run from the command line directly.

Note that for the T-LESS dataset, we use the thirdparty BOP toolkit to get the VSD error recall, which will show up in the final terminal output as "Mean object recall" among other numbers.

Labeling

Overview

We manually label keypoints on the CAD model so that some keypoints carry semantic meaning. For the full list of keypoint meanings, see the specific README.

We provide our landmark labeling tool. Check out the script manual_keypoints.py. This same script can be used to make a visualization of the keypoints as shown below with the --viz option.

The script will show a panel of renderings of the same object, each oriented slightly differently. The idea is that you pick the same keypoint multiple times to ensure correctness and to get a better label by averaging multiple samples.

The script will also print the following directions to follow in the terminal.

============= Welcome ===============
Select the keypoints with a left click!
Use the "wasd" to turn the objects.
Press "i" to zoom in and "o" to zoom out.
Make sure that the keypoint colors match between all views.
Messed up? Just press 'u' to undo.
Press "Enter" to finish and save the keypoints
Press "Esc" to just quit

Once you have pressed "enter", you will get to an inspection pane.

In the inspection pane, the unscaled mean keypoints are on the left and the ones scaled by covariance are on the right; the ellipses are the Gaussian 3-sigma bounds projected onto the image. If the covariance is too large, or the mean is out of place, then you may have messed up. Again, the program will print out these directions to the terminal:
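
For intuition about what a 3-sigma ellipse means in image space, the sketch below computes the semi-axes and orientation of the ellipse from a 2x2 covariance. It is an illustration only, not the tool's code: the function three_sigma_ellipse is hypothetical, and it assumes the keypoint covariance has already been projected into the image plane, a step the text above does not detail.

```python
import math


def three_sigma_ellipse(cov2d):
    """Semi-axes and orientation of the 3-sigma ellipse of a 2x2 covariance.

    cov2d is ((a, b), (b, c)), assumed symmetric and already expressed in
    image coordinates. Uses the closed-form eigenvalues of a symmetric
    2x2 matrix.
    """
    (a, b), (_, c) = cov2d
    half_trace = 0.5 * (a + c)
    root = math.hypot(0.5 * (a - c), b)
    lam_major = half_trace + root
    lam_minor = max(half_trace - root, 0.0)  # guard against round-off
    angle = 0.5 * math.atan2(2.0 * b, a - c)  # major-axis angle in radians
    return 3.0 * math.sqrt(lam_major), 3.0 * math.sqrt(lam_minor), angle


major, minor, angle = three_sigma_ellipse(((4.0, 0.0), (0.0, 1.0)))
print(major, minor, angle)  # 6.0 3.0 0.0
```

A standard deviation of 2 pixels along x gives a semi-axis of 6 pixels, which is why a noisy set of clicks shows up as a visibly large ellipse.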

Inspect the results!
Use the "wasd" to turn the object.
Press "i" to zoom in and "o" to zoom out.
Press "Esc" to go back, "Enter" to accept (saving keypoints and viewpoint for vizualization).
Please pick a point on the object!

So if you are done and the result looks good, press "Enter"; if not, press "Esc" to go back. Also make sure that when you are done, you rotate and scale the object into the best "view pose" (with the front facing the camera, and top facing up), as this pose is used by both the above visualization and the actual training code for determining the best symmetry to pick for an initial detection.

Labeling Tips

Even though there are 8 panels, you don't need to fill out all 8. Each keypoint just needs at least 3 samples to estimate the covariance.
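
To make the role of the repeated samples concrete, here is a minimal sketch of averaging the clicks for one keypoint and estimating their covariance. It is not the code in manual_keypoints.py: the function mean_and_covariance is hypothetical, and the unbiased (n - 1) estimator is an assumption.

```python
def mean_and_covariance(samples):
    """Sample mean and 3x3 covariance of a list of (x, y, z) clicks.

    Assumption: the unbiased estimator (division by n - 1) is used.
    """
    n = len(samples)
    if n < 3:
        raise ValueError("need at least 3 samples per keypoint")
    mean = [sum(p[i] for p in samples) / n for i in range(3)]
    cov = [
        [
            sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in samples) / (n - 1)
            for j in range(3)
        ]
        for i in range(3)
    ]
    return mean, cov


# Three clicks on the same keypoint, one per rendering
clicks = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, -1.0, 0.0)]
mean, cov = mean_and_covariance(clicks)
print(mean)       # [0.0, 0.0, 0.0]
print(cov[0][0])  # 1.0
```

With only three samples the estimated covariance has rank at most two, so labeling a keypoint in more panels gives a better-conditioned estimate.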

We recommend that you label the same keypoint (say keypoint i) on all the object renderings first, then go to the inspection panel before moving on to the next keypoint, so that you can easily undo a mistake for keypoint i with the "u" key and not lose any work. Otherwise, if you label each object rendering completely, then you may have to undo a lot of labelings that were not mistakes.

Also, if you want to label a point that lies in a void of the CAD model, like the top center of the bowl, then you can use the multiple samples to your advantage and choose samples that will average to the desired result, since the labels are required to land on the actual CAD model in the labeling tool.
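
The averaging trick for a void can be checked numerically. The sketch below is illustrative only: the rim radius and height are made-up numbers, and rim_samples and average are hypothetical helpers standing in for clicks placed around the bowl's edge.

```python
import math


def rim_samples(radius, height, count):
    """Evenly spaced points on a circular rim (stand-in for clicks on the edge)."""
    return [
        (
            radius * math.cos(2.0 * math.pi * k / count),
            radius * math.sin(2.0 * math.pi * k / count),
            height,
        )
        for k in range(count)
    ]


def average(points):
    """Component-wise mean of a list of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


# Four clicks spaced evenly around a rim of radius 0.08 at height 0.05
center = average(rim_samples(radius=0.08, height=0.05, count=4))
print(center)
```

Every sample lies on the rim, yet the mean lands on the axis of the bowl at rim height, a point that is not on the model surface.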

