CVPR 2022 | PTTR: Transformer-Based 3D Point Cloud Object Tracking
2022-06-09 11:35:00 [3D Vision Workshop]

Author: Luo Zhipeng
Source: SenseTime Academy

Overview
At CVPR 2022, a team from SenseTime Research proposed PTTR, a Transformer-based 3D point cloud tracking model. In the feature extraction stage, PTTR introduces relation-aware sampling to retain more points that belong to the tracked object. It then uses a Point Relation Transformer module to match point cloud features. Finally, a lightweight prediction refinement module further improves prediction accuracy. Experiments show that PTTR achieves significant accuracy gains on multiple datasets.
Paper title: PTTR: Relational 3D Point Cloud Object Tracking with Transformer

Problems and challenges
Object tracking is a fundamental computer vision task and has been studied extensively on image data. In recent years, with the development of LiDAR technology, point cloud based object tracking has also attracted growing attention. Point cloud data brings its own challenges, such as sparsity, occlusion, and noise. Because of these characteristics, image-based tracking algorithms cannot be applied directly, and point cloud tracking algorithms themselves remain under-explored. One major challenge is that when the object is far from the sensor, the point cloud becomes sparse, which makes tracking very difficult. In addition, existing point cloud tracking algorithms mainly match features with cosine similarity, a linear method that leaves considerable room for improvement.
Method
To address these problems, we propose a novel point cloud tracking framework, shown in the figure below. The model has three stages. In the feature extraction stage, we propose a new relation-aware sampling method (Relation-Aware Sampling), which samples according to the feature relationship between the template and the search area and thereby retains more foreground points. In the feature matching stage, we propose a Point Relation Transformer structure that effectively matches the features of the template and the search area. Finally, we propose a Prediction Refinement Module, which further improves prediction accuracy through feature sampling.

1. Relation-aware sampling (Relation-Aware Sampling)
Point cloud sparsity is a major challenge for tracking algorithms, and point cloud feature extraction usually involves downsampling. Most existing tracking algorithms use random sampling, which discards a large number of foreground points in the search area and hurts the subsequent feature matching. We therefore propose relation-aware sampling, which samples according to the feature distance between the template and the search area. Because the template consists mostly of points on the target object, sampling the search-area points whose feature distance to the template is smallest retains as many foreground points as possible. As shown in the figure below, we compared different sampling methods by the proportion of sampled points that fall inside the 3D target box. Our relation-aware sampling clearly retains the most foreground points.
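The sampling rule described above can be sketched in a few lines. This is a minimal illustration and not the released PTTR code: it assumes per-point feature vectors are already available, uses Euclidean distance as the feature distance, and the function name `relation_aware_sample` and the toy features are hypothetical.

```python
import math

def relation_aware_sample(search_feats, template_feats, k):
    """Return indices of the k search points whose features lie closest to the template.

    Assumption: Euclidean distance between feature vectors stands in for
    the "feature distance" between the search area and the template.
    """
    def l2(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    # For every search point, the distance to its nearest template feature.
    min_dist = [min(l2(s, t) for t in template_feats) for s in search_feats]
    # Keep the k search points with the smallest nearest-template distance.
    order = sorted(range(len(search_feats)), key=lambda i: min_dist[i])
    return sorted(order[:k])

# Toy features: points 1 and 3 resemble the template, the others do not.
template = [[1.0, 0.0], [0.9, 0.1]]
search = [[0.0, 1.0], [1.0, 0.05], [0.5, 0.5], [0.95, 0.1], [-1.0, 0.0]]
kept = relation_aware_sample(search, template, k=2)
print(kept)
```

Unlike random sampling, the points that survive here are chosen because they look like the template, which is why more of them tend to lie on the target.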

2. Relation-enhanced feature matching (Relation-Enhanced Feature Matching)
In tracking, the search area has to be matched against the template. Most existing 3D single-object tracking algorithms use the cosine distance between features and treat points with a small cosine distance as good matches. In contrast, inspired by the success of attention mechanisms in computer vision, we design a relation-based attention mechanism to match the point clouds of the template and the search area. As shown in the figure below, our attention module uses offset-attention to fuse the query, key, and value features, and introduces nonlinearity through an activation layer. Specifically, we first process the template and the search-area point clouds separately with a self-attention module. We then feed the search-area point cloud as the query and the template point cloud as the key and value into a cross-attention module, which outputs the matched point cloud features of the search area.
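The attention flow above can be illustrated with a simplified sketch. It is not the PTTR module itself: the learned projections and normalization layers are omitted, scaled dot-product scores are assumed for the attention weights, and a plain ReLU stands in for the learned activation layer. The function names are hypothetical.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def offset_attention(query, key, value):
    """Simplified offset-attention: out = q + ReLU(q - attention(q, k, v)).

    Assumptions: no learned weights, scaled dot-product scores, and
    value vectors with the same dimension as the query vectors.
    """
    d = len(query[0])
    out = []
    for q in query:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in key]
        weights = softmax(scores)
        attended = [sum(w * v[c] for w, v in zip(weights, value)) for c in range(d)]
        # The "offset" is the difference between the input and the attended feature.
        offset = [qc - ac for qc, ac in zip(q, attended)]
        out.append([qc + max(0.0, oc) for qc, oc in zip(q, offset)])
    return out

template = [[1.0, 0.0]]
search = [[0.0, 2.0]]

# Self-attention on each point cloud separately, then cross-attention
# with the search area as query and the template as key and value.
template_sa = offset_attention(template, template, template)
search_sa = offset_attention(search, search, search)
matched = offset_attention(search_sa, template_sa, template_sa)

# Direct cross-attention on the raw toy features, for inspection.
direct = offset_attention(search, template, template)
```

The output keeps one feature vector per search-area point, so later stages can still associate each matched feature with a 3D location.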

3. Coarse-to-fine prediction (Coarse-to-Fine Tracking Prediction)
Most existing 3D single-object tracking algorithms simply reuse the prediction module of a 3D detector, such as VoteNet or RPN. We believe that such detection-style prediction modules inevitably introduce redundant computation and reduce efficiency. We therefore propose a new prediction refinement module. It takes the corresponding point features from the template point cloud, the search point cloud, and the fused search point cloud, combines them, and predicts directly. In essence, every point in the search area predicts one proposal from the features of the different stages. At inference time, we take the proposal with the highest score as the prediction.
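A rough sketch of the two steps described above, combining per-point features from the three sources and picking the best proposal at inference time. The proposal layout (a box tuple plus a score) and both function names are hypothetical, and the learned prediction head that would map the combined feature to a proposal is not shown.

```python
def fuse_point_features(template_feat, search_feat, matched_feat):
    """Concatenate the features one point gathers from the three sources.

    Assumption: simple concatenation stands in for "combine them";
    the actual module may pool and combine features differently.
    """
    return list(template_feat) + list(search_feat) + list(matched_feat)

def select_best_proposal(proposals):
    """Return the proposal with the highest confidence score."""
    return max(proposals, key=lambda p: p["score"])

# One proposal per search-area point: (x, y, z, heading) plus a score.
proposals = [
    {"box": (1.0, 2.0, 0.5, 0.10), "score": 0.42},
    {"box": (1.1, 2.1, 0.5, 0.12), "score": 0.91},
    {"box": (0.8, 1.9, 0.4, 0.05), "score": 0.67},
]
best = select_best_proposal(proposals)
fused = fuse_point_features([0.1, 0.2], [0.3], [0.4, 0.5])
print(best["box"])
```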
4. Datasets
In addition to the methodological contributions, we also build a new large-scale point cloud tracking dataset based on the Waymo Open Dataset. Because every object in Waymo is annotated with an ID, the positions of a given ID at different times can be extracted. On this basis we constructed a Waymo single-object tracking dataset. As shown in the table below, our Waymo tracking dataset far exceeds KITTI in data volume and provides a baseline for further research on large-scale datasets.
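The idea of turning per-frame annotations into single-object tracks can be sketched as a group-by on the object ID. The record layout below (`timestamp`, `objects`, `id`, `box`) is a hypothetical simplification and not the Waymo Open Dataset API, and `min_length` is an illustrative parameter.

```python
from collections import defaultdict

def build_tracklets(frames, min_length=2):
    """Group per-frame annotations by object ID into time-ordered tracklets.

    Assumption: each frame is a dict with a "timestamp" and a list of
    "objects", each carrying an "id" and a "box".
    """
    tracks = defaultdict(list)
    for frame in sorted(frames, key=lambda f: f["timestamp"]):
        for obj in frame["objects"]:
            tracks[obj["id"]].append((frame["timestamp"], obj["box"]))
    # Objects seen in too few frames cannot form a tracking sequence.
    return {oid: seq for oid, seq in tracks.items() if len(seq) >= min_length}

frames = [
    {"timestamp": 2, "objects": [{"id": "car_a", "box": (2.0, 0.0, 0.0)}]},
    {"timestamp": 1, "objects": [{"id": "car_a", "box": (1.0, 0.0, 0.0)},
                                 {"id": "ped_b", "box": (5.0, 5.0, 0.0)}]},
    {"timestamp": 3, "objects": [{"id": "car_a", "box": (3.0, 0.0, 0.0)}]},
]
tracklets = build_tracklets(frames)
print(sorted(tracklets))
```

Each resulting tracklet gives the first box as the template and the remaining boxes as ground truth for the frames that follow.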

5. Experiments
We compare PTTR with other models on the KITTI and Waymo datasets. As shown in the tables below, PTTR outperforms existing methods.


To verify the effect of each module, we conducted a range of ablation experiments. The results confirm the effectiveness of every module we propose.


Conclusion
In this paper, we propose a new 3D point cloud tracking model. It uses relation-aware sampling to alleviate point cloud sparsity, uses the Transformer attention mechanism for effective feature matching, and uses local feature sampling to further improve prediction accuracy. Experiments show that the proposed method effectively improves point cloud tracking performance.
Links
The code for PTTR has been open-sourced. You are welcome to use it and share feedback.
Paper:
https://arxiv.org/pdf/2112.02857.pdf
Project:
https://github.com/Jasonkks/PTTR