CVPR 2022 | PanopticDepth: A Unified Framework for Depth-aware Panoptic Segmentation
2022-06-29 13:09:00 【CV technical guide (official account)】
Preface: This paper proposes a unified framework for depth-aware panoptic segmentation (DPS), which aims to reconstruct a 3D scene with instance-level semantics from a single image. The framework applies dynamic convolution to both the panoptic segmentation (PS) and depth prediction tasks, generating instance-specific kernels that predict the depth and the segmentation mask of each instance. In addition, building on the instance-wise depth estimation scheme, it adds instance-level depth cues that help supervise depth learning through a new depth loss.
Welcome to follow the official account CV Technical Guide, which focuses on computer vision technology summaries, tracking of the latest techniques, interpretations of classic papers, and CV recruitment information.

Paper: PanopticDepth: A Unified Framework for Depth-aware Panoptic Segmentation
Paper link: http://arxiv.org/pdf/2206.00468
Code: https://github.com/NaiyuGao/PanopticDepth
Background
Depth-aware panoptic segmentation (DPS) is a new and challenging scene-understanding task that tries to build a 3D scene with instance-level semantic understanding from a single image.
A simple solution to DPS is to add a dense depth regression head to a panoptic segmentation (PS) network and produce a depth value for every labeled pixel. This approach is intuitive but suboptimal.
Because it handles the two tasks with two separate branches, it does not exploit the mutually beneficial relationship between them. In particular, it does not use the readily available instance-level semantic cues to improve depth accuracy.
The authors also observe that pixels of adjacent instances usually have discontinuous depths. For example, two cars in a row may sit at different depths, so it is hard to predict accurate depths for both with one shared pixel-wise depth regressor.
Conversely, since these pixels come from different vehicles, using a separate regressor for each vehicle should help depth estimation.
Following this idea, the authors propose PanopticDepth, a unified framework that predicts mask and depth values in the same instance-wise manner (see Figure 1).

Figure 1: An example of the unified solution for depth-aware panoptic segmentation
Contributions
1. An instance-specific dynamic convolution kernel technique is proposed to unify depth estimation and panoptic segmentation, which improves performance on both tasks.
2. To ease depth estimation, and inspired by batch normalization, each instance depth map is represented as a triple: a normalized depth map, a depth range and a depth offset. The values of the original instance depth map are normalized to [0,1] to make learning more effective.
3. Based on the new depth map representation, instance-level depth statistics (such as the depth offset) are added to strengthen depth supervision, and a corresponding depth loss is proposed to accommodate this new supervision and improve depth prediction.
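The triple representation in contribution 2 can be illustrated with a small sketch. It assumes plain min-max normalization (offset = minimum depth, range = maximum minus minimum); the function name `encode_instance_depth` is hypothetical and the paper's exact definitions may differ.

```python
def encode_instance_depth(depths):
    """Split one instance's depth values into (normalized, depth_range, depth_offset).

    Assumption: min-max normalization, i.e. offset = min depth and
    range = max - min. The paper's exact definition may differ.
    """
    depth_offset = min(depths)
    depth_range = max(depths) - depth_offset
    if depth_range == 0:  # flat instance: every pixel sits at the offset
        return [0.0] * len(depths), 0.0, depth_offset
    normalized = [(d - depth_offset) / depth_range for d in depths]
    return normalized, depth_range, depth_offset


# A car whose pixels lie between 12 m and 16 m from the camera.
car = [12.0, 13.0, 16.0]
norm, rng, offset = encode_instance_depth(car)
print(norm, rng, offset)
```

Whatever the absolute distance of the instance, the network only has to regress values in [0,1] for the normalized map, while the range and offset carry the instance-level scale.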
Method
A unified depth-aware panoptic segmentation model, PanopticDepth, is proposed; it predicts mask and depth values in the same instance-wise manner. Besides the backbone and the feature pyramid network (FPN), it contains three sub-networks: a kernel generator that produces instance classifications together with instance-specific mask and depth convolution kernels, a panoptic segmentation model that generates instance masks, and an instance-wise depth map generator that estimates instance depth. The network architecture is shown in Figure 2.

Figure 2: The PanopticDepth framework
1. Kernel generator
The kernel generator sub-network produces the instance classifications, the mask convolution kernels and the depth estimation kernels (upper part of Figure 2). It is built on PanopticFCN, a state-of-the-art panoptic segmentation model that uses dynamic convolution for PS and needs less training time and GPU memory than other recent methods.
The kernel generator used by the authors has two stages: kernel generation and kernel fusion. In the kernel generation stage, the single-stage feature from the i-th FPN stage is taken as input, and the generator outputs a kernel weight map plus two position maps, one for things and one for stuff. In the kernel fusion stage, given the position maps and kernel weight maps of every FPN stage, the repeated kernel weights from multiple FPN stages are merged by the proposed adaptive kernel fusion (AKF) operation.
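To make the idea of merging repeated kernel weights concrete, here is a minimal sketch. It is not the AKF operation from the paper: the greedy grouping by cosine similarity, the `threshold` value and the function names are all assumptions chosen for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two kernel weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def fuse_kernels(kernels, threshold=0.9):
    """Greedily group kernels whose cosine similarity to a group's first
    member exceeds `threshold`, then average each group.

    Illustrative stand-in only: the threshold and the greedy grouping are
    assumptions, not the AKF operation defined in the paper.
    """
    groups = []
    for k in kernels:
        for group in groups:
            if cosine(group[0], k) > threshold:
                group.append(k)
                break
        else:  # no existing group was similar enough
            groups.append([k])
    return [[sum(col) / len(group) for col in zip(*group)] for group in groups]


# Two FPN stages proposed nearly identical kernels for the same car,
# and a third kernel belongs to a different instance.
candidates = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
fused = fuse_kernels(candidates)
print(len(fused))
```

The point of the fusion step is that each instance ends up with a single kernel, however many FPN stages detected it.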
2. Panoptic segmentation
Panoptic segmentation is performed with the instance-specific kernels, as shown at the bottom of Figure 2. The masks M of the thing and stuff instances are obtained by convolving a shared high-resolution mask embedding map Em with the mask kernels Km and then applying a Sigmoid activation: M = Sigmoid(Km * Em).
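The mask generation step can be sketched in a few lines of plain Python. The sketch assumes 1x1 kernels without bias, so the convolution reduces to a per-pixel dot product over channels; the function name `instance_masks` and the toy shapes are illustrative and not taken from the paper.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def instance_masks(kernels, embedding):
    """Apply each instance kernel as a 1x1 convolution over a shared embedding.

    kernels:   N lists of C weights, one per instance.
    embedding: C channel maps, each an H x W nested list.
    Returns N soft masks of shape H x W with values in (0, 1).
    Assumption: 1x1 kernels without bias, so the convolution reduces to a
    per-pixel dot product over channels.
    """
    height, width = len(embedding[0]), len(embedding[0][0])
    masks = []
    for kernel in kernels:
        mask = [
            [
                sigmoid(sum(w * channel[y][x] for w, channel in zip(kernel, embedding)))
                for x in range(width)
            ]
            for y in range(height)
        ]
        masks.append(mask)
    return masks


# Toy embedding with C=2 channels on a 1 x 3 image.
embedding = [
    [[4.0, 0.0, -4.0]],   # channel 0
    [[-4.0, 0.0, 4.0]],   # channel 1
]
kernels = [[1.0, 0.0], [0.0, 1.0]]  # two instances
masks = instance_masks(kernels, embedding)
print(masks[0][0])
```

Because the embedding is shared and only the small kernels differ per instance, adding an instance costs one extra dot product per pixel rather than a separate network head.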

Redundant instance masks are discarded first. All remaining instance masks are then merged with argmax to produce a non-overlapping panoptic segmentation result, so every pixel is assigned to a thing or stuff segment and no pixel is labeled "VOID".
The authors also propose an additional training step that fine-tunes the learned model at full image scale with a small batch size, in order to bridge the performance gap between training and testing.
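The argmax merge can be sketched as follows. The filtering of redundant masks is not modelled here, and `merge_masks` is a hypothetical helper name.

```python
def merge_masks(masks):
    """Assign every pixel to the instance with the highest mask score.

    masks: N soft masks of shape H x W (nested lists).
    Returns an H x W map of instance indices, so segments never overlap
    and no pixel is left unassigned.
    """
    height, width = len(masks[0]), len(masks[0][0])
    return [
        [max(range(len(masks)), key=lambda i: masks[i][y][x]) for x in range(width)]
        for y in range(height)
    ]


# Three soft masks on a 1 x 4 image.
masks = [
    [[0.9, 0.6, 0.1, 0.2]],
    [[0.2, 0.7, 0.8, 0.3]],
    [[0.1, 0.1, 0.3, 0.4]],
]
panoptic = merge_masks(masks)
print(panoptic)  # [[0, 1, 1, 2]]
```

Since argmax always picks exactly one index per pixel, the result has no overlaps and no unlabeled pixels by construction.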
3. Instance-wise depth estimation
The depth of each instance is predicted with the same instance-specific kernel technique used for panoptic segmentation, which unifies the depth estimation and panoptic segmentation pipelines. As shown in the middle of Figure 2, the depth kernels are first run on the depth embedding to produce instance depth maps, and these per-instance maps are then merged according to the panoptic segmentation result to produce the final whole-image depth map.
3.1 Depth generator
Given the instance-specific depth kernels Kd and the shared depth embedding Ed, a normalized instance depth map D′ is generated by convolution followed by a Sigmoid activation, as in instance mask generation, and is then denormalized into the depth map D by Equation 4 or Equation 5 of the paper.

The normalized depth map D′ encodes only the relative depth values within each instance, so it is easier to learn. Two normalization schemes, Equation 4 and Equation 5, were developed, and the authors found that the latter works better.
After all instance depth maps are obtained, they are aggregated into a whole-image depth map according to the non-overlapping panoptic segmentation masks M. This yields precise depth values at instance boundaries.
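The aggregation step can be sketched as below. The denormalization used here is a simple affine map D = range * D′ + offset, which is an assumption standing in for the paper's Equation 4 or 5; `aggregate_depth` and its parameters are hypothetical names.

```python
def aggregate_depth(panoptic, normalized_maps, ranges, offsets):
    """Build a full-image depth map from per-instance normalized depth maps.

    panoptic:        H x W map of instance indices (non-overlapping segments).
    normalized_maps: per-instance H x W maps with values in [0, 1].
    ranges, offsets: per-instance depth range and depth offset.
    Assumption: affine denormalization D = range * D' + offset; this is a
    stand-in and not necessarily Equation 4 or 5 of the paper.
    """
    depth = []
    for y, row in enumerate(panoptic):
        depth.append([
            ranges[i] * normalized_maps[i][y][x] + offsets[i]
            for x, i in enumerate(row)
        ])
    return depth


# Two cars side by side on a 1 x 4 image: instance 0 on the left, 1 on the right.
panoptic = [[0, 0, 1, 1]]
normalized_maps = [
    [[0.0, 0.5, 0.5, 0.5]],  # instance 0
    [[0.5, 0.5, 0.0, 1.0]],  # instance 1
]
depth = aggregate_depth(panoptic, normalized_maps, ranges=[2.0, 4.0], offsets=[10.0, 30.0])
print(depth)  # [[10.0, 11.0, 30.0, 34.0]]
```

Each pixel reads its depth only from the instance that owns it, so the jump from 11 m to 30 m at the boundary between the two cars is preserved instead of being smoothed.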
3.2 Depth loss
The depth loss function combines the scale-invariant logarithmic error with the relative squared error; see the paper for the exact formula.

Thanks to the instance-wise depth estimation scheme, depth prediction is learned under both the traditional pixel-level supervision and an additional instance-level supervision, which empirically improves depth accuracy. To realize this dual supervision, the final depth loss Ldep consists of two terms: a pixel-level depth loss and an instance-level depth loss.
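The two-term loss can be sketched under some stated assumptions. The two error terms below use their common textbook definitions and are combined by a plain unweighted sum; the `instance_weight` factor and the choice of per-instance statistic are hypothetical, and the paper's exact formula may differ. All depths must be positive because of the logarithm.

```python
import math


def scale_invariant_log_error(pred, target):
    """Variance of the log-depth differences; zero for a pure global rescaling."""
    g = [math.log(p) - math.log(t) for p, t in zip(pred, target)]
    n = len(g)
    return sum(x * x for x in g) / n - (sum(g) / n) ** 2


def relative_squared_error(pred, target):
    """Mean of squared differences relative to the ground-truth depth."""
    return sum(((p - t) / t) ** 2 for p, t in zip(pred, target)) / len(pred)


def depth_loss(pred, target):
    """Assumption: plain unweighted sum of the two error terms."""
    return scale_invariant_log_error(pred, target) + relative_squared_error(pred, target)


def total_depth_loss(pred_pixels, gt_pixels, pred_stats, gt_stats, instance_weight=1.0):
    """Pixel-level loss plus an instance-level loss on per-instance statistics.

    pred_stats / gt_stats hold one value per instance (for example its depth
    offset); `instance_weight` is a hypothetical balancing factor.
    """
    pixel_term = depth_loss(pred_pixels, gt_pixels)
    instance_term = depth_loss(pred_stats, gt_stats)
    return pixel_term + instance_weight * instance_term


gt = [10.0, 20.0, 40.0]
perfect = total_depth_loss(gt, gt, [10.0, 30.0], [10.0, 30.0])
scaled = scale_invariant_log_error([2 * d for d in gt], gt)
print(perfect, scaled)
```

The scale-invariant term ignores a global rescaling of the prediction, which is why it is paired with a relative error term that does penalize it.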

Experiments
Table 1: Panoptic segmentation results on the Cityscapes validation and test sets. "AKF": adaptive kernel fusion; "FSF": full-scale fine-tuning.

Table 2: Depth-aware panoptic segmentation results on Cityscapes-DPS.

Table 3: Ablation study on Cityscapes-DPS. "IDE": instance-wise depth estimation; "IDN": instance depth normalization.

Table 4: Monocular depth estimation on Cityscapes with methods that use panoptic segmentation annotations.

Figure 3: Pixel-level depth estimation outputs smooth values across the boundary between two instances, whereas instance-level depth estimation produces more plausible discontinuous depth values.

Figure 4: Prediction examples of the PanopticDepth model.

Conclusion
This paper proposes a unified depth-aware panoptic segmentation framework that generates instance-specific kernels to predict the depth and the segmentation mask of each instance.
Dynamic kernels bring high-level object information into depth estimation, and the depth map of each instance is normalized with a depth offset and a depth range, which makes the shared depth embedding easier to learn.
The paper also proposes a new depth loss that supervises depth learning with instance-level depth cues. Experiments on the Cityscapes-DPS and SemKITTI-DPS benchmarks demonstrate the effectiveness of the method.
