RepPoints: An Advanced Take on Deformable Convolution
2022-07-06 17:42:00 【Xiaobai learns vision】
I have always admired Microsoft's research, and deformable convolution in particular: I find the idea highly creative. Taking the recent RepPoints paper as an occasion, this post reviews deformable convolution.
The post covers three papers: DCNv1, DCNv2, and RepPoints, where RepPoints can be regarded as DCNv3. Each of them refines deformable convolution further and strengthens the model's ability to represent geometric deformation.
Review of DCNv1 and DCNv2
Geometric variations caused by scale, pose, viewpoint, and part deformation are a major challenge in object recognition and detection. By learning the locations of the sampling points in its convolution and RoI pooling modules, DCN gains the ability to model geometric deformation.
Deformable Convolution
The sampling positions of a standard convolution are shifted by offsets that are learned from the input features.
Deformable convolution can be expressed as:
$$y(p) = \sum_{k=1}^{K} w_k \cdot x(p + p_k + \Delta p_k)$$
Given a convolution kernel with $K$ sampling points, $w_k$ is the weight of the $k$-th sampling point and $p_k$ is its predefined offset (for example, $K = 9$ and $p_k \in \{(-1,-1), (-1,0), \ldots, (1,1)\}$ define a 3x3 convolution kernel). $x(p)$ is the feature at location $p$ of the input feature map, and $y(p)$ is the feature at location $p$ of the output feature map. $\Delta p_k$ is the learned position offset of the $k$-th sampling point. Because $\Delta p_k$ is fractional, $x(p + p_k + \Delta p_k)$ is computed by bilinear interpolation.
PS: The offset feature map has the same resolution as the input feature map, and its channel count is twice the number of sampling points (each sampling point has an offset in both the x and y directions).
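The deformable convolution formula can be made concrete with a small NumPy sketch. It evaluates one output location of a single-channel 3x3 deformable convolution and is only an illustration: the function names, the (dy, dx) coordinate order, and the zero padding outside the feature map are assumptions made here, not details taken from the DCN implementation.

```python
import numpy as np

def bilinear_sample(feat, y, x):
    """Bilinearly interpolate a 2D feature map at a fractional (y, x).

    Locations outside the map contribute zero (assumed zero padding).
    """
    h, w = feat.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    value = 0.0
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= yi < h and 0 <= xi < w:
                weight = (1.0 - abs(y - yi)) * (1.0 - abs(x - xi))
                value += weight * feat[yi, xi]
    return value

# Predefined offsets p_k of a 3x3 kernel, in (dy, dx) order, so K = 9.
GRID_3X3 = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

def deform_conv_at(feat, weights, offsets, p):
    """Compute y(p) = sum_k w_k * x(p + p_k + dp_k) for one channel.

    weights: K kernel weights w_k.
    offsets: K learned (dy, dx) offsets dp_k for this output location.
    """
    py, px = p
    out = 0.0
    for w_k, (gy, gx), (oy, ox) in zip(weights, GRID_3X3, offsets):
        out += w_k * bilinear_sample(feat, py + gy + oy, px + gx + ox)
    return out

if __name__ == "__main__":
    feat = np.arange(25, dtype=float).reshape(5, 5)
    weights = np.full(9, 1.0 / 9.0)           # simple averaging kernel
    no_shift = [(0.0, 0.0)] * 9               # reduces to standard convolution
    half_px = [(0.0, 0.5)] * 9                # every sample moves 0.5 px right
    print(deform_conv_at(feat, weights, no_shift, (2, 2)))
    print(deform_conv_at(feat, weights, half_px, (2, 2)))
```

With all offsets set to zero the function reduces to an ordinary 3x3 convolution; with a uniform half-pixel shift each sample is read between two grid cells, which is where the bilinear interpolation comes in.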
Modulated Deformable Convolution
Compared with deformable convolution, modulated deformable convolution adds a modulation factor for each sampling point.
Modulated deformable convolution can be expressed as:
$$y(p) = \sum_{k=1}^{K} w_k \cdot x(p + p_k + \Delta p_k) \cdot \Delta m_k$$
$\Delta m_k$ is the modulation factor of the $k$-th sampling point (its range is $[0, 1]$).
PS: The modulation feature map has the same resolution as the input feature map, and its channel count equals the number of sampling points. Together with the offset feature map, the total channel count is three times the number of sampling points (each sampling point has an x offset, a y offset, and a modulation factor).
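As a hedged illustration of the channel bookkeeping, the sketch below splits a predicted map with 3K channels into 2K offset channels and K modulation channels, squashing the latter into (0, 1) with a sigmoid. The channel ordering (offsets first, modulation last) and the function name are assumptions chosen for this example; real implementations may lay the channels out differently.

```python
import numpy as np

def split_offset_and_modulation(pred, num_points):
    """Split a (3K, H, W) prediction into offsets and modulation factors.

    Returns:
        offsets:    (K, 2, H, W) array of per-point (dy, dx) offsets.
        modulation: (K, H, W) array of factors in (0, 1).
    Assumed layout: the first 2K channels are offsets, the last K are
    pre-sigmoid modulation logits.
    """
    k = num_points
    if pred.shape[0] != 3 * k:
        raise ValueError("expected 3 * num_points channels")
    spatial = pred.shape[1:]
    offsets = pred[: 2 * k].reshape(k, 2, *spatial)
    modulation = 1.0 / (1.0 + np.exp(-pred[2 * k:]))  # sigmoid
    return offsets, modulation

if __name__ == "__main__":
    pred = np.zeros((27, 4, 5))               # K = 9 for a 3x3 kernel
    offsets, modulation = split_offset_and_modulation(pred, 9)
    print(offsets.shape, modulation.shape)
```

Each modulation factor then multiplies the corresponding sampled value before the weighted sum, so a factor near zero effectively switches that sampling point off.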
Deformable RoI Pooling
Given an input RoI, RoI pooling divides it into $K$ bins. RoI pooling first produces the pooled feature maps, and an fc layer then produces normalized offsets (which can be converted into the offsets $\Delta p_k$).
The output feature of a single bin can be expressed as:
$$y(k) = \sum_{j=1}^{n_k} x(p_{kj} + \Delta p_k) / n_k$$
$p_{kj}$ is the location of the $j$-th sampling point in the $k$-th bin, and $n_k$ is the number of sampling points in the $k$-th bin. $x(p_{kj} + \Delta p_k)$ is computed by bilinear interpolation. $\Delta p_k$ is the offset of the $k$-th bin.
PS: The output size of the fc layer is twice the number of bins (each bin has an offset in both the x and y directions).
Modulated Deformable RoI Pooling
With a modulation factor, the output feature of a single bin can be expressed as:
$$y(k) = \sum_{j=1}^{n_k} x(p_{kj} + \Delta p_k) \cdot \Delta m_k / n_k$$
$\Delta m_k$ is the modulation factor of the $k$-th bin (its range is $[0, 1]$).
PS: There are two fc layers, and the output size of the second fc layer is three times the number of bins (each bin has an x offset, a y offset, and a modulation factor).
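The per-bin computation for deformable RoI pooling, with and without modulation, can be sketched as follows. This is a minimal single-channel illustration under stated assumptions: the function names are hypothetical, the bin is sampled on a regular grid of cell centers, and the normalized offset is converted to pixels by scaling with the RoI size and a constant `gamma` (set to 0.1 here as an assumed value).

```python
import numpy as np

def bilinear_sample(feat, y, x):
    """Bilinear interpolation at fractional (y, x); zero outside the map."""
    h, w = feat.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    value = 0.0
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= yi < h and 0 <= xi < w:
                value += (1.0 - abs(y - yi)) * (1.0 - abs(x - xi)) * feat[yi, xi]
    return value

def deform_roi_bin(feat, bin_box, norm_offset, roi_size,
                   samples_per_side=2, gamma=0.1, modulation=1.0):
    """Pool one RoI bin: average the samples of the shifted bin.

    bin_box:     (y1, x1, y2, x2) of the bin in feature-map coordinates.
    norm_offset: normalized (dy, dx) predicted for this bin.
    roi_size:    (height, width) of the whole RoI.
    modulation:  per-bin factor in [0, 1]; 1.0 gives the unmodulated form.
    """
    y1, x1, y2, x2 = bin_box
    roi_h, roi_w = roi_size
    # Convert the normalized offset to pixels (assumed scaling rule).
    dy = gamma * norm_offset[0] * roi_h
    dx = gamma * norm_offset[1] * roi_w
    # Regular grid of sampling points p_kj at the centers of sub-cells.
    steps = (np.arange(samples_per_side) + 0.5) / samples_per_side
    ys = y1 + steps * (y2 - y1)
    xs = x1 + steps * (x2 - x1)
    total = 0.0
    for y in ys:
        for x in xs:
            total += bilinear_sample(feat, y + dy, x + dx)
    n_k = samples_per_side ** 2
    return modulation * total / n_k

if __name__ == "__main__":
    feat = np.arange(64, dtype=float).reshape(8, 8)
    print(deform_roi_bin(feat, (2, 2, 4, 4), (0.0, 0.0), (4, 4)))
    print(deform_roi_bin(feat, (2, 2, 4, 4), (0.0, 0.5), (4, 4)))
```

Note that one offset moves the whole bin, so all sampling points of a bin shift together, unlike deformable convolution where each sampling point has its own offset.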
RepPoints
Motivation
In object detection, bounding boxes describe the object location at every stage of the detector.
Although bounding boxes are easy to compute, they provide only a coarse localization of the object and do not fit its shape and pose. Features extracted from the regular grid cells of a bounding box can therefore be heavily contaminated by background content or by uninformative foreground regions. This lowers feature quality and, in turn, the classification performance of the detector.
The paper proposes a new representation, called RepPoints, that provides finer-grained localization and makes classification easier.
As shown in the figure, RepPoints is a set of points that learn to place themselves adaptively over the object, in a way that both bounds the spatial extent of the object and marks local regions that carry important semantic information.
The training of RepPoints is driven jointly by object localization and recognition. As a result, RepPoints are tightly bound to the ground-truth bounding box and guide the detector toward the correct classification.
Bounding Box Representation
A bounding box is a 4-dimensional representation that encodes the spatial location of an object: $B = (x, y, w, h)$, where $x, y$ denote the center point and $w, h$ denote the width and height.
Because they are simple and convenient to use, modern object detectors rely heavily on bounding boxes to represent objects at every stage of the detection pipeline.
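The best-performing object detectors usually follow a multi-stage recognition paradigm in which object localization is refined step by step. The role of the object representation in this process is as follows:
bbox anchors -> (bbox reg.) -> bbox proposals (S1) -> (bbox reg.) -> bbox proposals (S2) -> ... -> (bbox reg.) -> bbox object targets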
RepPoints
As mentioned earlier, the 4-dimensional bounding box is a coarse representation of object location. It considers only the rectangular spatial extent of the object and ignores its shape, its pose, and the positions of semantically important local regions, all of which could be used for better localization and better feature extraction.
To overcome these limitations, RepPoints instead models a set of adaptive sample points:
$$\mathcal{R} = \{(x_k, y_k)\}_{k=1}^{n}$$
where $n$ is the total number of sample points used in the representation. In this work, $n$ defaults to 9.
Learning RepPoints
The learning of RepPoints is driven by both an object localization loss and an object recognition loss. To compute the localization loss, a conversion function $\mathcal{T}$ first converts the RepPoints into a pseudo box. The difference between the converted pseudo box and the ground-truth bounding box is then computed.
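One simple way to realize such a conversion is a min-max operation over the point set, sketched below in NumPy. The article does not say which conversion function or distance is used, so both choices here are assumptions for illustration: the pseudo box is the tightest axis-aligned box around the points, and the difference is a smooth L1 distance on the top-left and bottom-right corners. The function names are hypothetical.

```python
import numpy as np

def points_to_pseudo_box(points):
    """Min-max conversion of n points (x, y) to a box (x1, y1, x2, y2)."""
    pts = np.asarray(points, dtype=float)
    x1, y1 = pts.min(axis=0)
    x2, y2 = pts.max(axis=0)
    return np.array([x1, y1, x2, y2])

def corner_distance_loss(pseudo_box, gt_box):
    """Smooth L1 distance between the corner coordinates of two boxes."""
    diff = np.abs(np.asarray(pseudo_box, dtype=float)
                  - np.asarray(gt_box, dtype=float))
    # Quadratic for small errors, linear for large ones.
    per_coord = np.where(diff < 1.0, 0.5 * diff ** 2, diff - 0.5)
    return float(per_coord.sum())

if __name__ == "__main__":
    rep_points = [(1.0, 2.0), (4.0, 3.0), (2.0, 6.0)]
    box = points_to_pseudo_box(rep_points)
    print(box)
    print(corner_distance_loss(box, [1.5, 2.0, 4.0, 8.0]))
```

With a min-max conversion only the extreme points receive a localization gradient; the remaining points are shaped by the recognition loss, which matches the idea that the points should both bound the object and land on semantically useful regions.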
RPDet
The authors design an anchor-free object detector that uses RepPoints instead of bounding boxes as the basic object representation.
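The object representation evolves as follows:
object centers -> (RP refine) -> RepPoints proposals (S1) -> (RP refine) -> RepPoints proposals (S2) -> ... -> (RP refine) -> RepPoints object targets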
The RepPoints Detector (RPDet) consists of two recognition stages based on deformable convolution, as shown in the figure.
Deformable convolution pairs well with RepPoints because its convolution is computed over a set of irregularly distributed sampling points, and because its classification feedback can in turn guide the placement of these points during training.
The first set of offsets in the figure is learned under supervision of the corner points of the pseudo box. The second set of offsets builds on the first and is learned under classification supervision.
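The way the second set of offsets builds on the first can be sketched in a few lines. This is an illustrative assumption about the bookkeeping only: the first-stage points are taken as a center location plus the first offsets, and the second-stage points add the second offsets on top. The function name and the (x, y) ordering are hypothetical.

```python
import numpy as np

def refine_points(center, first_offsets, second_offsets):
    """Return stage-1 and stage-2 point sets as (n, 2) arrays of (x, y).

    stage1 = center + first_offsets
    stage2 = stage1 + second_offsets
    """
    center = np.asarray(center, dtype=float)
    stage1 = center + np.asarray(first_offsets, dtype=float)
    stage2 = stage1 + np.asarray(second_offsets, dtype=float)
    return stage1, stage2

if __name__ == "__main__":
    # Start from a regular 3x3 layout (n = 9) around the center.
    first = [(dx, dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    second = [(0.5, -0.5)] * 9
    stage1, stage2 = refine_points((10.0, 20.0), first, second)
    print(stage1[0], stage2[0])
```

Both point sets can serve directly as the sampling locations of a deformable convolution, which is why the two fit together naturally.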
RepPoints can also be understood from another angle.
Deformable convolution is supervised only through the final classification and regression branches; it is expected to attend adaptively to suitable feature locations and extract better features. What I never fully understood is whether it really does attend to suitable locations. Its offset learning is almost unconstrained, so the sampling points may drift away from the object, and it is unclear whether the features sampled there actually help. These questions have bothered me for a while: the intermediate process of deformable convolution seems too vague, too indirect, and hard to interpret. In RepPoints, by contrast, the offsets are supervised directly by the localization and classification signals. This makes the offsets interpretable: the offset positions are those that make localization and classification more accurate (the points locate the object, and the semantic information at those points identifies it), so the offsets no longer wander arbitrarily.
Seen this way, RepPoints is a further improvement of deformable convolution, with two advantages over it:
1. The offsets of deformable convolution are learned under direct supervision from localization and classification, which makes them interpretable.
2. Pseudo boxes can be generated directly from the sampling points, so no separate bounding box needs to be learned, and classification and localization become coupled.