GA-RPN: Region Proposal Network with Guided Anchoring
2022-07-26 11:47:00 【The way of code】
Paper: https://arxiv.org/pdf/1901.03278.pdf
Code: GitHub - open-mmlab/mmdetection: OpenMMLab Detection Toolbox and Benchmark
1.RPN
RPN stands for Region Proposal Network. It is used to select regions of interest, i.e. proposal extraction. For example, if a region has p > 0.5, it is considered likely to contain an object from one of the 80 categories, although which category is not yet known. That is all this stage does: the network only needs to pick out the regions that may contain objects. These selected regions are called RoIs (Regions of Interest). RPN also frames the approximate location of each RoI on the feature map, i.e. it outputs a bounding box.
A detailed introduction to RPN: https://mp.weixin.qq.com/s/VXgbJPVoZKjcaZjuNwgh-A
2.Guided Anchoring
An anchor is usually described by (x, y, w, h), i.e. the coordinates of its center point plus its width and height. The paper expresses the distribution of anchors as a conditional probability:

$$p(x, y, w, h \mid I) = p(x, y \mid I)\, p(w, h \mid x, y, I)$$
The two conditional distributions represent, respectively, the probability distribution of the anchor center given the image features, and the probability distribution of the anchor shape given the image features and the center. Seen this way, the conventional way of generating anchors is a special case of this conditional distribution, in which p(x,y|I) is uniform and p(w,h|x,y,I) is an impulse function.
Following the formula above, anchor generation can be split into two steps: anchor location prediction and anchor shape prediction.
The methods used in this paper are as follows :

The framework builds on the feature maps of the original RPN. Two branches predict the location and the shape of the anchors respectively, and their outputs are combined to obtain the anchors. A Feature Adaption module then adjusts the features to the anchors, producing a new feature map used for the subsequent predictions (anchor classification and regression). The whole method can be trained end-to-end, and compared with the original RPN it adds only three 1×1 convs and one 3×3 deformable conv, so the change in model parameters is very small.
(1) Location prediction
The goal of the location prediction branch is to predict which regions should serve as center points for generating anchors. This is also a binary classification problem, but unlike the classification in RPN, we do not predict whether each point is foreground or background; we predict whether it is the center of an object.
We divide the whole feature map into an object center region, a peripheral region and an ignore region. The general idea is to take a small patch at the center of each ground truth box, mark the corresponding area on the feature map as the object center region and use it as positive samples during training; the remaining areas are marked as ignore or negative samples. The branch applies a 1×1 convolution to the input feature map and produces an output of the same resolution. The value at each output position represents the probability that an object appears at the corresponding position of the original image I, i.e. a probability map. Finally, the regions where objects may exist are determined by selecting the positions whose probability is higher than a predetermined threshold.
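The region labelling described above can be sketched in a few lines of Python. This is only an illustration: the function name `label_regions`, the label values (1, -1, 0) and the ratios `center_ratio=0.2` and `ignore_ratio=0.5` are assumptions chosen for the example, not values taken from this article, and a real implementation would work on tensors and across several feature levels.

```python
def label_regions(feat_h, feat_w, gt_box, stride,
                  center_ratio=0.2, ignore_ratio=0.5):
    """Label each feature-map cell: 1 = center, -1 = ignore, 0 = outside.

    gt_box is (x1, y1, x2, y2) in image pixels. The two ratios are
    illustrative assumptions, not values stated in the article.
    """
    # Project the ground truth box onto the feature map.
    x1, y1, x2, y2 = (v / stride for v in gt_box)
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    w, h = x2 - x1, y2 - y1

    def inside(px, py, ratio):
        # True if (px, py) lies in the box shrunk by `ratio` around its center.
        return abs(px - cx) <= ratio * w / 2 and abs(py - cy) <= ratio * h / 2

    labels = [[0] * feat_w for _ in range(feat_h)]
    for row in range(feat_h):
        for col in range(feat_w):
            px, py = col + 0.5, row + 0.5  # center of the cell
            if inside(px, py, center_ratio):
                labels[row][col] = 1       # positive sample
            elif inside(px, py, ignore_ratio):
                labels[row][col] = -1      # excluded from the loss
    return labels


labels = label_regions(8, 8, gt_box=(0, 0, 64, 64), stride=8)
```

With this toy box, only the four cells around the middle of the 8×8 map become positives, the ring around them is ignored, and everything else is a negative sample.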
Location prediction lets us keep only a small portion of the positions as candidate anchor centers, which greatly reduces the number of anchors. As a result, later computation only needs to be carried out at the positions that have anchors.
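As a rough sketch of this filtering step, the snippet below turns a map of raw scores into probabilities with a sigmoid and keeps the positions above a threshold. The function name `select_centers`, the threshold value 0.5 and the convention of mapping a cell to the image coordinate ((col + 0.5) · stride, (row + 0.5) · stride) are assumptions made for the example.

```python
import math


def select_centers(score_map, stride, threshold=0.5):
    """Return image-space anchor centers whose probability exceeds threshold.

    score_map is a 2-D list of raw scores (the output of the 1x1 conv).
    The threshold value is an illustrative assumption.
    """
    centers = []
    for row, scores in enumerate(score_map):
        for col, logit in enumerate(scores):
            prob = 1.0 / (1.0 + math.exp(-logit))  # element-wise sigmoid
            if prob > threshold:
                # Map the feature-map cell back to image coordinates.
                centers.append(((col + 0.5) * stride, (row + 0.5) * stride))
    return centers


centers = select_centers([[-3.0, 2.0], [0.0, 4.0]], stride=16)
# centers == [(24.0, 8.0), (24.0, 24.0)]
```

Only two of the four positions survive here, so shape prediction and everything after it would run on half of the locations in this toy case.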
(2) Shape prediction
The goal of the shape prediction branch is, given an anchor center point, to predict the best width and height. This is a regression problem.
A 1×1 convolutional layer takes the feature map as input and outputs a 2-channel map of the same size, whose channels represent dw and dh and encode the best possible anchor size at each position. Although the prediction target is w and h, predicting these two numbers directly is unstable because their range is very large, so a range of roughly [0, 1000] is mapped to [-1, 1] with the formula:

$$w = \sigma \cdot s \cdot e^{dw}, \qquad h = \sigma \cdot s \cdot e^{dh}$$
Here s is the stride and σ is an empirical scale factor; the experiments use σ = 8. The two-channel dw, dh map is converted to w and h pixel by pixel through this equation. The paper uses IoU directly as the supervision signal for learning w and h.
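A minimal sketch of this decoding step, assuming the mapping w = σ · s · e^dw and h = σ · s · e^dh from the paper, with σ = 8 as stated above. The function name `decode_shape` is made up for the example.

```python
import math


def decode_shape(dw, dh, stride, sigma=8.0):
    """Convert predicted (dw, dh) into an anchor width and height in pixels.

    Assumes w = sigma * stride * exp(dw), and likewise for h.
    """
    w = sigma * stride * math.exp(dw)
    h = sigma * stride * math.exp(dh)
    return w, h


# dw = dh = 0 gives the "default" size sigma * stride.
print(decode_shape(0.0, 0.0, stride=16))  # (128.0, 128.0)
```

Because of the exponential, an output in [-1, 1] already covers sizes from about 0.37 to 2.7 times σ · s on each feature level, which is why the network can regress a small, well-behaved number instead of raw pixel sizes.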
As for matching anchors to ground truth, the conventional RPN directly computes the IoU between each anchor and all ground truth boxes and assigns the anchor to the ground truth with the largest IoU. With guided anchoring, however, the anchor's w and h are not fixed; they are variables to be predicted. The paper therefore defines the IoU between such an anchor and a ground truth box as:

$$\mathrm{vIoU}(a_{wh}, gt) = \max_{w > 0,\, h > 0} \mathrm{IoU}_{normal}(a_{wh}, gt)$$

where a_wh denotes the anchor at a given center with variable w and h.
It is not feasible to enumerate every possible w and h to find the maximum IoU, so the paper samples 9 candidate (w, h) pairs instead, which is a good enough approximation.
At this point we can generate the anchors. They are sparse, and their shape differs from position to position. According to the experiments, the average recall at this point already exceeds that of the ordinary RPN, with only two additional convs.
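The sampling approximation can be sketched as follows. The article only says that 9 (w, h) pairs are sampled; the particular choice below (3 scales × 3 aspect ratios around a base size) and the names `sampled_shapes` and `approx_viou` are assumptions for the sake of a runnable example.

```python
def iou(box_a, box_b):
    """Plain IoU between two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def sampled_shapes(base, scales=(1.0, 2 ** (1 / 3), 2 ** (2 / 3)),
                   ratios=(0.5, 1.0, 2.0)):
    """Build 9 (w, h) pairs; the scales and ratios are assumed values."""
    shapes = []
    for scale in scales:
        for ratio in ratios:        # ratio is h / w
            area = (base * scale) ** 2
            w = (area / ratio) ** 0.5
            shapes.append((w, w * ratio))
    return shapes


def approx_viou(cx, cy, gt_box, shapes):
    """Approximate vIoU: the best IoU over the sampled shapes at (cx, cy)."""
    best = 0.0
    for w, h in shapes:
        box = (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
        best = max(best, iou(box, gt_box))
    return best


shapes = sampled_shapes(base=64.0)
score = approx_viou(32.0, 32.0, gt_box=(0.0, 0.0, 64.0, 64.0), shapes=shapes)
```

The result is a lower bound on the true vIoU, since the maximum is taken over 9 shapes rather than all w > 0, h > 0; it is exact whenever one of the sampled shapes happens to coincide with the best one, as in the call above.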

(3) Feature adaptation module
Because the anchor shape differs at each position, a large anchor should correspond to a large receptive field and a small anchor to a small one. We therefore cannot predict by simply convolving the feature map as earlier anchor-based methods do; instead, feature adaptation must be applied to the feature map. The authors borrow the idea of deformable convolution and transform each position separately according to its anchor shape.

The approach is to fold the anchor shape information directly into the feature map, so that the new feature map adapts to the anchor shape at each position. The 3×3 deformable convolution mentioned above modifies the original feature map, and its offsets are obtained from the anchor's w and h through a 1×1 conv.
The transformation can be written as:

$$f_i' = \mathcal{N}_T(f_i, w_i, h_i)$$

where f_i is the feature at the i-th location and (w_i, h_i) is the corresponding anchor shape. N_T is implemented with a 3×3 deformable convolution. First, an offset field is predicted from the output of the shape prediction branch; then deformable convolution is applied to the original feature map with these offsets to obtain the adapted features. Classification and bounding box regression follow.
This brings the effective range of the features closer to the anchor shape, so that different positions of the same conv can represent anchors of different shapes and sizes.
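To make the offset-field step concrete, here is a small sketch of how a 1×1 conv turns the two shape channels into offsets for a 3×3 deformable convolution. A 1×1 conv is just the same linear map applied independently at every location, and a 3×3 deformable conv needs 2 · 3 · 3 = 18 offset values per location (an x and a y shift for each of its 9 sampling points). The function names and the random weights are placeholders for learned parameters, and the deformable convolution itself is not implemented here.

```python
import random


def make_offset_layer(kernel_size=3, seed=0):
    """Create placeholder weights for a 1x1 conv: 2 inputs -> 2*k*k outputs."""
    rng = random.Random(seed)
    out_channels = 2 * kernel_size * kernel_size
    # Random values stand in for weights that would be learned in training.
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(2)]
               for _ in range(out_channels)]
    bias = [0.0] * out_channels
    return weights, bias


def offset_field(shape_map, weights, bias):
    """Apply the 1x1 conv to a 2-D grid of (dw, dh) predictions."""
    field = []
    for row in shape_map:
        out_row = []
        for dw, dh in row:
            # Same linear map at every location: that is all a 1x1 conv does.
            out_row.append([w0 * dw + w1 * dh + b
                            for (w0, w1), b in zip(weights, bias)])
        field.append(out_row)
    return field


weights, bias = make_offset_layer()
shape_map = [[(0.0, 0.0), (0.5, -0.5)],
             [(0.5, -0.5), (1.0, 1.0)]]
offsets = offset_field(shape_map, weights, bias)
print(len(offsets[0][0]))  # 18 offset values per location
```

Two locations with the same predicted shape receive the same offsets, and locations with different shapes sample the feature map differently, which is how a single shared conv can serve anchors of different sizes.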
Examples of experimental results in this paper :
