RPN: region proposal networks
2022-07-26 14:45:00 【The way of code】
The region proposal network (RPN) was first proposed in Faster R-CNN.
Get the prediction feature map
The image is fed into the network and passed through a series of convolution + ReLU layers, producing a 51×39×256 feature map that is used for the subsequent proposal selection.
Generate Anchors
An anchor is a bbox of fixed size. Concretely, each point of the feature map is mapped back to the center of its receptive field in the original image and used as a reference point, and k anchors of different scales and aspect ratios are placed around that reference point. For a convolutional feature map of size W×H (typically about 2400 positions), there are W×H×k anchors in total. By default 3 scales and 3 aspect ratios are used, which gives k=9 anchors at each sliding position, so each feature point predicts multiple region proposals. For example, a 51×39 feature map produces 51×39×9 = 17901 candidate boxes. Although the anchors are defined on the convolutional feature map, the final anchors are expressed relative to the original image.
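A minimal sketch of this anchor layout in plain Python. The stride of 16, the scales (128, 256, 512) and the ratios (0.5, 1, 2) are assumptions based on common Faster R-CNN settings rather than values stated in this post, and the cell center is used as a stand-in for the receptive-field center.

```python
import math

def base_anchors(scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return k = len(scales) * len(ratios) anchor shapes as (w, h) pairs."""
    shapes = []
    for s in scales:
        for r in ratios:
            # Keep the area at s * s while setting h / w = r.
            w = s / math.sqrt(r)
            h = s * math.sqrt(r)
            shapes.append((w, h))
    return shapes

def generate_anchors(feat_w, feat_h, stride=16):
    """Place every base anchor at every feature-map cell, in image coordinates."""
    shapes = base_anchors()
    anchors = []
    for j in range(feat_h):
        for i in range(feat_w):
            # Reference point: the cell center mapped back to the image (assumed).
            cx = (i + 0.5) * stride
            cy = (j + 0.5) * stride
            for w, h in shapes:
                anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors

anchors = generate_anchors(51, 39)
print(len(anchors))  # 51 * 39 * 9 = 17901
```

Each anchor is stored as (x1, y1, x2, y2) in image pixels, which is why the boxes are relative to the original picture even though they are enumerated over the feature map.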

For each candidate box at a pixel, the network must decide whether it is a target region and, if so, where its border lies. The process is shown in Figure 2: the RPN head produces the k anchors through the following structure.

As shown in Figure 2, each pixel position in the feature map has 9 candidate boxes. Because the input to the RPN is a feature map with 256 channels, a separate 3×3 sliding window is applied to every channel at the same time, and the convolution values obtained at that position from all channels are summed to give one new feature value. Using 256 such groups of 3×3 convolution kernels yields a new 256-dimensional vector. This vector is used for prediction at that pixel position, and the 9 candidate boxes of the pixel share it.
The 256-dimensional vector is followed by two branches. The first is a binary classification between object and background: a 1×1×256×18 convolution kernel produces 2k scores, where k is the number of candidate boxes (9), i.e. a background score and an object score for each of the 9 anchors. If a candidate box is a target region, its position must then be determined: the other branch uses a 1×1×256×36 convolution kernel to produce 4k coordinates, 4 per box (x, y, w, h), which are the offsets Δxcenter, Δycenter, Δwidth, Δheight to apply to each of the 9 candidate boxes. If a candidate box is not a target region, it is discarded and no position information is computed for it.
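To make the shapes concrete, here is a rough sketch of what the two branches compute at a single feature-map position. A 1×1 convolution at one position is just a matrix-vector product, so plain Python is enough. The random weights, the interleaved per-anchor channel layout and the softmax over each score pair are assumptions for illustration only.

```python
import math
import random

K = 9    # anchors per position (from the text)
C = 256  # channels of the shared feature vector

rng = random.Random(0)
feature = [rng.gauss(0, 1) for _ in range(C)]  # stand-in for the 256-d vector

def conv1x1(vec, out_dim):
    """1x1 convolution at one position: out_dim dot products (random weights)."""
    weights = [[rng.gauss(0, 0.01) for _ in vec] for _ in range(out_dim)]
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]

cls_out = conv1x1(feature, 2 * K)  # 18 values: (background, object) per anchor
reg_out = conv1x1(feature, 4 * K)  # 36 values: (dx, dy, dw, dh) per anchor

def softmax2(a, b):
    """Softmax over a pair of scores, shifted by the max for numerical stability."""
    m = max(a, b)
    ea, eb = math.exp(a - m), math.exp(b - m)
    return ea / (ea + eb), eb / (ea + eb)

# Group the flat outputs per anchor (assumed layout).
scores = [softmax2(cls_out[2 * i], cls_out[2 * i + 1]) for i in range(K)]
deltas = [tuple(reg_out[4 * i:4 * i + 4]) for i in range(K)]
print(len(cls_out), len(reg_out))  # 18 36
```

All 9 anchors read from the same 256-dimensional vector; only the output channels differ per anchor.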

The classification branch
For each image in the training set (with manually annotated gt boxes), all anchors are divided into positive and negative samples:
(1) For each annotated gt box, the anchor with the largest overlap is marked as a positive sample, which ensures that every gt box has at least one positive anchor.
(2) Among the anchors remaining after (1), an anchor whose overlap (IoU) with some gt box exceeds 0.7 is marked as a positive sample (each gt box may correspond to several positive anchors, but each positive anchor corresponds to only one gt box); an anchor whose overlap with every gt box is below 0.3 is marked as a negative sample.
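The two rules can be sketched roughly as follows. The function names and the toy boxes are hypothetical. Anchors that fall between the two thresholds get the label -1 here and are treated as ignored, which follows the Faster R-CNN paper rather than anything stated in this post.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def label_anchors(anchors, gt_boxes, pos_thresh=0.7, neg_thresh=0.3):
    """Return one label per anchor: 1 = positive, 0 = negative, -1 = ignored."""
    overlaps = [[iou(a, g) for g in gt_boxes] for a in anchors]
    labels = []
    for row in overlaps:
        best = max(row)
        if best > pos_thresh:      # rule (2), positive case
            labels.append(1)
        elif best < neg_thresh:    # rule (2), negative case
            labels.append(0)
        else:
            labels.append(-1)
    # Rule (1): the best-overlapping anchor of every gt box is positive.
    for g in range(len(gt_boxes)):
        best_anchor = max(range(len(anchors)), key=lambda i: overlaps[i][g])
        if overlaps[best_anchor][g] > 0:
            labels[best_anchor] = 1
    return labels

anchors = [(0, 0, 10, 10), (0, 0, 10, 8), (20, 20, 30, 30),
           (4, 0, 14, 10), (40, 40, 50, 50)]
gt_boxes = [(0, 0, 10, 10), (22, 22, 34, 34)]
print(label_anchors(anchors, gt_boxes))  # [1, 1, 1, -1, 0]
```

The third anchor only reaches an IoU of about 0.36 with the second gt box, so it becomes positive through rule (1) alone.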
The regression branch
x, y, w, h are the center coordinates, width and height of a box; x, x_a and x* refer to the predicted box, the anchor box and the ground-truth box respectively (likewise for y, w, h). t denotes the offset of the predicted box relative to the anchor box, and t* denotes the offset of the ground-truth box relative to the anchor box. The learning goal is to make the former approach the latter.
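For reference, the Faster R-CNN paper parameterizes these offsets as follows, where $x$, $x_a$ and $x^*$ are the center x coordinates of the predicted box, the anchor box and the ground-truth box (likewise for $y$, $w$, $h$). This is restated from the paper, not from a figure in this post.

```latex
\begin{aligned}
t_x &= \frac{x - x_a}{w_a}, & t_y &= \frac{y - y_a}{h_a}, &
t_w &= \log\frac{w}{w_a}, & t_h &= \log\frac{h}{h_a}, \\
t_x^* &= \frac{x^* - x_a}{w_a}, & t_y^* &= \frac{y^* - y_a}{h_a}, &
t_w^* &= \log\frac{w^*}{w_a}, & t_h^* &= \log\frac{h^*}{h_a}.
\end{aligned}
```

The regression branch outputs $t$, and training pushes it toward the target $t^*$.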
Inside the RPN, the classification branch (cls) and the bounding-box regression branch (bbox reg) each run their own computation over the full set of anchors. At the end of the RPN, the results of the two branches are combined to perform a preliminary screening of the anchors (anchors that cross the image boundary are removed first, then duplicates are removed with non-maximum suppression (NMS) based on the cls scores) and a preliminary offset (based on the bbox reg results). The bboxes output at this point are called proposals.
In the offset formula, An denotes the anchor box and pro denotes the bounding box obtained after regression; these regressed boxes are the proposals.
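A small sketch of how the regression output could be applied to an anchor to obtain a proposal. It assumes the standard Faster R-CNN parameterization (center shifts scaled by the anchor size, log-scaled width and height). The names `decode` and `inside_image` are hypothetical.

```python
import math

def decode(anchor, delta):
    """Apply (dx, dy, dw, dh) to an anchor (x1, y1, x2, y2); return the proposal."""
    xa = (anchor[0] + anchor[2]) / 2
    ya = (anchor[1] + anchor[3]) / 2
    wa = anchor[2] - anchor[0]
    ha = anchor[3] - anchor[1]
    dx, dy, dw, dh = delta
    x = xa + dx * wa        # shift the center by a fraction of the anchor size
    y = ya + dy * ha
    w = wa * math.exp(dw)   # rescale width and height
    h = ha * math.exp(dh)
    return (x - w / 2, y - h / 2, x + w / 2, y + h / 2)

def inside_image(box, img_w, img_h):
    """True if the box does not cross the image boundary."""
    return box[0] >= 0 and box[1] >= 0 and box[2] <= img_w and box[3] <= img_h

an = (100.0, 100.0, 200.0, 200.0)
pro = decode(an, (0.1, -0.2, 0.0, math.log(0.5)))
print(pro)  # approximately (110, 105, 210, 155)
```

A zero offset returns the anchor unchanged, and `inside_image` corresponds to the step that discards boxes crossing the image boundary.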
Non maximum suppression (Non-maximum suppression)
Because anchors usually overlap, proposals for the same object overlap as well. NMS is used to resolve overlapping proposals: if the IoU between two proposals is greater than a preset threshold, the proposal with the lower score is discarded.
The IoU threshold must be chosen carefully. If it is too small, some proposals for objects may be lost; if it is too large, objects may end up with many proposals. A typical value is 0.6.
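A minimal greedy NMS sketch in plain Python, using the 0.6 threshold mentioned above as the default. Boxes are assumed to be (x1, y1, x2, y2) tuples and the toy data is made up.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_thresh=0.6):
    """Greedy NMS: return the indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if no higher-scoring kept box overlaps it too much.
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep

boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (20, 20, 30, 30), (0, 0, 10, 5)]
scores = [0.9, 0.8, 0.7, 0.6]
print(nms(boxes, scores))  # [0, 2, 3]
```

Here the second box has an IoU of about 0.82 with the first and is dropped, while the last box has an IoU of exactly 0.5 and survives. Lowering the threshold to 0.4 would remove it as well, which illustrates the trade-off described above.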
Proposal selection
After NMS, the proposals are sorted by score and the top N are kept. The Faster R-CNN paper uses N=2000; N can also be smaller, such as 50, and still give good results.
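The selection step itself is short. A sketch, with a hypothetical helper name and made-up data:

```python
def top_n(proposals, scores, n):
    """Sort proposals by score (descending) and keep the first n."""
    order = sorted(range(len(proposals)), key=lambda i: scores[i], reverse=True)[:n]
    return [proposals[i] for i in order], [scores[i] for i in order]

proposals = [(0, 0, 10, 10), (5, 5, 20, 20), (30, 30, 40, 40), (8, 8, 16, 16)]
scores = [0.2, 0.9, 0.5, 0.7]
kept, kept_scores = top_n(proposals, scores, n=2)
print(kept_scores)  # [0.9, 0.7]
```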