Object detection notes: Fast R-CNN
2022-07-28 22:53:00 【leu_ mon】
Faster R-CNN
1.R-CNN
R-CNN was proposed in 2014 by Ross Girshick et al. in the paper "Rich feature hierarchies for accurate object detection and semantic segmentation".

1) R-CNN algorithm flow:
1. Generate 1K~2K candidate regions from one image (using the Selective Search method).
2. For each candidate region, extract features with a deep network.
3. Feed the features into one SVM classifier per class to decide whether the region belongs to that class.
4. Use a regressor to fine-tune the position of the candidate box.

2) Problems with R-CNN:
1. Testing is slow.
2. Training is slow.
3. Training requires a lot of storage space.
2.Fast R-CNN

When extracting features, R-CNN has to compute the features of every candidate region separately, so overlapping areas are computed again and again. Fast R-CNN computes the feature map of the whole image once and then maps the candidate regions onto that feature map, which greatly reduces the amount of computation.
Not all candidate boxes are used during training; only a subset is sampled, and the sample must contain both positive and negative samples. In the paper, 64 candidate boxes are sampled per image. A candidate box whose IoU with a ground-truth box is at least 0.5 is treated as a positive sample, and one whose IoU is below 0.5 is treated as a negative sample (the paper draws negatives from the IoU interval [0.1, 0.5)).
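The sampling rule can be sketched in a few lines of plain Python. This is only an illustration, not the paper's code: the `(x1, y1, x2, y2)` box format, the function names `iou` and `sample_rois`, and the 25% positive fraction (taken from the Fast R-CNN paper, not from these notes) are assumptions.

```python
import random

def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def sample_rois(proposals, gt_boxes, num_samples=64, pos_fraction=0.25,
                pos_thresh=0.5, rng=None):
    """Split proposals into positives/negatives by IoU, then sample a subset."""
    rng = rng or random.Random(0)  # fixed seed only to keep the sketch repeatable
    positives, negatives = [], []
    for box in proposals:
        # Best overlap of this proposal with any ground-truth box
        best = max((iou(box, gt) for gt in gt_boxes), default=0.0)
        (positives if best >= pos_thresh else negatives).append(box)
    num_pos = min(len(positives), int(num_samples * pos_fraction))
    num_neg = min(len(negatives), num_samples - num_pos)
    return rng.sample(positives, num_pos), rng.sample(negatives, num_neg)
```

Calling `sample_rois(proposals, gt_boxes)` returns two lists (positives, negatives) with at most 64 boxes in total.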
The sampled candidate regions are then passed through the RoI Pooling layer, which scales each of them to a fixed 7*7 size.
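To make the "scale to 7*7" step concrete, here is a minimal single-channel sketch of RoI max pooling. It assumes the RoI is already expressed in integer feature-map cells as `(x1, y1, x2, y2)` with exclusive right/bottom edges; the function name `roi_max_pool` and the floor/ceil bin boundaries are illustrative choices, not the layer from any particular library.

```python
import math

def roi_max_pool(feature, roi, out_size=7):
    """Max-pool one RoI of a 2-D feature map (list of rows) to out_size x out_size."""
    x1, y1, x2, y2 = roi
    roi_h, roi_w = y2 - y1, x2 - x1
    pooled = []
    for i in range(out_size):
        # Bin i covers rows [r0, r1); floor/ceil guarantees a non-empty bin
        r0 = y1 + math.floor(i * roi_h / out_size)
        r1 = y1 + math.ceil((i + 1) * roi_h / out_size)
        row = []
        for j in range(out_size):
            c0 = x1 + math.floor(j * roi_w / out_size)
            c1 = x1 + math.ceil((j + 1) * roi_w / out_size)
            row.append(max(feature[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled
```

Whatever the size of the RoI (as long as it is at least one cell wide and high), the output is always 7*7, which is what lets the following fully connected layers accept candidate boxes of any shape.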
1) Classifier
Outputs the probabilities of N+1 classes: N object classes plus one background class. After the softmax function, the probabilities sum to 1.
2) Bounding box regressor
Outputs the candidate bounding box regression parameters $(d_x, d_y, d_w, d_h)$ for each of the N+1 classes, for a total of (N+1)*4 output nodes.
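As a sketch of how the four regression parameters are used, the snippet below applies them to a candidate box with the standard R-CNN parameterization: $G_x = P_w d_x + P_x$, $G_y = P_h d_y + P_y$, $G_w = P_w e^{d_w}$, $G_h = P_h e^{d_h}$. The `(cx, cy, w, h)` box format and the name `decode_box` are assumptions made for illustration.

```python
import math

def decode_box(proposal, deltas):
    """Apply (dx, dy, dw, dh) to a proposal given as (center x, center y, width, height)."""
    px, py, pw, ph = proposal
    dx, dy, dw, dh = deltas
    gx = pw * dx + px          # shift the center, scaled by the proposal size
    gy = ph * dy + py
    gw = pw * math.exp(dw)     # rescale width and height in log space
    gh = ph * math.exp(dh)
    return gx, gy, gw, gh
```

With all four parameters equal to 0, the candidate box is returned unchanged.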

3) Loss function
The loss function is likewise split into a classification loss and a bounding box regression loss.

Classification loss: $L_{cls}(p,u) = -\log p_u$, where $p_u$ is the predicted probability of the true class $u$. This is the softmax cross-entropy loss.
Cross-entropy loss (Cross Entropy Loss)
- Multi-class classification (softmax output; all output probabilities sum to 1):
$H = -\sum_i o_i^* \log(o_i)$
- Binary classification (sigmoid output; the output nodes do not affect each other):
$H = -\frac{1}{N}\sum_{i=1}^N [o_i^* \log o_i + (1-o_i^*) \log(1-o_i)]$
Note: $o_i^*$ is the true label value and $o_i$ is the predicted value; $\log$ defaults to base $e$, i.e. $\ln$.
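The two cross-entropy forms above can be checked numerically with a short plain-Python sketch. The function names are illustrative; the multi-class version assumes a one-hot true label, in which case $H$ reduces to $-\log p_u$.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def multiclass_cross_entropy(probs, true_class):
    """H = -sum_i o*_i log(o_i) with a one-hot o*, i.e. -log of the true-class probability."""
    return -math.log(probs[true_class])

def binary_cross_entropy(preds, labels):
    """Mean binary cross entropy over N independent sigmoid outputs."""
    n = len(preds)
    return -sum(t * math.log(o) + (1 - t) * math.log(1 - o)
                for o, t in zip(preds, labels)) / n
```

For example, `multiclass_cross_entropy(softmax(scores), u)` gives $L_{cls}(p,u)$ for raw class scores `scores` and true class index `u`.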
Bounding box regression loss: $[u \ge 1]$ is the Iverson bracket, which equals 1 when $u \ge 1$ and 0 otherwise.
In other words, when the target is the background ($u = 0$), there is no bounding box loss.
The bounding box regression targets $v_i$ can be obtained by inverting the formulas for $G_i$.
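A minimal sketch of this part of the loss, following the Fast R-CNN paper: the regression targets are $v_x = (G_x - P_x)/P_w$, $v_y = (G_y - P_y)/P_h$, $v_w = \ln(G_w/P_w)$, $v_h = \ln(G_h/P_h)$, and the localization loss is $[u \ge 1]\sum_i \mathrm{smooth}_{L1}(t_i^u - v_i)$. The `(cx, cy, w, h)` box format and the function names are assumptions for illustration.

```python
import math

def encode_box(proposal, gt):
    """Regression targets v that map a proposal onto its ground-truth box.

    Both boxes are (center x, center y, width, height).
    """
    px, py, pw, ph = proposal
    gx, gy, gw, gh = gt
    return ((gx - px) / pw, (gy - py) / ph,
            math.log(gw / pw), math.log(gh / ph))

def smooth_l1(x):
    """Quadratic near zero, linear elsewhere."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5

def loc_loss(pred_deltas, target_deltas, u):
    """Bounding box loss; the Iverson bracket [u >= 1] switches it off for background."""
    if u < 1:
        return 0.0
    return sum(smooth_l1(t - v) for t, v in zip(pred_deltas, target_deltas))
```

Here `u` is the true class index with 0 reserved for background, and `pred_deltas` are the four regression outputs predicted for class `u`.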
4) Fast R-CNN framework

3.Faster R-CNN(RPN+Fast R-CNN)


Compared with Fast R-CNN, Faster R-CNN simply replaces the SS (Selective Search) candidate box generation algorithm with an RPN (Region Proposal Network).
1) RPN structure

Obtaining the feature map: the ZF convolutional network produces a 256-channel feature map, and the VGG16 convolutional network produces a 512-channel feature map.
For each position on the feature map, find the corresponding position in the original image (the feature map and the original image are simply matched proportionally; a small error is acceptable when the ratio is not an integer), and place K anchor boxes of specified sizes there (the sizes chosen by the author).
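Anchor generation can be sketched as follows. The concrete values are assumptions taken from the Faster R-CNN paper rather than from these notes: a stride of 16 between feature-map cells, three scales (128, 256, 512) and three aspect ratios (1:2, 1:1, 2:1), giving K = 9 anchors per position. The function name is illustrative.

```python
import math

def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return anchors as (x1, y1, x2, y2) in original-image pixels."""
    anchors = []
    for fy in range(feat_h):
        for fx in range(feat_w):
            # Center of this feature-map cell projected onto the original image
            cx = (fx + 0.5) * stride
            cy = (fy + 0.5) * stride
            for scale in scales:
                for ratio in ratios:
                    # ratio = height / width; the area stays scale ** 2
                    w = scale / math.sqrt(ratio)
                    h = scale * math.sqrt(ratio)
                    anchors.append((cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2))
    return anchors
```

For a feature map of about 60*40 cells this gives 60 * 40 * 9 = 21600 anchors, which is where the figure of roughly 20,000 anchor boxes comes from.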

The steps above produce nearly 20,000 anchor boxes. Removing those that cross the image boundary leaves about 6k. Many of the remaining boxes overlap heavily, so non-maximum suppression (NMS) is applied with an IoU threshold of 0.7, which leaves about 2k candidate boxes per image.
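The two filtering steps can be sketched in plain Python. This is a simple greedy NMS written for illustration, not an optimized implementation; it assumes boxes in `(x1, y1, x2, y2)` format and that each box comes with a score (in the RPN, the object probability from the cls layer). The function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def inside_image(box, img_w, img_h):
    """True if the box does not cross the image boundary."""
    x1, y1, x2, y2 = box
    return x1 >= 0 and y1 >= 0 and x2 <= img_w and y2 <= img_h

def nms(boxes, scores, iou_thresh=0.7):
    """Greedy NMS: keep the best-scoring box, drop boxes overlapping it too much, repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep box i only if it does not overlap any already kept box above the threshold
        if all(iou(boxes[i], boxes[k]) <= iou_thresh for k in keep):
            keep.append(i)
    return keep
```

`nms` returns the indices of the kept boxes in descending score order, so the top 2k entries can be taken directly as candidate boxes.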
The cls layer generates 2K class scores per position, where K is the number of anchor boxes per position: a background probability and an object probability for each anchor. This assumes softmax multi-class classification; with sigmoid binary classification there would be K class scores. The reg layer generates 4K bounding box regression parameters.

About positive and negative samples: an anchor whose IoU with a ground-truth box is greater than 0.7, or which has the highest IoU with a ground-truth box, is a positive sample; an anchor whose IoU is less than 0.3 is a negative sample.
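The labeling rule can be sketched as below. The function works on a precomputed IoU matrix (`iou_matrix[i][j]` is the IoU between anchor `i` and ground-truth box `j`); the name `label_anchors` and the use of -1 for anchors that are neither positive nor negative (and are ignored in training) are illustrative conventions.

```python
def label_anchors(iou_matrix, pos_thresh=0.7, neg_thresh=0.3):
    """Return one label per anchor: 1 = positive, 0 = negative, -1 = ignored."""
    labels = []
    for row in iou_matrix:
        best = max(row, default=0.0)
        if best > pos_thresh:
            labels.append(1)
        elif best < neg_thresh:
            labels.append(0)
        else:
            labels.append(-1)
    # Second rule: for every ground-truth box, the anchor with the highest IoU
    # is positive even if that IoU is below pos_thresh
    num_gt = len(iou_matrix[0]) if iou_matrix else 0
    for j in range(num_gt):
        best_i = max(range(len(iou_matrix)), key=lambda i: iou_matrix[i][j])
        if iou_matrix[best_i][j] > 0:
            labels[best_i] = 1
    return labels
```

The second rule ensures that every ground-truth box gets at least one positive anchor, even when no anchor reaches an IoU of 0.7 with it.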
2) RPN loss function

The classification loss is computed with cross entropy (multi-class or binary, depending on which form of classification is used), and the bounding box regression loss is computed in the same way as the Fast R-CNN bounding box loss. $\lambda$ is set to 10 in the paper, which makes the normalization coefficients of the two terms roughly equal, so they can be approximately merged. The loss function of the Fast R-CNN network that follows is the same as before. The paper trains the two networks separately; in practice, the RPN loss and the Fast R-CNN loss are usually added together and trained jointly.
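For reference, the RPN loss as written in the Faster R-CNN paper is reproduced below; the symbols follow the paper, not these notes. $p_i$ is the predicted object probability of anchor $i$, $p_i^*$ is its label (1 for a positive anchor, 0 for a negative one), and $t_i$, $t_i^*$ are the predicted and target regression parameters.

$$
L(\{p_i\},\{t_i\}) = \frac{1}{N_{cls}}\sum_i L_{cls}(p_i, p_i^*) + \lambda \frac{1}{N_{reg}}\sum_i p_i^* L_{reg}(t_i, t_i^*)
$$

In the paper $N_{cls} = 256$ (the mini-batch size) and $N_{reg} \approx 2400$ (the number of anchor positions), so with $\lambda = 10$ we get $\lambda / N_{reg} \approx 1/240 \approx 1/N_{cls}$. This is why the two coefficients can be treated as approximately the same.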
3) Faster R-CNN framework

4) Problems
- Small objects are detected poorly: the high-level features are very abstract, so the fine detail needed for small objects is lost.
- The model is large and detection is slow, mainly because prediction is done twice (once in the RPN and once in Fast R-CNN).