Yolov2 learning and summary
2022-07-03 06:25:00 【Happy breeder】
Preface: In the previous article, 《YOLOV1 Study and summarize》, I studied YOLOV1 and learned a lot, including why its recall is low and why its accuracy is low. YOLOV2 addresses several of YOLOV1's obvious problems in both accuracy and speed. This article walks through the paper with you.
1. Accuracy improvement
1.1 BN(Batch Normalization)
The author adds a BN operation to every convolutional layer (applied to the convolution output, before the activation). This raises YOLO's mAP by about 2% and also helps prevent overfitting during training.

1.2 High resolution training
Increasing the input resolution improves detection accuracy, but it also reduces speed.

1.3 Introducing the anchor mechanism
In YOLO, bounding boxes are predicted directly: after the feature extraction network, fully connected layers regress the box coordinates. Faster R-CNN instead uses an RPN that predicts boxes relative to anchor boxes. YOLOV2 uses anchor boxes to replace the fully connected layers.
In the high-resolution setting of section 1.2, the network input resolution is 448*448. To obtain a feature map with odd dimensions (so that a single cell sits at the center), the input resolution is changed to 416*416. Why 416*416? Counting the pooling layers in the feature extraction network, the total downsampling factor from input to output is 32, so the input size must be an integer multiple of 32. Since 416/32 = 13, the output feature map is 13*13.
With the anchor mechanism, the model predicts more than 1000 boxes, whereas YOLO predicts only 98. Where does 98 come from? As covered in the previous article 《YOLOV1 Study and summarize》, each grid cell predicts two boxes and the image is divided into a 7*7 grid, so the total is 7*7*2 = 98 boxes. The anchor mechanism raises recall from 81% to 88%, which is a substantial improvement.
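The arithmetic above can be checked with a short sketch. The helper names `feature_map_size` and `total_boxes` are hypothetical, and the anchor counts 5 and 9 are only the two values discussed in section 1.4, used here to show how the box count scales.

```python
def feature_map_size(input_size, stride=32):
    """Side length of the output feature map for a square input.

    stride is the total downsampling factor of the network (32 here).
    """
    if input_size % stride != 0:
        raise ValueError("input size must be a multiple of the stride")
    return input_size // stride


def total_boxes(grid, boxes_per_cell):
    """Number of boxes predicted on a grid x grid feature map."""
    return grid * grid * boxes_per_cell


print(feature_map_size(448))  # even grid: no single center cell
print(feature_map_size(416))  # odd grid: one cell at the center
print(total_boxes(7, 2))      # YOLOV1: 7*7 grid, 2 boxes per cell

# Box count on the 13*13 map for the anchor counts from section 1.4.
for k in (5, 9):
    print(k, total_boxes(feature_map_size(416), k))
```

An input of 448 gives an even 14*14 grid, while 416 gives the odd 13*13 grid that the text describes.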

1.4 Scale clustering
Using the anchor mechanism raises two questions: how should the anchors be chosen, and how many should there be? Choosing them by hand adds to the workload of model training. With enough iterations, default anchor values can still meet the needs of the task, but well-chosen anchor values improve both the efficiency and the accuracy of training. To compute the anchors automatically, the K-means clustering algorithm is run on the bounding boxes of the training set. After testing several values of K, K = 5 was found to give a good balance between model complexity and recall.
Several values were also tested when deciding the number of anchors. With 9 anchors the average IOU is clearly higher than with 5, but the model becomes more complex, so the YOLOV2 paper keeps the number of anchors at 5 as the trade-off.
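A minimal sketch of this clustering follows. It assumes the distance used in the YOLOV2 paper, d(box, centroid) = 1 - IOU(box, centroid), computed on widths and heights only. The function names and the toy box list are hypothetical, and a real run would use the (width, height) pairs of all training-set boxes.

```python
import random


def iou_wh(box, centroid):
    """IOU of two (w, h) boxes that share the same top-left corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """K-means on (w, h) pairs with distance 1 - IOU."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            # Smallest 1 - IOU is the same as largest IOU.
            best = max(range(k), key=lambda i: iou_wh(b, centroids[i]))
            clusters[best].append(b)
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                mean_w = sum(w for w, _ in members) / len(members)
                mean_h = sum(h for _, h in members) / len(members)
                new_centroids.append((mean_w, mean_h))
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


def avg_iou(boxes, centroids):
    """Mean IOU between each box and its best-matching centroid."""
    return sum(max(iou_wh(b, c) for c in centroids) for b in boxes) / len(boxes)


# Toy (w, h) data for illustration only.
toy_boxes = [(10, 12), (11, 13), (50, 60), (52, 58)]
for k in (1, 2):
    anchors = kmeans_anchors(toy_boxes, k)
    print(k, anchors, round(avg_iou(toy_boxes, anchors), 3))
```

Running the loop for several values of k and comparing `avg_iou` against k is how the trade-off between anchor count and average IOU can be inspected.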

1.5 Location prediction
The author first explains why box prediction in the RPN is unstable. In the RPN, the center coordinates of the predicted box are computed as follows:

$$x = (t_x \cdot w_a) + x_a, \qquad y = (t_y \cdot h_a) + y_a$$

Here (x, y) are the center coordinates of the predicted box, t_x and t_y are the horizontal and vertical offset parameters, w_a and h_a are the width and height of the anchor box, and (x_a, y_a) are the center coordinates of the anchor. The values of t_x and t_y are unbounded: with t_x = -1, for example, the predicted center shifts left by a full anchor width, so the box can end up anywhere in the image and even outside it. This can make the model unstable, so the offsets t_x and t_y need to be constrained.
In YOLOV2, an activation function limits the offset values to the range 0 to 1, which keeps the model stable. The bounding box in YOLOV2 is computed as follows:

$$
\begin{aligned}
b_x &= \sigma(t_x) + c_x \\
b_y &= \sigma(t_y) + c_y \\
b_w &= p_w e^{t_w} \\
b_h &= p_h e^{t_h}
\end{aligned}
$$

where $\sigma$ is the activation function (the logistic sigmoid), $c_x$ and $c_y$ are the offsets of the grid cell from the top-left corner of the whole image, $p_w$ and $p_h$ are the width and height of the anchor box, $b_w$ and $b_h$ are the width and height of the predicted box, and $b_x$ and $b_y$ are the center coordinates of the predicted box. The relationship between these parameters is illustrated in the accompanying figure.
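The decoding step can be sketched in a few lines. This assumes all quantities are measured in grid-cell units on the 13*13 feature map; the function name `decode_box` and the sample values are hypothetical.

```python
import math


def sigmoid(t):
    """Logistic function, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))


def decode_box(t_x, t_y, t_w, t_h, c_x, c_y, p_w, p_h):
    """Turn raw network outputs into a box (center x, center y, w, h).

    (c_x, c_y) is the top-left corner of the grid cell,
    (p_w, p_h) is the anchor size, all in grid-cell units.
    """
    b_x = sigmoid(t_x) + c_x   # center stays inside the cell
    b_y = sigmoid(t_y) + c_y
    b_w = p_w * math.exp(t_w)  # anchor size scaled by a positive factor
    b_h = p_h * math.exp(t_h)
    return b_x, b_y, b_w, b_h


# Zero offsets: the box sits at the cell center with the anchor's size.
print(decode_box(0.0, 0.0, 0.0, 0.0, c_x=6, c_y=6, p_w=3.0, p_h=4.0))

# Even a large raw offset cannot push the center out of cell (6, 6).
print(decode_box(20.0, -20.0, 0.5, -0.5, c_x=6, c_y=6, p_w=3.0, p_h=4.0))
```

Because the sigmoid output always lies between 0 and 1, each cell can only place box centers inside itself, which is the constraint that the unbounded RPN offsets lack.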

1.6 Multiscale training
With multi-scale training, the input size is changed after a fixed number of iterations during training: every 10 batches, a new input size is chosen at random. Because the input size must be a multiple of 32, it is chosen from (320, 352, 384, ..., 608). Low-resolution inputs give lower accuracy but faster training and inference, while high-resolution inputs give better accuracy. This mechanism lets YOLOV2 strike a better balance between accuracy and speed.
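The size schedule can be sketched as follows. This only shows how the input size is picked; the function name `size_schedule` is hypothetical, and resizing the images and running the network are left out.

```python
import random

# Candidate input sizes: multiples of 32 from 320 to 608.
SIZES = list(range(320, 608 + 1, 32))


def size_schedule(num_batches, period=10, seed=0):
    """Input size used for each batch, re-drawn every `period` batches."""
    rng = random.Random(seed)
    schedule = []
    size = None
    for batch in range(num_batches):
        if batch % period == 0:
            size = rng.choice(SIZES)
        schedule.append(size)
    return schedule


print(SIZES)
print(size_schedule(30))
```

Each run of 10 consecutive batches shares one size, so the network sees every resolution in the list over the course of training.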
2. Speed up
2.1 A feature extraction network based on Darknet-19
The network mostly uses 3*3 convolution kernels to extract features. The network architecture is as follows:

Compared with VGG16, a single forward pass needs only 5.58 billion floating-point operations, whereas VGG16 needs about 30 billion.