Yolov1 learning notes
2022-07-03 06:25:00 【Happy breeder】
Preface: I have recently read a lot of articles about YOLO. Many of them are genuinely good, and even a beginner can follow them. But after reading enough of them you find that understanding some details in depth still requires the original English paper, because every writer adds some subjective interpretation to what they write. So if you have time, read the original paper. Understanding how the algorithm works makes you more confident on engineering projects and improves R&D efficiency.
The YOLOv1 paper was presented at CVPR 2016 (it first appeared on arXiv in June 2015). The inventor and author was a graduate student at the time, and I have to marvel at his talent. He also open-sourced all of the code, which I admire. At first I thought the developer of Darknet and the designer of YOLO were different people; I have now learned that the framework and YOLO come from the same author. Not only is the algorithm good, the software design is also of a very high standard: the Darknet framework is written in pure C. Impressive.
Contents
1. Whole graph regression logic
2. Loss function
3. Some operating instructions
In the YOLOv1 paper, the author explains that YOLO improves on other object detectors mainly in the following three respects:
(1) Object detection is treated as a single regression problem;
This is the main reason YOLO is fast;
(2) YOLO takes the whole image as input when predicting, unlike sliding-window detectors and R-CNN;
This is the main reason detection accuracy improves, since the network sees the global context of the image;
(3) Good generalization;
To me this does not feel like a distinctive advantage; current Faster R-CNN and SSD should have it as well;
1. Whole graph regression logic
As the figure in the original YOLO paper shows, the author divides the whole image into an S*S grid. Each grid cell predicts B bounding boxes, each with its own confidence score, and each cell also predicts C class probabilities (one set per cell, shared by its boxes). The confidence that a predicted box contains an object can be expressed in the following form:
confidence = Pr(Object) * IOU(pred, truth)
IOU (intersection over union) is the ratio of the intersection to the union of the predicted box and the ground-truth box. If no object falls in the cell, Pr(Object) is 0 and so is the confidence.
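To make the IOU term concrete, here is a minimal Python sketch. It assumes boxes are given as corner coordinates (x_min, y_min, x_max, y_max); the function names `iou` and `confidence_target` are my own and do not come from the paper or from Darknet.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; clamp to 0 when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def confidence_target(object_in_cell, pred_box, true_box):
    """Pr(Object) * IOU, with Pr(Object) treated as 1 or 0 (hypothetical helper)."""
    return iou(pred_box, true_box) if object_in_cell else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: IOU = 1 / (4 + 4 - 1) = 1/7.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

Treating Pr(Object) as a hard 0 or 1 is a simplification for building a training target; at inference time the network outputs the confidence directly.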
In the original paper's formulation, regressing a whole image means predicting a tensor of size S*S*(B*5+C). Suppose the image is divided into a 7*7 grid and each grid cell predicts two bounding boxes. Each box needs 5 parameters: x, y, w, h and confidence. (x, y) are the center coordinates of the box, expressed relative to its grid cell rather than to the whole image, while the width and height are predicted relative to the whole image. With C = 20 classes (as in PASCAL VOC), each cell carries 2*5+20 = 30 values, so the final output tensor of the network is 7*7*30; the fully connected layers output the final class probabilities and box coordinates. As the architecture figure in the paper shows, the whole YOLO network has 24 convolutional layers followed by two fully connected layers.
In the paper the author also mentions that inference can be sped up further with Fast YOLO, which is what we later see in the Darknet framework as the yolo-tiny network. It has only 9 convolutional layers.
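The arithmetic and the coordinate convention above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: C = 20 classes, a 448*448 input image, and helper names (`output_size`, `decode_box`) that I made up; how the 30 values are ordered inside each cell is not covered here.

```python
S, B, C = 7, 2, 20  # grid size, boxes per cell, classes (C = 20 is an assumption)


def output_size(s=S, b=B, c=C):
    """Number of values in the S*S*(B*5+C) output tensor."""
    return s * s * (b * 5 + c)


def decode_box(row, col, x, y, w, h, img_w, img_h, s=S):
    """Convert one prediction to pixel corners (x_min, y_min, x_max, y_max).

    x, y are offsets inside cell (row, col), in [0, 1];
    w, h are fractions of the whole image width and height.
    """
    cx = (col + x) / s * img_w  # box center in pixels
    cy = (row + y) / s * img_h
    bw = w * img_w              # box size in pixels
    bh = h * img_h
    return (cx - bw / 2, cy - bh / 2, cx + bw / 2, cy + bh / 2)


print(output_size())  # 7 * 7 * 30 = 1470
# A box centered in the middle cell, covering half the image in each dimension.
print(decode_box(3, 3, 0.5, 0.5, 0.5, 0.5, 448, 448))  # (112.0, 112.0, 336.0, 336.0)
```

Keeping x and y relative to the cell bounds them to [0, 1] regardless of where the object sits in the image, which is the point of using the grid cell as the reference.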
2. Loss function
In designing the loss function, the author sets different weights depending on whether an object is present in a grid cell. When an object is in the cell, the box coordinate loss gets a large weight, set to 5; when no object is in the cell, the confidence loss gets a small weight, set to 0.5. This helps keep the model stable. In most images the majority of cells contain no object, and their target confidence is 0. If both cases were weighted equally, the loss from these empty cells would overwhelm the loss from the cells that do contain objects, which can easily make training unstable.
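Here, the two weights are λ_coord = 5 (coordinate loss) and λ_noobj = 0.5 (confidence loss of cells without an object).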
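For reference, the sum-squared-error loss from the YOLOv1 paper can be written as follows, as far as I can reconstruct it. The indicator 1_{ij}^{obj} is 1 when box j of cell i is responsible for an object, 1_{ij}^{noobj} is 1 when it is not, and 1_{i}^{obj} is 1 when an object appears in cell i; hatted symbols are the network's predictions.

```latex
\begin{aligned}
L ={}& \lambda_{coord} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj}
       \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] \\
  &+ \lambda_{coord} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj}
       \left[ (\sqrt{w_i} - \sqrt{\hat{w}_i})^2 + (\sqrt{h_i} - \sqrt{\hat{h}_i})^2 \right] \\
  &+ \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} (C_i - \hat{C}_i)^2 \\
  &+ \lambda_{noobj} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{noobj} (C_i - \hat{C}_i)^2 \\
  &+ \sum_{i=0}^{S^2} \mathbb{1}_{i}^{obj} \sum_{c \in classes} (p_i(c) - \hat{p}_i(c))^2
\end{aligned}
```

The first two lines are the coordinate terms weighted by λ_coord = 5, and the fourth line is the no-object confidence term weighted by λ_noobj = 0.5. The square roots on w and h make an error of the same size count for more on a small box than on a large one.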
3. Some operating instructions
3.1 Pre-training model:
Prior work has shown that adding convolutional layers and fully connected layers to a pre-trained network can improve model performance. The author pre-trains the first 20 convolutional layers on ImageNet; the four convolutional layers and two fully connected layers that follow are initialized randomly.
3.2 Leaky ReLU activation function
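The paper uses a leaky rectified linear activation for every layer except the final one, which uses a linear activation: phi(x) = x if x > 0, otherwise 0.1x. A minimal Python sketch follows; the function name and the `slope` parameter are my own, not taken from the Darknet source.

```python
def leaky_relu(x, slope=0.1):
    """Leaky ReLU: pass positive inputs through, scale the rest by `slope`."""
    return x if x > 0 else slope * x


# Positive inputs are unchanged; negative inputs are scaled down, not zeroed.
for value in (2.0, 0.0, -2.0):
    print(value, leaky_relu(value))
```

Unlike a plain ReLU, the small negative slope keeps a nonzero gradient for negative inputs, so units are less likely to stop learning.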
That's all for today's study. I will update this post when I learn something new.
A parting mutter: if you can't describe something in simple language, you don't really understand it.