YOLOv1 learning notes
2022-07-03 06:25:00 【Happy breeder】
Preface: Recently I have read a lot of articles about YOLO. Many of them are really good, and even a beginner can follow what they say. But after reading enough of them, you find that to understand some points in depth you still have to read the original English paper, because people add their own subjective understanding, more or less, to what they write. So if you have time, it is best to read the original paper. Understanding the principles of the algorithm makes you more confident in engineering projects and more efficient in R&D.
The YOLOv1 paper was published at CVPR 2016 (the first version appeared on arXiv in June 2015). Its first author was a graduate student at the time, and I have to admire his talent. He also open-sourced all of the code, which I respect as well. At first I thought the developer of darknet and the designer of YOLO were different people; now I know that the framework and YOLO come from the same author. Not only is the algorithm good, the software design is also at a high level: the darknet framework is written in C. Impressive.
Contents
1. Whole-image regression logic
2. Loss function
3. Some implementation notes
In the YOLOv1 paper, the author says that YOLO's advantages over other object detectors come down to the following three points:
(1) Object detection is framed as a single regression problem;
This is the main reason YOLO is fast;
(2) YOLO takes the whole image as input when predicting, unlike sliding-window detectors and R-CNN;
This is the main reason for the improved detection accuracy: with the whole image as context, the paper reports fewer background errors;
(3) Good generalization;
To me this does not feel like a distinctive advantage; current detectors such as Faster R-CNN and SSD should have it too;
1. Whole-image regression logic

As illustrated in the original YOLO paper, the author divides the whole image into an S × S grid. Each grid cell predicts B bounding boxes, and each grid cell also predicts one set of C conditional class probabilities, shared by its boxes. The confidence that a predicted box contains an object can then be written as:
\[ \text{confidence} = \Pr(\text{Object}) \times \text{IOU}^{\text{truth}}_{\text{pred}} \]
IOU (intersection over union) is the area of overlap between the predicted box and the ground-truth box divided by the area of their union.
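The confidence definition above can be sketched in a few lines of Python. This is only an illustration, not code from darknet: the function names `iou` and `box_confidence` and the corner-based box layout `(x_min, y_min, x_max, y_max)` are assumptions made for the example.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes.

    Boxes are (x_min, y_min, x_max, y_max); this layout is an assumption
    chosen for the example, not the format used inside YOLO.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def box_confidence(pr_object, predicted_box, truth_box):
    """Confidence = Pr(Object) * IOU between prediction and ground truth."""
    return pr_object * iou(predicted_box, truth_box)
```

For example, two 2 × 2 boxes offset by one unit in each direction overlap in a 1 × 1 square, so their IOU is 1 / (4 + 4 - 1) = 1/7.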

In the paper's formulation, the predictions for a whole image are encoded as an S × S × (B × 5 + C) tensor. Suppose the image is divided into a 7 × 7 grid and each grid cell predicts two boxes. Each box needs five parameters: x, y, w, h, and confidence. (x, y) is the center of the box, expressed relative to its grid cell rather than to the whole image, while the width and height are predicted relative to the whole image. With the 20 classes of PASCAL VOC, each cell therefore outputs 2 × 5 + 20 = 30 values, and the output tensor of the whole network is 7 × 7 × 30. The final class probabilities and box coordinates are produced by the last fully connected layer. The whole YOLO network, shown in the architecture figure of the paper, has 24 convolutional layers followed by two fully connected layers.
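To make the tensor size and the two coordinate conventions concrete, here is a small Python sketch. The helper names `output_shape` and `decode_box`, the `(row, col)` cell indexing, and the 448 × 448 image size in the usage note are assumptions for illustration; they are not taken from the darknet code.

```python
def output_shape(S, B, C):
    """Shape of the prediction tensor: S x S cells, each with B*5 + C values."""
    return (S, S, B * 5 + C)


def decode_box(row, col, x, y, w, h, S, img_w, img_h):
    """Convert one predicted box to pixel corners (x_min, y_min, x_max, y_max).

    x, y are offsets of the box center inside grid cell (row, col), in [0, 1].
    w, h are fractions of the whole image width and height.
    """
    # Center: cell index plus offset, scaled from grid units to pixels.
    center_x = (col + x) / S * img_w
    center_y = (row + y) / S * img_h

    # Size: already relative to the whole image.
    box_w = w * img_w
    box_h = h * img_h

    return (center_x - box_w / 2, center_y - box_h / 2,
            center_x + box_w / 2, center_y + box_h / 2)
```

With S = 7, B = 2, and C = 20, `output_shape` gives (7, 7, 30). A box centered in the middle of cell (3, 3) with w = h = 0.5 on a 448 × 448 image decodes to the corners (112, 112, 336, 336).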

In the paper, the author also mentions that inference can be made even faster with a smaller variant, Fast YOLO, which uses only 9 convolutional layers. This is essentially what later shows up in the darknet framework as the yolo-tiny network.
2. Loss function
In the loss function, the author assigns different weights depending on whether a grid cell contains an object. The coordinate loss of boxes that are responsible for an object gets a large weight, set to 5, while the confidence loss of boxes that contain no object gets a small weight, set to 0.5. This helps keep training stable. Most grid cells in an image contain no object, and their target confidence is 0. If both cases were weighted equally, the gradient from those empty cells would overwhelm the gradient from the cells that do contain objects, which can easily make the model unstable.

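The loss given in the paper is:
\[
\begin{aligned}
\mathcal{L} ={} & \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{obj}} \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] \\
& + \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{obj}} \left[ \left(\sqrt{w_i} - \sqrt{\hat{w}_i}\right)^2 + \left(\sqrt{h_i} - \sqrt{\hat{h}_i}\right)^2 \right] \\
& + \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{obj}} \left( C_i - \hat{C}_i \right)^2 \\
& + \lambda_{\text{noobj}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{noobj}} \left( C_i - \hat{C}_i \right)^2 \\
& + \sum_{i=0}^{S^2} \mathbb{1}_{i}^{\text{obj}} \sum_{c \in \text{classes}} \left( p_i(c) - \hat{p}_i(c) \right)^2
\end{aligned}
\]
where $\lambda_{\text{coord}} = 5$ and $\lambda_{\text{noobj}} = 0.5$.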
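The weighting described above can be sketched for a single grid cell in plain Python. This is a simplified illustration under stated assumptions, not the darknet implementation: the function name `cell_loss`, the tuple layout `(x, y, w, h, confidence)`, and the `responsible` argument are hypothetical. In the paper the responsible box is the one with the highest IOU with the ground truth; here the caller passes its index directly. The sketch also assumes non-negative widths and heights, and treats every non-responsible box as a "no object" box.

```python
import math

LAMBDA_COORD = 5.0   # weight for coordinate loss of responsible boxes
LAMBDA_NOOBJ = 0.5   # weight for confidence loss of boxes without an object


def cell_loss(pred_boxes, pred_classes, target_box=None,
              target_classes=None, responsible=0):
    """Sum-squared-error loss for one grid cell.

    pred_boxes:     list of B tuples (x, y, w, h, confidence)
    pred_classes:   list of C predicted class probabilities
    target_box:     (x, y, w, h, confidence), or None if the cell is empty
    target_classes: list of C target class probabilities (when not empty)
    responsible:    index of the box responsible for the object (assumption:
                    chosen by the caller instead of by highest IOU)
    """
    loss = 0.0
    for j, (x, y, w, h, conf) in enumerate(pred_boxes):
        if target_box is not None and j == responsible:
            tx, ty, tw, th, tconf = target_box
            # Center and size terms, both weighted by LAMBDA_COORD.
            loss += LAMBDA_COORD * ((x - tx) ** 2 + (y - ty) ** 2)
            loss += LAMBDA_COORD * ((math.sqrt(w) - math.sqrt(tw)) ** 2
                                    + (math.sqrt(h) - math.sqrt(th)) ** 2)
            # Confidence term for the responsible box, weight 1.
            loss += (conf - tconf) ** 2
        else:
            # Box without an object: target confidence is 0, small weight.
            loss += LAMBDA_NOOBJ * conf ** 2

    if target_box is not None:
        # Class probabilities only count when the cell contains an object.
        loss += sum((p - t) ** 2 for p, t in zip(pred_classes, target_classes))
    return loss
```

An empty cell whose two boxes both predict confidence 0.5 contributes 0.5 × (0.25 + 0.25) = 0.25, whereas the same confidence error on a responsible box would count at full weight. That is the down-weighting of empty cells that the text describes.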
3. Some implementation notes
3.1 Pretrained model:
Earlier work has shown that adding convolutional layers and fully connected layers to a pretrained network can improve the performance of the model. The author uses the first 20 convolutional layers to train the pretrained model; the following four convolutional layers and two fully connected layers are initialized randomly.
3.2 Leaky activation function
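As a minimal sketch, the leaky rectified linear activation described in the YOLOv1 paper passes positive inputs through unchanged and scales the rest by 0.1. The function name `leaky_relu` and the `negative_slope` parameter are names chosen for this example.

```python
def leaky_relu(x, negative_slope=0.1):
    """Leaky ReLU: x for x > 0, otherwise negative_slope * x."""
    return x if x > 0 else negative_slope * x


def leaky_relu_layer(values, negative_slope=0.1):
    """Apply the activation element-wise to a list of numbers."""
    return [leaky_relu(v, negative_slope) for v in values]
```

Unlike a plain ReLU, negative inputs keep a small non-zero output and therefore a non-zero gradient.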

Well, that's all for today's study. I will update this post when I learn something new.
Muttering to myself: if you can't describe something in simple language, you don't really understand it.