[Object detection] YOLOv5, the heavyweight of object detection (detailed principles + training guide)
2022-07-01 07:08:00 【Wei Baohang】
1. YOLO input stage

1.1 Mosaic data augmentation
YOLOv5 uses the same Mosaic data augmentation at its input stage as YOLOv4.
The author who proposed Mosaic data augmentation is also a member of the YOLOv5 team. Mosaic splices images together using random scaling, random cropping, and random arrangement, and it works well for detecting small objects.
- Splicing of 4 images
- Random scaling
- Random cropping
- Random arrangement
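Advantages of the algorithm:
- Enriches the dataset
- Reduces GPU computation: each mosaic sample already carries data from four images, so the mini-batch does not need to be large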
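The four operations listed above can be sketched in a few lines of NumPy. This is a minimal illustration, not the YOLOv5 implementation: the function names `resize_nearest` and `mosaic`, the 0.5 to 1.5 scale range, and the choice of split point are all assumptions made for the example, and remapping the bounding-box labels onto the new canvas is left out.

```python
import numpy as np


def resize_nearest(img, new_h, new_w):
    """Nearest-neighbour resize using plain NumPy indexing (hypothetical helper)."""
    h, w = img.shape[:2]
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    return img[rows[:, None], cols]


def mosaic(images, size=640, rng=None):
    """Splice 4 HxWx3 uint8 images into one size x size image (labels omitted)."""
    if len(images) != 4:
        raise ValueError("mosaic expects exactly 4 images")
    rng = np.random.default_rng() if rng is None else rng
    canvas = np.zeros((size, size, 3), dtype=np.uint8)

    # Random split point: it decides how much room each of the 4 tiles gets.
    cx = int(rng.integers(size // 4, 3 * size // 4))
    cy = int(rng.integers(size // 4, 3 * size // 4))
    quadrants = [(0, 0, cx, cy), (cx, 0, size, cy),
                 (0, cy, cx, size), (cx, cy, size, size)]

    order = rng.permutation(4)  # random arrangement of the 4 images
    for idx, (x1, y1, x2, y2) in zip(order, quadrants):
        img = images[idx]
        qh, qw = y2 - y1, x2 - x1
        h, w = img.shape[:2]

        # Random scaling (assumed range), never smaller than the tile itself.
        # Simplification: the aspect ratio may be distorted for small images.
        scale = rng.uniform(0.5, 1.5)
        new_h, new_w = max(qh, int(h * scale)), max(qw, int(w * scale))
        img = resize_nearest(img, new_h, new_w)

        # Random cropping: take a tile-sized window at a random offset.
        top = int(rng.integers(0, new_h - qh + 1))
        left = int(rng.integers(0, new_w - qw + 1))
        canvas[y1:y2, x1:x2] = img[top:top + qh, left:left + qw]
    return canvas
```

Calling `mosaic([img_a, img_b, img_c, img_d], size=640)` returns a single 640 x 640 training image. A real pipeline would also shift, scale, and clip each image's boxes by the same offsets.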
1.2 Adaptive anchor box calculation
In the YOLO algorithms, anchor boxes with preset initial widths and heights are defined for each dataset.
During training, the network outputs predicted boxes based on the initial anchor boxes, compares them with the ground-truth boxes, computes the difference between the two, and then backpropagates to update the network parameters iteratively.
Step1: Read the width and height (w, h) of every image in the training set, and the w, h of every bounding box
Step2: Convert the coordinates that were read to absolute coordinates
Step3: Use the K-means algorithm to cluster all bounding boxes in the training set and obtain k anchors
Step4: Mutate the resulting anchors with a genetic algorithm; keep a mutation if it improves the result, otherwise skip it
Step5: Return the best anchors found, sorted by area
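The five steps can be sketched as below. This is a simplified illustration rather than YOLOv5's own routine: the width-height IoU fitness, the Gaussian mutation, and the names `to_absolute`, `kmeans_anchors`, and `evolve_anchors` are assumptions chosen for the example.

```python
import numpy as np


def to_absolute(norm_wh, image_wh):
    """Step 2: scale normalized box (w, h) by each image's (w, h) in pixels."""
    return np.asarray(norm_wh, dtype=float) * np.asarray(image_wh, dtype=float)


def wh_iou(wh, anchors):
    """IoU between boxes and anchors when both share the same corner."""
    inter = (np.minimum(wh[:, None, 0], anchors[None, :, 0])
             * np.minimum(wh[:, None, 1], anchors[None, :, 1]))
    union = wh[:, 0:1] * wh[:, 1:2] + (anchors[:, 0] * anchors[:, 1])[None, :] - inter
    return inter / union


def fitness(wh, anchors):
    """Assumed fitness: mean IoU of every box with its best-matching anchor."""
    return float(wh_iou(wh, anchors).max(axis=1).mean())


def kmeans_anchors(wh, k, iters=50, rng=None):
    """Step 3: k-means over box sizes, assigning each box to its highest-IoU anchor."""
    rng = np.random.default_rng() if rng is None else rng
    anchors = wh[rng.choice(len(wh), k, replace=False)].astype(float)
    for _ in range(iters):
        assign = wh_iou(wh, anchors).argmax(axis=1)
        for j in range(k):
            members = wh[assign == j]
            if len(members):  # leave an anchor unchanged if no box chose it
                anchors[j] = members.mean(axis=0)
    return anchors


def evolve_anchors(wh, anchors, generations=200, sigma=0.1, rng=None):
    """Steps 4-5: mutate, keep only improvements, return anchors sorted by area."""
    rng = np.random.default_rng() if rng is None else rng
    best, best_fit = anchors.copy(), fitness(wh, anchors)
    for _ in range(generations):
        factor = np.clip(rng.normal(1.0, sigma, best.shape), 0.5, 1.5)
        mutated = best * factor
        score = fitness(wh, mutated)
        if score > best_fit:  # a good mutation is kept, a bad one is skipped
            best, best_fit = mutated, score
    return best[np.argsort(best.prod(axis=1))]
```

With `wh = to_absolute(norm_wh, image_wh)` built from the label files, `evolve_anchors(wh, kmeans_anchors(wh, k=9))` yields nine anchors ordered from smallest to largest area. The value k=9 is only an example.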
1.3 Adaptive image scaling
In common object detection algorithms, input images differ in width and height, so the usual approach is to scale every original image to one standard size before sending it to the detection network.
The letterbox adaptive image scaling technique preserves the aspect ratio and pads the missing area with gray borders to reach the fixed size.
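A minimal letterbox sketch follows, assuming an H x W x C NumPy image, nearest-neighbour resizing, and a gray fill value of 114. The function name and return values are choices made for this example, not the library's API.

```python
import numpy as np


def letterbox(img, new_shape=(640, 640), color=114):
    """Resize while keeping the aspect ratio, then pad with gray to new_shape.

    Returns the padded image, the scale ratio, and the (left, top) padding,
    which are needed later to map predicted boxes back to the original image.
    """
    h, w = img.shape[:2]
    r = min(new_shape[0] / h, new_shape[1] / w)  # one ratio for both sides
    new_h, new_w = max(1, round(h * r)), max(1, round(w * r))

    # Nearest-neighbour resize with NumPy indexing (an assumption; real
    # pipelines usually use an interpolating resize from an image library).
    rows = np.minimum((np.arange(new_h) / r).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / r).astype(int), w - 1)
    resized = img[rows[:, None], cols]

    # Gray canvas of the target size, with the resized image centred on it.
    canvas = np.full(tuple(new_shape) + img.shape[2:], color, dtype=img.dtype)
    top = (new_shape[0] - new_h) // 2
    left = (new_shape[1] - new_w) // 2
    canvas[top:top + new_h, left:left + new_w] = resized
    return canvas, r, (left, top)
```

For a 100 x 200 image and a 64 x 64 target, the ratio is 0.32, the image becomes 32 x 64, and 16 rows of gray are added above and below.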
2. YOLO overall architecture

2.1 Backbone
Mainly performs feature extraction: a convolutional network extracts the object information in the image for use in later detection.
2.1.1 Focus module
The Focus layer works on a principle very similar to the PassThrough layer. It uses slicing to split a high-resolution image or feature map into several low-resolution ones, that is, interleaved sampling followed by concatenation.
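The slicing step can be sketched as below, assuming a channel-first (C, H, W) array with even height and width. The function name `focus_slice` is hypothetical, and the convolution that normally follows the slicing is not shown.

```python
import numpy as np


def focus_slice(x):
    """Interleaved sampling + concatenation: (C, H, W) -> (4C, H/2, W/2)."""
    return np.concatenate(
        [
            x[:, ::2, ::2],    # even rows, even columns
            x[:, 1::2, ::2],   # odd rows, even columns
            x[:, ::2, 1::2],   # even rows, odd columns
            x[:, 1::2, 1::2],  # odd rows, odd columns
        ],
        axis=0,
    )
```

No pixel is discarded: a 3 x 640 x 640 input becomes 12 x 320 x 320, so spatial detail is moved into the channel dimension rather than lost to downsampling.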
2.1.2 SPP module
Spatial pyramid pooling, which can convert a feature map of any size into a fixed-size feature vector.
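The fixed-length property can be seen in a short sketch of classic spatial pyramid pooling. The pyramid levels (1, 2, 4) and the function name are assumptions for this example. It illustrates the idea as described above and is not necessarily the exact block used inside YOLOv5.

```python
import numpy as np


def spatial_pyramid_pool(feature, levels=(1, 2, 4)):
    """Max-pool a (C, H, W) map over n x n grids and concatenate the results.

    The output length is C * sum(n * n for n in levels), whatever H and W are.
    """
    c, h, w = feature.shape
    pooled = []
    for n in levels:
        for i in range(n):
            # Bin edges: floor for the start, ceil for the end, so no bin is empty.
            y0, y1 = (i * h) // n, -(-((i + 1) * h) // n)
            for j in range(n):
                x0, x1 = (j * w) // n, -(-((j + 1) * w) // n)
                pooled.append(feature[:, y0:y1, x0:x1].max(axis=(1, 2)))
    return np.concatenate(pooled)
```

Feature maps of shape (8, 13, 17) and (8, 32, 20) both give a vector of length 8 x (1 + 4 + 16) = 168.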
2.1.3 CSP_X module
The backbone is a fairly deep network. Adding residual structures strengthens the gradient that is backpropagated between layers and avoids the vanishing gradients caused by greater depth, so finer-grained features can be extracted without worrying about network degradation.
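A toy calculation shows why the skip connection helps. It assumes every layer is a scalar map f(x) = w * x with the same small weight, which is a deliberate oversimplification: a plain layer contributes a factor w to the backpropagated gradient, while a residual layer y = x + f(x) contributes 1 + w.

```python
def chain_gradient(depth, w=0.1, residual=False):
    """Gradient of the output w.r.t. the input for a stack of toy scalar layers.

    Plain layer:    y = w * x      -> local gradient w
    Residual layer: y = x + w * x  -> local gradient 1 + w
    """
    grad = 1.0
    for _ in range(depth):
        grad *= (1.0 + w) if residual else w
    return grad
```

With 20 layers and w = 0.1, the plain stack gives a gradient of about 1e-20, while the residual stack stays above 1. The identity path keeps a usable gradient flowing to the early layers.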
2.2 Neck
Mixes and combines features to improve the robustness of the network and strengthen its object detection ability, then passes these features to the Head layer for prediction.
2.2.1 FPN

2.2.2 PAN

2.3 YOLO output stage
Mainly produces the final prediction output.
2.3.1 Bounding box loss function
Measures how well the predicted boxes output by the model agree with the ground-truth boxes; the result is used to optimize the model through backpropagation.
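The text does not say which loss is used, so the sketch below shows the simplest member of the family, a plain IoU loss (1 - IoU), for boxes given as (x1, y1, x2, y2). Variants such as GIoU or CIoU add penalty terms on top of this quantity; the function names here are chosen for the example.

```python
def box_iou(a, b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred, target):
    """0 when the boxes coincide, 1 when they do not overlap at all."""
    return 1.0 - box_iou(pred, target)
```

For the boxes (0, 0, 2, 2) and (1, 1, 3, 3), the intersection is 1 and the union is 7, so the loss is 1 - 1/7.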
2.3.2 NMS (non-maximum suppression)
Determines whether adjacent grid cells have detected the same object and eliminates the redundant detection boxes.
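A minimal greedy NMS sketch in NumPy is shown below, assuming boxes in (x1, y1, x2, y2) format and a single class. The IoU threshold of 0.5 is an illustrative default, not a value taken from the article.

```python
import numpy as np


def nms(boxes, scores, iou_thr=0.5):
    """Return the indices of the boxes kept, highest score first."""
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    order = scores.argsort()[::-1]  # candidates sorted by descending score
    keep = []
    while order.size:
        i, rest = order[0], order[1:]
        keep.append(int(i))

        # IoU between the current best box and every remaining candidate.
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        iou = inter / (areas[i] + areas[rest] - inter + 1e-9)

        # Drop candidates that overlap too much: they likely mark the same object.
        order = rest[iou <= iou_thr]
    return keep
```

Given two heavily overlapping boxes and one distant box, only the higher-scoring box of the overlapping pair and the distant box survive.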