[Object detection] YOLOv5, a leading object detector (detailed principles + training guide)
2022-07-01 07:08:00 【Wei Baohang】
1. YOLO input stage

1.1 Mosaic data augmentation
YOLOv5 uses the same Mosaic data augmentation at the input as YOLOv4.
The author who proposed Mosaic data augmentation is also a member of the YOLOv5 team. Mosaic stitches images together using random scaling, random cropping, and random arrangement, and it works well for detecting small objects.
- Four images are stitched together
- Random scaling
- Random cropping
- Random arrangement
Advantages of the algorithm:
- Enriches the dataset
- Reduces GPU computation
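The four operations listed above can be sketched in a few lines of NumPy. This is only an illustration, not the YOLOv5 implementation: the function names `resize_nearest` and `mosaic4`, the gray fill value 114, and the scale range are assumptions, the resize ignores the aspect ratio for brevity, and the matching transformation of the bounding-box labels is left out.

```python
import numpy as np

def resize_nearest(img, new_h, new_w):
    """Nearest-neighbour resize using index arithmetic only."""
    h, w = img.shape[:2]
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    return img[rows[:, None], cols]

def mosaic4(images, out_size=640, rng=None):
    """Stitch four HxWx3 uint8 images around a random centre point (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    canvas = np.full((out_size, out_size, 3), 114, dtype=np.uint8)
    # A random centre point decides how large each of the four tiles is.
    cx = int(rng.integers(out_size // 4, 3 * out_size // 4))
    cy = int(rng.integers(out_size // 4, 3 * out_size // 4))
    # Canvas regions as (x1, y1, x2, y2): top-left, top-right, bottom-left, bottom-right.
    regions = [(0, 0, cx, cy), (cx, 0, out_size, cy),
               (0, cy, cx, out_size), (cx, cy, out_size, out_size)]
    order = rng.permutation(4)  # random arrangement of the four images
    for k, (x1, y1, x2, y2) in zip(order, regions):
        rh, rw = y2 - y1, x2 - x1
        # Random scaling: make the image 1.0x to 1.5x the size of its region.
        scale = rng.uniform(1.0, 1.5)
        new_h, new_w = int(np.ceil(rh * scale)), int(np.ceil(rw * scale))
        img = resize_nearest(images[k], new_h, new_w)
        # Random cropping: cut a region-sized window out of the scaled image.
        oy = int(rng.integers(0, new_h - rh + 1))
        ox = int(rng.integers(0, new_w - rw + 1))
        canvas[y1:y2, x1:x2] = img[oy:oy + rh, ox:ox + rw]
    return canvas
```

Because one training sample now contains content from four images, objects tend to appear smaller and in more varied contexts, which is one plausible reason for the small-object benefit mentioned above.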
1.2 Adaptive anchor box calculation
In the YOLO algorithms, each dataset has a set of anchor boxes with initial widths and heights.
During training, the network outputs prediction boxes based on the initial anchor boxes, compares them with the ground-truth boxes, computes the difference between the two, and then backpropagates to update the network parameters iteratively.
Step 1: Read the width and height (w, h) of every image in the training set and of every bounding box.
Step 2: Convert the coordinates read to absolute coordinates.
Step 3: Cluster all bounding boxes in the training set with the k-means algorithm to obtain k anchors.
Step 4: Mutate the anchors with a genetic algorithm; keep a mutation if it improves the result, otherwise skip it.
Step 5: Return the best anchors found, sorted by area.
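Steps 3 to 5 can be sketched as follows. This is a simplified illustration under stated assumptions, not the YOLOv5 code: boxes are assigned to anchors by width/height IoU, the fitness is the mean best IoU, and the function names, mutation strength `sigma`, and generation count are all hypothetical choices.

```python
import numpy as np

def wh_iou(boxes, anchors):
    """IoU between (N, 2) box sizes and (K, 2) anchor sizes, aligned at a common corner."""
    inter = (np.minimum(boxes[:, None, 0], anchors[None, :, 0]) *
             np.minimum(boxes[:, None, 1], anchors[None, :, 1]))
    union = (boxes[:, 0:1] * boxes[:, 1:2] +
             (anchors[:, 0] * anchors[:, 1])[None, :] - inter)
    return inter / union

def fitness(boxes, anchors):
    """Mean IoU of every box with its best-matching anchor (assumed quality measure)."""
    return wh_iou(boxes, anchors).max(axis=1).mean()

def kmeans_anchors(boxes, k, iters=30, rng=None):
    """Step 3: cluster box sizes into k anchors."""
    rng = np.random.default_rng() if rng is None else rng
    anchors = boxes[rng.choice(len(boxes), k, replace=False)].copy()
    for _ in range(iters):
        assign = wh_iou(boxes, anchors).argmax(axis=1)
        for j in range(k):
            if np.any(assign == j):  # leave empty clusters unchanged
                anchors[j] = boxes[assign == j].mean(axis=0)
    return anchors

def evolve_anchors(boxes, anchors, generations=200, sigma=0.1, rng=None):
    """Steps 4 and 5: mutate, keep improvements, return anchors sorted by area."""
    rng = np.random.default_rng() if rng is None else rng
    best, best_fit = anchors.copy(), fitness(boxes, anchors)
    for _ in range(generations):
        mutated = np.clip(best * rng.normal(1.0, sigma, best.shape), 1e-3, None)
        fit = fitness(boxes, mutated)
        if fit > best_fit:  # keep the mutation only if it helps
            best, best_fit = mutated, fit
    return best[np.argsort(best.prod(axis=1))]
```

For Step 2, box sizes stored relative to the image (values in 0..1) would first be multiplied by the image width and height, for example `boxes = rel_wh * np.array([img_w, img_h])`, before being passed to `kmeans_anchors`.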
1.3 Adaptive image scaling
In common object detection algorithms, input images have different widths and heights, so the usual approach is to scale every original image to one standard size before sending it to the detection network.
The letterbox adaptive scaling technique preserves the aspect ratio as far as possible and pads the missing area with gray borders to reach the fixed size.
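A minimal letterbox sketch, assuming a square target size and a gray pad value of 114. The function name and return values are illustrative, and a nearest-neighbour resize stands in for the interpolation an image library would normally provide.

```python
import numpy as np

def letterbox(img, new_size=640, pad_value=114):
    """Resize while keeping the aspect ratio, then pad to new_size x new_size."""
    h, w = img.shape[:2]
    r = min(new_size / h, new_size / w)          # one scale factor for both axes
    new_h, new_w = int(round(h * r)), int(round(w * r))
    # Nearest-neighbour resize via index arithmetic.
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    resized = img[rows[:, None], cols]
    # Gray canvas; the resized image is centred on it.
    out = np.full((new_size, new_size) + img.shape[2:], pad_value, dtype=img.dtype)
    top = (new_size - new_h) // 2
    left = (new_size - new_w) // 2
    out[top:top + new_h, left:left + new_w] = resized
    return out, r, (left, top)
```

The returned scale `r` and offset `(left, top)` are what one would need to map predicted boxes back to the original image coordinates.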
2. YOLO overall architecture

2.1 Backbone
The backbone mainly performs feature extraction: a convolutional network extracts the object information in the image for use in the later detection stages.
2.1.1 Focus module
The Focus layer works on a principle very similar to the PassThrough layer. It uses slicing to split a high-resolution image or feature map into several low-resolution ones, that is, interlaced sampling followed by concatenation.
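The slicing step can be illustrated with NumPy strides. This sketch covers only the interlaced sampling and concatenation described above; any convolution applied afterwards is omitted, and the function name `focus_slice` is hypothetical.

```python
import numpy as np

def focus_slice(x):
    """(C, H, W) -> (4C, H/2, W/2): take every second pixel at four offsets and stack them."""
    return np.concatenate([
        x[:, 0::2, 0::2],  # even rows, even columns
        x[:, 1::2, 0::2],  # odd rows, even columns
        x[:, 0::2, 1::2],  # even rows, odd columns
        x[:, 1::2, 1::2],  # odd rows, odd columns
    ], axis=0)
```

No pixel is discarded: the spatial resolution is halved while the channel count is quadrupled, so a 3 x 640 x 640 input becomes 12 x 320 x 320.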
2.1.2 SPP module
Spatial pyramid pooling (SPP) can convert a feature map of any size into a fixed-size feature vector.
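The fixed-size property can be shown with a small NumPy sketch of pyramid pooling in the form the sentence above describes: the feature map is max-pooled into 1x1, 2x2, and 4x4 grids and the results are concatenated. The pyramid levels and the function name `spp` are assumptions for illustration; the SPP block inside a particular detector may be organized differently.

```python
import numpy as np

def spp(feature, levels=(1, 2, 4)):
    """(C, H, W) -> vector of length C * sum(n * n for n in levels), for any H and W."""
    c, h, w = feature.shape
    pooled = []
    for n in levels:
        ys = np.linspace(0, h, n + 1)  # bin edges along the height
        xs = np.linspace(0, w, n + 1)  # bin edges along the width
        for i in range(n):
            y0, y1 = int(np.floor(ys[i])), int(np.ceil(ys[i + 1]))
            for j in range(n):
                x0, x1 = int(np.floor(xs[j])), int(np.ceil(xs[j + 1]))
                # Max over each spatial bin, one value per channel.
                pooled.append(feature[:, y0:y1, x0:x1].max(axis=(1, 2)))
    return np.concatenate(pooled)
```

With 8 channels and levels (1, 2, 4), the output always has 8 x (1 + 4 + 16) = 168 values, whether the input is 13 x 17 or 32 x 20.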
2.1.3 CSP_X module
The backbone is a relatively deep network. Adding residual structures strengthens the gradient that flows back between layers and avoids the vanishing gradients that a deeper network would otherwise cause, so finer-grained features can be extracted without worrying about network degradation.
2.2 Neck
The neck mixes and combines features to make the network more robust and to strengthen its detection ability, then passes these features to the head for prediction.
2.2.1 FPN

2.2.2 PAN

2.3 YOLO output stage
This stage produces the final prediction output.
2.3.1 Bounding box loss function
The loss measures how closely the model's predicted box agrees with the ground-truth box, and it is backpropagated to optimize the model.
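As a concrete instance of measuring that agreement, the sketch below computes the intersection over union (IoU) of two boxes and uses 1 - IoU as the loss. Plain IoU is an assumption chosen as the simplest stand-in; detectors commonly use refined variants such as GIoU or CIoU, which add further terms. Boxes are assumed to be `(x1, y1, x2, y2)` tuples.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))  # overlap width
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))  # overlap height
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1]) +
             (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def iou_loss(pred, target):
    """0 for a perfect match, 1 when the boxes do not overlap at all."""
    return 1.0 - box_iou(pred, target)
```

For example, two 2 x 2 boxes offset by one unit in each direction overlap in a 1 x 1 square, giving an IoU of 1/7 and a loss of 6/7.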
2.3.2 NMS (non-maximum suppression)
NMS determines whether adjacent grid cells have detected the same object and removes the redundant detection boxes.
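A minimal greedy NMS sketch in NumPy, assuming boxes in `(x1, y1, x2, y2)` format with non-zero area and a single class. The function name and the default threshold of 0.5 are illustrative choices.

```python
import numpy as np

def nms(boxes, scores, iou_thres=0.5):
    """Return indices of the boxes kept, highest score first."""
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    order = scores.argsort()[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Intersection of the current best box with every remaining box.
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        iou = inter / (areas[i] + areas[rest] - inter)
        # Drop boxes that overlap the kept box too much; they likely show the same object.
        order = rest[iou <= iou_thres]
    return keep
```

With boxes `[[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]]` and scores `[0.9, 0.8, 0.7]`, the second box overlaps the first with an IoU of 81/119 and is suppressed, so indices 0 and 2 are kept.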