[design tutorial] YOLOv7 object detection network explained
2022-07-27 20:42:00 【DanCheng-studio】
0 Preface
Things are moving fast: YOLOv6 has barely settled in and YOLOv7 is already here. If you want to use the latest technology in your graduation project, this article is for you. It gives a brief walkthrough of YOLOv7, with the aim of building a basic understanding of it.
From YOLOv1 in 2015, YOLOv2 in 2016, and YOLOv3 in 2018, to YOLOv4 and YOLOv5 in 2020, and the recently released YOLOv6 and YOLOv7, the YOLO series has witnessed the evolution of object detection in the deep learning era. For YOLO basics and YOLOv1 through YOLOv5, see Da Bai's YOLO series. This article focuses on the network structure of YOLOv7, to give an intuitive picture of it.
1 YOLOv7 overall structure

Let's start with YOLOv7 as a whole. The input image is first resized to 640x640 and fed into the backbone network. The head then outputs feature maps at three different sizes, which pass through Rep and conv layers to produce the predictions. Taking COCO as an example, there are 80 classes, and each prediction also carries (x, y, w, h, o), that is, the box coordinates plus an objectness (foreground vs. background) score. With 3 anchors per location, each output layer has (80 + 5) x 3 = 255 channels, and multiplying that by the feature map size gives the final output size.
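The output-size arithmetic above can be checked with a small sketch. It assumes the three output scales correspond to strides of 8, 16, and 32, which matches the 80/40/20 feature map sizes quoted later for a 640x640 input; the function name `head_output_shapes` is made up for illustration and is not part of the YOLOv7 code.

```python
def head_output_shapes(img_size=640, num_classes=80, num_anchors=3,
                       strides=(8, 16, 32)):
    """Return (channels, height, width) for each detection scale.

    The strides are an assumption chosen to match an 80/40/20 grid
    for a 640x640 input.
    """
    # 5 extra values per anchor: x, y, w, h and the objectness score o
    channels = (num_classes + 5) * num_anchors
    return [(channels, img_size // s, img_size // s) for s in strides]


if __name__ == "__main__":
    for stride, shape in zip((8, 16, 32), head_output_shapes()):
        print(f"stride {stride}: channels x H x W = {shape}")
```

Changing `num_classes` shows how the channel count follows a custom dataset, for example 2 classes would give (2 + 5) x 3 = 21 channels per scale.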
2 Key points - backbone
The YOLOv7 backbone is shown in the figure below.

The backbone has 51 layers in total, indexed from 0 to 50; I marked the key layers with black numbers in the figure above. It starts with 4 convolution layers, shown in the figure below. Each one is a CBS block, made up of Conv + BN + SiLU. I use different colors to represent different kernel sizes and strides; for example, (3, 2) means a kernel size of 3 and a stride of 2. The corresponding settings in the config file are shown in the figure.

After the 4 CBS layers, the feature map is 160 x 160 x 128. Next comes the ELAN module proposed in the paper. ELAN is built from several CBS layers and leaves the spatial size of the feature map unchanged. The number of channels changes in the first two CBS layers; the layers after that keep their input and output channels equal, and the last CBS outputs the required number of channels.
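To see how four CBS layers turn a 640 x 640 input into 160 x 160 x 128, the sizes can be traced with the standard convolution output formula. This is a sketch under assumptions: the per-layer (channels, kernel, stride) values in `STEM` and the padding of `kernel // 2` are not stated in the text (the config figure is missing here), and were picked to be consistent with the 160 x 160 x 128 result.

```python
def conv_out(size, kernel, stride):
    """Spatial output size of a convolution padded with kernel // 2."""
    padding = kernel // 2
    return (size + 2 * padding - kernel) // stride + 1


# (out_channels, kernel, stride) for the four stem CBS layers.
# Assumed values, chosen to reproduce the 160 x 160 x 128 output.
STEM = [(32, 3, 1), (64, 3, 2), (64, 3, 1), (128, 3, 2)]


def trace_stem(size=640, layers=STEM):
    """Return the (height, width, channels) shape after each CBS layer."""
    shapes = []
    for out_channels, kernel, stride in layers:
        size = conv_out(size, kernel, stride)
        shapes.append((size, size, out_channels))
    return shapes


if __name__ == "__main__":
    for index, shape in enumerate(trace_stem()):
        print(f"CBS {index}: {shape}")
```

Only the two stride-2 layers shrink the feature map, so the stem downsamples by 4x overall.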



The MP layer is split mainly into a Maxpool part and a CBS part. MP1 and MP2 differ mainly in their channel ratio.

That covers the basic building blocks of the backbone. Looking at the backbone as a whole: after the 4 CBS layers comes one ELAN, followed by three MP + ELAN stages whose outputs correspond to C3/C4/C5, with sizes 80 x 80 x 512, 40 x 40 x 1024, and 20 x 20 x 1024. Each MP has 5 layers and each ELAN has 8, so the backbone has 4 + 8 + 13 x 3 = 51 layers in total. Counting from 0, the last layer is layer 50.
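The layer count can be reproduced by laying the blocks out in order and assigning 0-based indices. This is only a bookkeeping sketch built from the numbers in the text (4 CBS layers, 8 layers per ELAN, 5 layers per MP); the block labels and the function name are illustrative.

```python
STEM_LAYERS = 4  # four CBS layers at the start
ELAN_LAYERS = 8  # layers per ELAN block
MP_LAYERS = 5    # layers per MP block


def backbone_layer_ranges():
    """Return [(block_name, first_index, last_index)] using 0-based indices."""
    blocks = [("stem", STEM_LAYERS), ("ELAN", ELAN_LAYERS)]
    blocks += [("MP", MP_LAYERS), ("ELAN", ELAN_LAYERS)] * 3
    ranges = []
    start = 0
    for name, count in blocks:
        ranges.append((name, start, start + count - 1))
        start += count
    return ranges


if __name__ == "__main__":
    for name, first, last in backbone_layer_ranges():
        print(f"{name:4s} layers {first}-{last}")
```

Under this layout the last three ELAN blocks end at layers 24, 37, and 50, which is where the C3, C4, and C5 outputs would be taken.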
Key points - head



The YOLOv7 head is really just a PAFPN structure, the same as in YOLOv4 and YOLOv5. First, the backbone's last output, the 32x downsampled feature map C5, passes through SPPCSP, which reduces the number of channels from 1024 to 512. It is then fused top-down with C4 and C3 to obtain P3, P4, and P5, and after that fused bottom-up with P4 and P5. This is basically the same as YOLOv5; the differences are that the CSP module in YOLOv5 is replaced with the ELAN-H module, and downsampling is done by the MP2 layer.
ELAN-H is a name I made up myself. It differs slightly from the ELAN in the backbone in the number of feature maps that are concatenated (cat).
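The fusion order in the head can be followed by tracing only the spatial sizes through the two paths. This is a rough sketch, not the actual head: channel counts and the ELAN-H internals are left out, "fuse" stands for concatenation followed by an ELAN-H block, and the names `N4` and `N5` for the bottom-up outputs are hypothetical labels that do not appear in the text.

```python
def pafpn_trace(c3=80, c4=40, c5=20):
    """Trace spatial sizes through a top-down then bottom-up fusion path."""
    steps = []

    # Top-down path: upsample the deeper map 2x, then fuse with the
    # shallower backbone output of the same size.
    p5 = c5  # SPPCSP changes channels (1024 -> 512), not the spatial size
    steps.append(("P5 = SPPCSP(C5)", p5))
    p4 = p5 * 2
    assert p4 == c4, "upsampled P5 must match C4 before concatenation"
    steps.append(("P4 = fuse(upsample(P5), C4)", p4))
    p3 = p4 * 2
    assert p3 == c3, "upsampled P4 must match C3 before concatenation"
    steps.append(("P3 = fuse(upsample(P4), C3)", p3))

    # Bottom-up path: downsample 2x (the MP2 layer in the text), then fuse
    # with the top-down map of the same size.
    n4 = p3 // 2
    assert n4 == p4, "downsampled P3 must match P4 before concatenation"
    steps.append(("N4 = fuse(downsample(P3), P4)", n4))
    n5 = n4 // 2
    assert n5 == p5, "downsampled N4 must match P5 before concatenation"
    steps.append(("N5 = fuse(downsample(N4), P5)", n5))
    return steps


if __name__ == "__main__":
    for description, size in pafpn_trace():
        print(f"{description}: {size} x {size}")
```

The three maps that go on to the Rep and conv output layers would then be the 80, 40, and 20 sized ones (P3, N4, N5 in this sketch).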

3 Training

There is a pitfall here: if you want to use one of the larger pretrained models, you need to train with train_aux.py; otherwise the results are very poor.

4 Results
Silky smooth!
