[Design tutorial] An interpretation of the YOLOv7 object detection network
2022-07-27 20:42:00 【DanCheng-studio】
0 Preface
Things move fast: YOLOv6 has barely settled in and YOLOv7 is already here. If you are a student who wants to use the latest technology in your graduation project, take a look at this article. Your senior will give you a simple interpretation of YOLOv7, with the aim of giving you a basic understanding of it.
From YOLOv1 in 2015, YOLOv2 in 2016 and YOLOv3 in 2018, to YOLOv4 and YOLOv5 in 2020, and now the recent YOLOv6 and YOLOv7, the YOLO series can be said to have witnessed the evolution of object detection in the deep learning era. For YOLO basics and for YOLOv1 through YOLOv5, you can read Da Bai's YOLO series. This article focuses on the network structure of YOLOv7, so that you can get an intuitive feel for it.
1 YOLOv7 overall structure

Let's look at YOLOv7 as a whole. The input image is first resized to 640x640 and fed into the backbone network. The head then outputs feature maps at three different sizes, and after Rep and conv layers the network produces its predictions. Taking COCO as an example, there are 80 classes, and each prediction also carries (x, y, w, h, o): the box coordinates plus an objectness (foreground/background) score. With 3 anchors per location, each output layer therefore has (80 + 5) x 3 = 255 channels, and multiplying that by the size of the feature map gives the final output size.
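To make the arithmetic concrete, here is a minimal sketch of the output size at each scale. The strides 8, 16 and 32 are an assumption inferred from the 80, 40 and 20 feature-map sizes mentioned later in this article, and the function name is hypothetical.

```python
NUM_CLASSES = 80        # COCO categories
NUM_ANCHORS = 3         # anchors per location on each output layer
BOX_AND_OBJ = 5         # x, y, w, h and objectness
STRIDES = (8, 16, 32)   # assumed downsampling factor of each output layer


def head_output_shapes(img_size=640):
    """Return (channels, height, width) for each of the three output layers."""
    channels = (NUM_CLASSES + BOX_AND_OBJ) * NUM_ANCHORS
    return [(channels, img_size // s, img_size // s) for s in STRIDES]


if __name__ == "__main__":
    for c, h, w in head_output_shapes(640):
        print(f"{c} x {h} x {w} = {c * h * w} values")
```

Changing `NUM_CLASSES` shows how the channel count follows the dataset: a custom dataset with 2 classes would give (2 + 5) x 3 = 21 channels per output layer.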
2 Key points - backbone
The YOLOv7 backbone is shown in the figure below.

There are 51 layers in total, indexed from 0 to 50; I marked the key layers with black numbers in the figure above. The input first passes through 4 convolution layers, shown in the figure below. A CBS block is made up of Conv + BN + SiLU. I use different colors to represent different kernel sizes and strides; for example, (3, 2) means a kernel size of 3 and a stride of 2. The corresponding entries in the config file are shown in the figure.
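The effect of the (kernel, stride) pairs on the feature-map size can be traced with plain arithmetic. This is only a sketch: the per-layer values in `STEM` are my reading of the official `yolov7.yaml` config and the padding of `kernel // 2` is assumed, so treat both as assumptions rather than part of this article.

```python
def conv_out(size, kernel, stride):
    """Spatial size after a convolution with padding = kernel // 2 (assumed)."""
    padding = kernel // 2
    return (size + 2 * padding - kernel) // stride + 1


# (out_channels, kernel, stride) for the four stem CBS layers (assumed values)
STEM = [(32, 3, 1), (64, 3, 2), (64, 3, 1), (128, 3, 2)]


def trace_stem(size=640):
    """Return the (height, width, channels) shape after each stem layer."""
    shapes = []
    for out_channels, kernel, stride in STEM:
        size = conv_out(size, kernel, stride)
        shapes.append((size, size, out_channels))
    return shapes


if __name__ == "__main__":
    for index, shape in enumerate(trace_stem(640)):
        print(f"layer {index}: {shape}")
```

Only the two stride-2 layers halve the resolution, which is why a 640x640 input ends up at 160 x 160 x 128 after the stem.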

After the 4 CBS layers, the feature map is 160 x 160 x 128. Next comes the ELAN module proposed in the paper. ELAN is made up of multiple CBS layers, and its input and output feature maps have the same spatial size. The number of channels changes in the first two CBS layers; in the following CBS layers the input channels match the output channels; and the last CBS outputs the required number of channels.
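The channel bookkeeping inside ELAN can be sketched as a list of layers. The layout below (two 1x1 CBS, four 3x3 CBS, a concat of four branches, then a final 1x1 CBS) is an assumption based on the official `yolov7.yaml` config, and the layer names are hypothetical.

```python
def elan_layers(c_in, c_hidden, c_out):
    """Return (name, in_channels, out_channels) for each layer of one ELAN block."""
    layers = [
        ("cbs_1x1_a", c_in, c_hidden),   # the first two CBS change the channel count
        ("cbs_1x1_b", c_in, c_hidden),
    ]
    for i in range(4):
        # the following CBS keep input channels equal to output channels
        layers.append((f"cbs_3x3_{i}", c_hidden, c_hidden))
    concat_channels = 4 * c_hidden       # assumed: four branches are concatenated
    layers.append(("concat", concat_channels, concat_channels))
    layers.append(("cbs_1x1_out", concat_channels, c_out))  # required output channels
    return layers


if __name__ == "__main__":
    for layer in elan_layers(c_in=128, c_hidden=64, c_out=256):
        print(layer)
```

With these assumptions one ELAN block has 8 layers, which matches the layer count used for the backbone total in this article.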



The MP layer consists mainly of a Maxpool branch and a CBS branch. MP1 and MP2 differ mainly in the ratio of their channel counts.
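A rough sketch of how an MP layer halves the resolution is shown below. It assumes two branches (a 2x2 max pool followed by a 1x1 CBS, and a 1x1 CBS followed by a 3x3 stride-2 CBS) whose outputs are concatenated, and it assumes that MP1 gives each branch half of the input channels while MP2 gives each branch the full input channels. Those details are my reading of the official config, not something stated in this article.

```python
def mp_output(size, branch_channels):
    """Return (height, width, channels) after an MP layer with two equal branches."""
    pooled = size // 2                       # 2x2 max pool with stride 2
    strided = (size + 2 * 1 - 3) // 2 + 1    # 3x3 conv, stride 2, padding 1
    assert pooled == strided, "both branches must agree before the concat"
    return (pooled, pooled, 2 * branch_channels)


def mp1_output(size, c_in):
    """MP1 (assumed): each branch outputs c_in // 2, so channels are preserved."""
    return mp_output(size, c_in // 2)


def mp2_output(size, c_in):
    """MP2 (assumed): each branch outputs c_in, so channels are doubled."""
    return mp_output(size, c_in)


if __name__ == "__main__":
    print(mp1_output(160, 256))
    print(mp2_output(80, 128))
```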

That covers the basic components of the backbone, so let's look at the backbone as a whole. After the 4 CBS layers comes one ELAN, followed by three MP + ELAN stages whose outputs correspond to C3, C4 and C5, with sizes 80 x 80 x 512, 40 x 40 x 1024 and 20 x 20 x 1024. Each MP has 5 layers and each ELAN has 8 layers, so the whole backbone has 4 + 8 + 13 x 3 = 51 layers. Counting from 0, the last layer is layer 50.
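The layer count can be checked with a few lines of Python. This sketch uses only the per-block layer counts given above; the function name and the returned dictionary are hypothetical.

```python
STEM_LAYERS = 4   # the four CBS layers at the start
ELAN_LAYERS = 8   # layers in one ELAN block
MP_LAYERS = 5     # layers in one MP block


def backbone_layout():
    """Return (total_layers, index of the last layer of each C3/C4/C5 stage)."""
    total = STEM_LAYERS + ELAN_LAYERS
    outputs = {}
    for name in ("C3", "C4", "C5"):
        total += MP_LAYERS + ELAN_LAYERS
        outputs[name] = total - 1    # layers are indexed from 0
    return total, outputs


if __name__ == "__main__":
    total, outputs = backbone_layout()
    print(total, outputs)
```

Under this counting, C3, C4 and C5 end at layers 24, 37 and 50, which is useful when reading which backbone layers the head connects to.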
Key points - head



The YOLOv7 head is really just a PAFPN structure, the same as in YOLOv4 and YOLOv5. First, the last backbone output, the 32x downsampled feature map C5, passes through SPPCSP, which reduces the number of channels from 1024 to 512. The result is then fused top-down with C4 and C3 to obtain P3, P4 and P5, and then fused bottom-up with P4 and P5. This is basically the same as YOLOv5. The differences are that the CSP module in YOLOv5 is replaced by the ELAN-H module, and that downsampling is done by the MP2 layer.
ELAN-H is a name I made up myself. It differs slightly from the ELAN in the backbone only in the number of feature maps that are concatenated.
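The order of the top-down and bottom-up fusions can be traced on spatial sizes alone. This is a simplified sketch: it ignores channel counts and the internals of SPPCSP and ELAN-H, and the names for the refined outputs are hypothetical.

```python
def pafpn_trace(c3=80, c4=40, c5=20):
    """Return fusion steps as (direction, inputs, output, spatial_size)."""
    steps = [("sppcsp", ("C5",), "P5", c5)]

    # Top-down: upsample by 2 and fuse with the next shallower backbone feature.
    size = c5 * 2
    assert size == c4
    steps.append(("top-down", ("P5", "C4"), "P4", size))
    size *= 2
    assert size == c3
    steps.append(("top-down", ("P4", "C3"), "P3", size))

    # Bottom-up: downsample by 2 (MP2) and fuse with the next deeper feature.
    size //= 2
    steps.append(("bottom-up", ("P3", "P4"), "P4 (refined)", size))
    size //= 2
    steps.append(("bottom-up", ("P4 (refined)", "P5"), "P5 (refined)", size))
    return steps


if __name__ == "__main__":
    for step in pafpn_trace():
        print(step)
```

The three detection outputs are then taken at the 80, 40 and 20 resolutions, which matches the three feature-map sizes used for prediction.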

3 Training

There is one pitfall here: if you want to use one of the larger pretrained models, you need to train with train_aux.py; otherwise the results are very poor.
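One way to avoid this pitfall is to pick the script from the model name. The helper below is hypothetical, and the set of model names is an assumption based on the naming of the larger models in the official repository (check the repository's README for the list that applies to your version).

```python
# Assumed names of the larger models that are trained with train_aux.py.
AUX_HEAD_MODELS = {"yolov7-w6", "yolov7-e6", "yolov7-d6", "yolov7-e6e"}


def training_script(model_name):
    """Return the training script to use for the given model name."""
    if model_name.lower() in AUX_HEAD_MODELS:
        return "train_aux.py"
    return "train.py"


if __name__ == "__main__":
    for name in ("yolov7", "yolov7-w6"):
        print(name, "->", training_script(name))
```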

4 Use effect
Silky smooth!
