YOLOv7 innovation points

2022-08-02 09:57:00 【ffllxx123】

Innovation 1: Extended Efficient Layer Aggregation Network (E-ELAN) and compound model scaling

[Figure: network block diagram; each block is labeled with output channels, kernel size, and stride]
Each block in the middle of the figure is actually a convolution together with a BN layer and a SiLU activation. In the label on the right, 64 is the number of output channels, the first 1 is the convolution kernel size, and the second 1 is the stride.
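To make the block notation concrete, here is a minimal NumPy sketch of one such block with 64 output channels, kernel size 1, and stride 1. The ordering convolution, then BN, then SiLU is an assumption (it is the common arrangement), and the function name `conv1x1_bn_silu`, the input size, and the random weights are made up for illustration.

```python
import numpy as np

def silu(x):
    # SiLU activation: x * sigmoid(x)
    return x / (1.0 + np.exp(-x))

def conv1x1_bn_silu(x, weight, gamma, beta, mean, var, eps=1e-5):
    """x: (C_in, H, W); weight: (C_out, C_in). Kernel size 1, stride 1."""
    # A 1x1 convolution with stride 1 is a per-pixel linear map over channels.
    y = np.einsum("oc,chw->ohw", weight, x)
    # Inference-time batch norm using fixed per-channel statistics.
    y = (y - mean[:, None, None]) / np.sqrt(var[:, None, None] + eps)
    y = gamma[:, None, None] * y + beta[:, None, None]
    return silu(y)

rng = np.random.default_rng(0)
c_in, c_out, h, w = 3, 64, 8, 8  # hypothetical sizes; 64 matches the label in the text
x = rng.standard_normal((c_in, h, w))
out = conv1x1_bn_silu(
    x,
    weight=rng.standard_normal((c_out, c_in)),
    gamma=np.ones(c_out), beta=np.zeros(c_out),
    mean=np.zeros(c_out), var=np.ones(c_out),
)
print(out.shape)  # (64, 8, 8)
```

With kernel size 1 and stride 1 the spatial size is unchanged and only the channel count changes, which is why the output is 64 x 8 x 8.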
[Figure: E-ELAN]
YOLOv7 proposes E-ELAN; in effect, E-ELAN is two ELAN blocks combined.

Model scaling

[Figure: model scaling for concatenation-based models, panels (a) to (c)]
From (a) to (b), we observe that when depth scaling is performed on a concatenation-based model, the output width of the computational block also increases. This in turn increases the input width of the subsequent transition layer. Therefore, we propose (c): when scaling a concatenation-based model, only the depth inside the computational block needs to be scaled, and the remaining transition layers are scaled in width by the corresponding amount. This keeps the widths of adjacent layers matched (that is roughly the reasoning).
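The coupling between depth and width can be checked with simple arithmetic. The sketch below uses a made-up concatenation-based block: the layout (two shortcut branches plus one tapped feature map per two stacked convolutions), the function name `concat_block_width`, and all numbers are assumptions for illustration, not the exact YOLOv7 configuration.

```python
def concat_block_width(branch_channels, depth, tap_every=2):
    """Output channels of a toy concatenation-based block.

    Two shortcut branches are always concatenated, plus one tapped feature
    map for every `tap_every` stacked convolutions (hypothetical layout).
    """
    taps = depth // tap_every
    return branch_channels * (2 + taps)

base_depth, depth_factor, c = 4, 1.5, 64
scaled_depth = round(base_depth * depth_factor)

base_width = concat_block_width(c, base_depth)      # 64 * (2 + 2) = 256
scaled_width = concat_block_width(c, scaled_depth)  # 64 * (2 + 3) = 320

# The transition layer after the block has to widen by the same ratio.
width_factor = scaled_width / base_width
print(scaled_depth, base_width, scaled_width, width_factor)  # 6 256 320 1.25
```

In this toy layout, scaling the depth by 1.5 makes the concatenated output 1.25 times wider, so the following transition layer needs a matching width factor of 1.25.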

Reparameterized Network

[Figure: residual structures and their re-parameterized form]
As the leftmost part of the figure shows, there are two types of residual structure; the middle one is the re-parameterized residual structure. Each residual branch spans only a single 3x3 convolution, and sometimes the two kinds of branch are used together. At inference time, however, there are no residual branches at all, which gives the accuracy benefit during training and a faster inference speed.
[Figure: fusing the 3x3, 1x1, and identity branches into one 3x3 convolution]
Parameter fusion means that three branches (a 3x3 convolution, a 1x1 convolution, and a branch that does nothing) can be fused into a single 3x3 convolution.

The left and right of the figure both show the do-nothing case, that is, the identity branch. Suppose, for example, that the main path has an ordinary 3x3 convolution: adding its kernel element by element to the equivalent convolution kernels on the right side of the figure above fuses the parameters. The result is the structure on the far right of the figure below, and re-parameterization is complete.
[Figure: re-parameterized convolution (RepConv) structures, including panel (d)]
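The fusion described above can be verified numerically. The following is a minimal NumPy sketch, assuming stride 1, equal input and output channels, and no BN layers (in practice each branch's BN would first be folded into its kernel and bias; that step is left out here). The helper names `conv2d` and `fuse_branches` are made up for illustration.

```python
import numpy as np

def conv2d(x, kernel, pad):
    """Naive stride-1 convolution (cross-correlation, as in deep learning).

    x: (C_in, H, W); kernel: (C_out, C_in, k, k); zero padding of `pad` pixels.
    """
    c_out, c_in, k, _ = kernel.shape
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    h = xp.shape[1] - k + 1
    w = xp.shape[2] - k + 1
    out = np.zeros((c_out, h, w))
    for i in range(h):
        for j in range(w):
            patch = xp[:, i:i + k, j:j + k]
            # Sum over input channels and both kernel axes.
            out[:, i, j] = np.tensordot(kernel, patch, axes=3)
    return out

def fuse_branches(k3, k1):
    """Fold a 3x3 kernel, a 1x1 kernel, and an identity branch into one 3x3 kernel."""
    c_out, c_in = k3.shape[:2]
    assert c_out == c_in, "the identity branch needs equal input/output channels"
    fused = k3.copy()
    # A 1x1 kernel equals a 3x3 kernel that is zero everywhere except the centre.
    fused[:, :, 1, 1] += k1[:, :, 0, 0]
    # Identity equals a 3x3 kernel with a 1 at the centre for channel i -> i.
    fused[np.arange(c_out), np.arange(c_out), 1, 1] += 1.0
    return fused

rng = np.random.default_rng(0)
c = 4
x = rng.standard_normal((c, 6, 6))
k3 = rng.standard_normal((c, c, 3, 3))
k1 = rng.standard_normal((c, c, 1, 1))

three_branch = conv2d(x, k3, pad=1) + conv2d(x, k1, pad=0) + x  # training-time form
single_conv = conv2d(x, fuse_branches(k3, k1), pad=1)           # inference-time form
print(np.allclose(three_branch, single_conv))  # True
```

Because convolution is linear in its kernel, summing the three equivalent 3x3 kernels gives the same output as summing the three branch outputs, so inference needs only one convolution.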

RepConv is a reparameterized convolution. The authors found that combining a reparameterized convolution with a residual connection does not work well (figure d).

Innovation 3: Label assignment

[Figure: label assignment with lead head and auxiliary head, panels (a) to (d)]
Figure (a) is an ordinary pyramid model. YOLOv7 introduces structure (b), which adds an auxiliary head. In figure (c), the losses of the lead head and the auxiliary head are both computed. In figure (d), the label assigner of the lead head also helps compute the loss of the auxiliary head, and then the lead head loss and the auxiliary head loss are optimized together by gradient descent.
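The lead-guided idea in figure (d) can be sketched as follows: labels are assigned once from the lead head's predictions and then reused for the auxiliary head's loss. This is a toy illustration under several assumptions: the top-k IoU assigner, the binary cross-entropy loss, the 0.25 weight on the auxiliary loss, and all names and numbers are hypothetical stand-ins, not the actual YOLOv7 assigner or loss.

```python
import numpy as np

def box_iou(boxes, gt):
    """IoU between each box in `boxes` (N, 4) and one `gt` box, as x1, y1, x2, y2."""
    x1 = np.maximum(boxes[:, 0], gt[0])
    y1 = np.maximum(boxes[:, 1], gt[1])
    x2 = np.minimum(boxes[:, 2], gt[2])
    y2 = np.minimum(boxes[:, 3], gt[3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    gt_area = (gt[2] - gt[0]) * (gt[3] - gt[1])
    return inter / (area + gt_area - inter)

def assign_from_lead(lead_boxes, gt, top_k=2):
    """Toy assigner: the top_k lead boxes by IoU become positives, soft label = IoU."""
    iou = box_iou(lead_boxes, gt)
    soft = np.zeros_like(iou)
    pos = np.argsort(-iou)[:top_k]
    soft[pos] = iou[pos]
    return soft

def bce(scores, targets, eps=1e-7):
    # Mean binary cross-entropy between predicted scores and soft targets.
    s = np.clip(scores, eps, 1 - eps)
    return float(np.mean(-(targets * np.log(s) + (1 - targets) * np.log(1 - s))))

gt = np.array([0.0, 0.0, 10.0, 10.0])
lead_boxes = np.array([[0, 0, 10, 10], [1, 1, 9, 9], [20, 20, 30, 30]], dtype=float)
lead_scores = np.array([0.9, 0.6, 0.1])  # hypothetical lead head confidences
aux_scores = np.array([0.7, 0.5, 0.3])   # hypothetical auxiliary head confidences

# One assignment, derived from the lead head, is shared by both heads.
soft_labels = assign_from_lead(lead_boxes, gt)  # IoUs 1.0, 0.64, 0.0
aux_weight = 0.25  # assumed down-weighting of the auxiliary loss
lead_loss = bce(lead_scores, soft_labels)
aux_loss = bce(aux_scores, soft_labels)
total_loss = lead_loss + aux_weight * aux_loss  # optimized jointly by gradient descent
print(soft_labels, total_loss)
```

The point of the design is that the auxiliary head learns from targets the stronger lead head has already produced, while a single summed loss drives both heads.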