
YOLOv5-Lite: Experiments and Thoughts on RepVGG Re-parameterization for the Industrial Deployment of YOLO

2022-07-08 02:20:00 pogg_

QQ discussion group: 993965802

The copyright of this article belongs to GiantPandaCV. Please do not reprint without permission.

This experiment borrows the re-parameterization idea from RepVGG: the original 3×3 convolutions are replaced with RepVGG blocks to raise the accuracy of the original YOLO model.

Preface: I previously combined ShuffleNetV2 with YOLOv5 to target ARM chips, so that YOLOv5 could also run in real time on edge devices. I have been running experiments on GPUs and NPUs as well. The goal of those experiments is clear and modest: speed up YOLOv5 while keeping its original accuracy.

Experiment

The model again borrows the re-parameterization idea from RepVGG: the original 3×3 convolutions are replaced with RepVGG blocks. A multi-branch model is used during training, and the branches are converted into a single path for deployment and inference.
 Insert picture description here
Following the approach of the RepVGG paper, yolov5s is chosen as the baseline here. Its 3×3 convolutions are restructured by adding a parallel 1×1 convolution side branch.
 Insert picture description here
At inference time, the side branch is fused into the 3×3 convolution, so the resulting model has the same structure as the original yolov5s.
 Insert picture description here
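
The fusion works because convolution is linear: a 1×1 kernel padded with zeros to 3×3 can simply be added to the 3×3 kernel. The following is a minimal single-channel sketch in pure Python, not the repository's code; the helper names `conv2d_same`, `pad_1x1_to_3x3` and `add` are hypothetical, and batch normalization is ignored here.

```python
def conv2d_same(image, kernel):
    """Single-channel 2D cross-correlation, stride 1, zero padding, odd square kernel."""
    k = len(kernel)
    pad = k // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    if 0 <= y < h and 0 <= x < w:  # outside the image counts as zero
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def pad_1x1_to_3x3(kernel_1x1):
    """Place the single 1x1 weight at the center of an otherwise zero 3x3 kernel."""
    v = kernel_1x1[0][0]
    return [[0.0, 0.0, 0.0], [0.0, v, 0.0], [0.0, 0.0, 0.0]]


def add(a, b):
    """Element-wise sum of two equally sized 2D lists."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
k3 = [[0.5, -1.0, 0.25], [1.0, 2.0, -0.5], [0.0, 0.75, 1.5]]
k1 = [[3.0]]

# Training-time view: two parallel branches whose outputs are summed.
two_branch = add(conv2d_same(image, k3), conv2d_same(image, k1))

# Deployment view: one 3x3 convolution with the merged kernel.
fused_kernel = add(k3, pad_1x1_to_3x3(k1))
one_branch = conv2d_same(image, fused_kernel)

max_diff = max(abs(a - b)
               for ra, rb in zip(two_branch, one_branch)
               for a, b in zip(ra, rb))
```

Both views produce the same output, which is why the side branch costs nothing at inference once it has been merged.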
Before that, I tried the most direct modification of yolov5s, which is to replace the backbone outright, but the parameter count and FLOPs turned out to be higher. The variant whose reproduced accuracy came closest to yolov5s was RepVGG-A1. Below is yolov5s with the backbone replaced by A1:
 Insert picture description here
Then, to limit the growth in FLOPs and parameters, I switched to replacing only the 3×3 convolutions of yolov5s with RepVGG blocks.
 Insert picture description here
Between the two approaches, the FLOPs ratio and the parameter ratio are about 2.75 and 1.85, respectively.

Performance

Ablation experiments give the following performance differences between yolov5s and yolov5s fused with RepVGG blocks:
 Insert picture description here
The mAP values measured here for yolov5s differ from those on the official site: after two tests they were 55.8 and 35.8. These results are nevertheless almost identical to those in https://github.com/midasklr/yolov5prune and in Issue #3168 · ultralytics/yolov5. Restructuring the 3×3 convolutions of yolov5s with RepVGG blocks improves both mAP@.5 and mAP@.5:.95 by at least one point.

After training, repyolov5s must be converted so that the 1×1 side branches are merged; otherwise inference is about 20% slower than the original yolov5s.

convert.py re-parameterizes the RepVGG blocks. The main code is as follows, with reference to https://github.com/DingXiaoH/RepVGG/blob/main/repvgg.py:

    # -------------------------- RepVGG fuse ---------------------------------
    # Method of the model class. It expects `torch.nn as nn` plus RepVGGBlock,
    # Conv and fuse_conv_and_bn from the repository to be importable.
    def reparam_conv(self):  # fuse RepVGG branches and Conv2d() + BatchNorm2d() layers
        """
        Re-parameterize every RepVGGBlock and fuse the remaining Conv + BN pairs.

        rbr_dense: 3×3 convolution branch
        rbr_1x1: 1×1 side branch
        _pad_1x1_to_3x3_tensor: pads the 1×1 kernel to 3×3
        :return: the fused model (self)
        """
        print('Reparam and Fusing Block... ')
        for m in self.model.modules():
            if type(m) is RepVGGBlock:
                if hasattr(m, 'rbr_1x1'):
                    # Merge both branches into one equivalent 3×3 kernel and bias.
                    kernel, bias = m.get_equivalent_kernel_bias()
                    dense = m.rbr_dense.conv
                    conv_reparam = nn.Conv2d(in_channels=dense.in_channels,
                                             out_channels=dense.out_channels,
                                             kernel_size=dense.kernel_size,
                                             stride=dense.stride,
                                             padding=dense.padding,
                                             dilation=dense.dilation,
                                             groups=dense.groups, bias=True)
                    conv_reparam.weight.data = kernel
                    conv_reparam.bias.data = bias
                    for para in self.parameters():
                        para.detach_()
                    m.rbr_dense = conv_reparam  # single-path replacement
                    m.__delattr__('rbr_1x1')  # drop the side branch
                    m.deploy = True
                    m.forward = m.fusevggforward  # update forward
                continue
            if type(m) is Conv and hasattr(m, 'bn'):
                m.conv = fuse_conv_and_bn(m.conv, m.bn)  # update conv
                delattr(m, 'bn')  # remove batchnorm
                m.forward = m.fuseforward  # update forward
        self.info()
        return self
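
Both `get_equivalent_kernel_bias` and `fuse_conv_and_bn` rely on folding a batch-normalization layer into the convolution that precedes it. The sketch below shows that arithmetic per output channel in pure Python. It is an illustration, not the repository's code: it assumes a convolution without its own bias, the names `fold_bn` and `conv_then_bn` are hypothetical, and each kernel is flattened to a plain list of weights.

```python
import math


def fold_bn(weights, gamma, beta, mean, var, eps=1e-5):
    """Fold per-channel BN parameters into conv weights; returns (weights, biases)."""
    fused_w, fused_b = [], []
    for w_c, g, b, m, v in zip(weights, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)            # per-channel BN scale
        fused_w.append([w * scale for w in w_c])  # scale every weight of the channel
        fused_b.append(b - m * scale)             # the BN shift becomes the conv bias
    return fused_w, fused_b


def conv_then_bn(patch, w_c, g, b, m, v, eps=1e-5):
    """Reference path: bias-free convolution on one patch, followed by BN."""
    y = sum(wi * xi for wi, xi in zip(w_c, patch))
    return g * (y - m) / math.sqrt(v + eps) + b


# Two output channels, each with a flattened kernel of three weights (made-up values).
weights = [[0.2, -0.5, 1.0], [1.5, 0.3, -0.7]]
gamma, beta = [1.2, 0.8], [0.1, -0.3]
mean, var = [0.05, -0.2], [0.9, 1.6]
patch = [1.0, 2.0, 3.0]  # flattened input patch

fw, fb = fold_bn(weights, gamma, beta, mean, var)
reference = [conv_then_bn(patch, w, g, b, m, v)
             for w, g, b, m, v in zip(weights, gamma, beta, mean, var)]
fused = [sum(wi * xi for wi, xi in zip(w, patch)) + b for w, b in zip(fw, fb)]
bn_max_diff = max(abs(r - f) for r, f in zip(reference, fused))
```

After folding, each branch is a plain convolution with a bias, so the branches can be summed kernel by kernel and bias by bias.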

By exporting to ONNX, the models before and after convert can be visualized:
 Insert picture description here

Inference

mAP is only one part of the picture. The other part is whether yolov5s after reparam and fuse becomes slower because of the inserted RepVGG blocks. In theory, a RepVGG block after reparam is equivalent to a 3×3 convolution, although that convolution is more compact than an ordinary 3×3 convolution.

After three test runs on the COCO val2017 dataset (5000 images, single-image inference), the per-image inference time was 14/14/14 ms for repyolov5s and 16/16/16 ms for yolov5s. I discussed this with "White God", who thinks that with inference times this close the gap may be down to test error and is not convincing on its own.

What is certain is that the converted yolov5s does not become slower because of the inserted RepVGG blocks. To rule out chance and measurement error, inference was tested on sets of 500/5000/64115/118287 images:
 Insert picture description here
The test results are as follows :
 Insert picture description here
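
A comparison like this can be made less sensitive to measurement error with warm-up passes and repeated runs. The following is a minimal timing sketch, not the script used for the results above; `benchmark` and `dummy_model` are hypothetical names, and `dummy_model` merely stands in for a real detector.

```python
import statistics
import time


def benchmark(fn, inputs, warmup=3, repeats=3):
    """Return (median ms per item, list of ms per item for each repeat)."""
    for item in inputs[:warmup]:  # warm-up calls are not timed
        fn(item)
    per_item_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        for item in inputs:
            fn(item)
        elapsed = time.perf_counter() - start
        per_item_ms.append(1000.0 * elapsed / len(inputs))
    return statistics.median(per_item_ms), per_item_ms


def dummy_model(x):
    """Hypothetical stand-in for single-image inference."""
    return sum(v * v for v in x)


inputs = [[float(i)] * 64 for i in range(200)]
median_ms, runs_ms = benchmark(dummy_model, inputs)
print(f"median {median_ms:.4f} ms per item over {len(runs_ms)} runs")
```

When timing a PyTorch model on a GPU, the device should be synchronized (for example with `torch.cuda.synchronize()`) before each clock reading, because CUDA kernels are launched asynchronously.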

Detection

Detection quality is also worth looking at. The two models above were run on the same images with all other parameters kept identical. The results are as follows:
 Insert picture description here
 Insert picture description here

Summary

From the ablation experiments on improving yolov5s with RepVGG blocks, the following points can be drawn:

  • yolov5s fused with RepVGG blocks gains accuracy on both large and small targets;
  • yolov5s with RepVGG blocks and LeakyReLU is 0.5 percentage points lower in mAP than the original yolov5s, but about 15% faster (mainly because the SiLU activation function is replaced);
  • Without convert, I personally think this fusion experiment is meaningless, because the side branches seriously slow the model down;
  • C3 blocks and RepVGG blocks are not cost-effective on CPUs; the greatest gain is only achieved on GPUs and NPUs;
  • Re-parameterized yolov5 comes at a price, and that price is paid entirely in training: it uses about 5-10% more GPU memory, and training time also increases;
  • Consider using RepVGG blocks to restructure the 3×3 convolutions of yolov3-spp and yolov4.
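
As a rough illustration of the activation point above: SiLU evaluates an exponential for every element, while LeakyReLU needs only a comparison and a multiplication. The sketch below uses scalar definitions in pure Python; the negative slope of 0.1 is an assumption, not a value taken from this article.

```python
import math


def silu(x):
    """SiLU (swish): x * sigmoid(x), which needs one exponential per element."""
    return x / (1.0 + math.exp(-x))


def leaky_relu(x, negative_slope=0.1):
    """LeakyReLU: identity for x >= 0, a small linear slope otherwise (slope assumed)."""
    return x if x >= 0.0 else negative_slope * x


samples = [-2.0, -0.5, 0.0, 0.5, 2.0]
silu_out = [silu(v) for v in samples]
leaky_out = [leaky_relu(v) for v in samples]
```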

The code and pretrained models will be uploaded to my repository later:

https://github.com/ppogg/YOLOv5-Lite

