YOLOv5 Improvement 11: Replacing the C3 Backbone with the Lightweight MobileNetV3 Network
2022-07-28 22:49:00 【Artificial Intelligence Algorithm Research Institute】
Foreword: YOLOv5 is one of the more advanced deep-learning object detection algorithms and already incorporates a large number of tricks, but there is still room for improvement. Different detection difficulties in specific application scenarios call for different improvements. The following articles in this series describe in detail how to improve YOLOv5, in the hope of offering some modest help and reference to readers who need novelty in their research or better results in engineering projects.
Problem addressed: YOLOv5's backbone feature-extraction network uses the C3 structure, which brings a large number of parameters and slow detection, and this limits where it can be applied. In some real scenarios, such as mobile or embedded devices, such a large and complex model is hard to deploy. First, the model is too large and runs into memory limits. Second, these scenarios require low latency, that is, a fast response: imagine what could go wrong if the pedestrian detection system of a self-driving car were slow. Research on small, efficient CNN models is therefore crucial in these scenarios, at least for now, even though hardware will keep getting faster. This article tries replacing the backbone feature-extraction network with the lighter MobileNet network to make the model lightweight and to balance speed and accuracy.
Principle:
Paper: https://arxiv.org/abs/1905.02244.pdf
Code: https://github.com/LeBron-Jian/DeepLearningNote
MobileNetV3 relies on the following techniques:
1. Searches the network structure with MnasNet
2. Uses the depthwise separable convolution from V1
3. Uses the inverted residual structure with linear bottleneck from V2
4. Introduces the SE (squeeze-and-excitation) module
5. Uses a new activation function, h-swish(x)
6. Uses two strategies in the network search: resource-constrained NAS and NetAdapt
7. Modifies the last stage of V2 to reduce computation
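To make item 2 concrete, the sketch below counts the weights of a standard convolution and of a depthwise separable one. It is a minimal illustration only: the helper names and the 128-channel example are assumptions chosen for the arithmetic, not values from the paper or from YOLOv5.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a bias-free 2D convolution."""
    return (c_in // groups) * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k convolution followed by a pointwise 1 x 1 convolution."""
    depthwise = conv_params(c_in, c_in, k, groups=c_in)  # one k x k filter per channel
    pointwise = conv_params(c_in, c_out, 1)              # mixes channels with 1 x 1 filters
    return depthwise + pointwise


standard = conv_params(128, 128, 3)
separable = depthwise_separable_params(128, 128, 3)
print(standard, separable, round(standard / separable, 1))
```

For a 3 x 3 layer with 128 input and output channels, the separable version needs roughly one eighth of the weights, which is where most of the parameter savings of the MobileNet family come from.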

Method:
Step 1: Modify common.py to add the MobileNetV3 modules.
# The classes below go into common.py, which already imports torch and
# torch.nn as nn and defines the Conv block used here.
class StemBlock(nn.Module):
    def __init__(self, c1, c2, k=3, s=2, p=None, g=1, act=True):
        super(StemBlock, self).__init__()
        self.stem_1 = Conv(c1, c2, k, s, p, g, act)   # first downsampling conv
        self.stem_2a = Conv(c2, c2 // 2, 1, 1, 0)     # branch a: 1x1 channel reduction
        self.stem_2b = Conv(c2 // 2, c2, 3, 2, 1)     # branch a: 3x3 conv, stride 2
        self.stem_2p = nn.MaxPool2d(kernel_size=2, stride=2, ceil_mode=True)  # branch p: max pooling
        self.stem_3 = Conv(c2 * 2, c2, 1, 1, 0)       # merge the two concatenated branches

    def forward(self, x):
        stem_1_out = self.stem_1(x)
        stem_2a_out = self.stem_2a(stem_1_out)
        stem_2b_out = self.stem_2b(stem_2a_out)
        stem_2p_out = self.stem_2p(stem_1_out)
        out = self.stem_3(torch.cat((stem_2b_out, stem_2p_out), 1))
        return out


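class h_sigmoid(nn.Module):
    # Hard sigmoid, ReLU6(x + 3) / 6. h_swish and SELayer both call it, but the
    # listing did not define it; this is the usual definition.
    def __init__(self, inplace=True):
        super(h_sigmoid, self).__init__()
        self.relu = nn.ReLU6(inplace=inplace)

    def forward(self, x):
        return self.relu(x + 3) / 6


class h_swish(nn.Module):
    # Hard swish: x * h_sigmoid(x)
    def __init__(self, inplace=True):
        super(h_swish, self).__init__()
        self.sigmoid = h_sigmoid(inplace=inplace)

    def forward(self, x):
        y = self.sigmoid(x)
        return x * y

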
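As a quick numeric check of the activation, here is a dependency-free scalar sketch of h-swish, x * ReLU6(x + 3) / 6. The function names `hard_sigmoid` and `hard_swish` are illustrative only and are deliberately different from the `nn.Module` classes in the listing.

```python
def relu6(x):
    """Clamp x to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


def hard_sigmoid(x):
    """Piecewise-linear approximation of the sigmoid: ReLU6(x + 3) / 6."""
    return relu6(x + 3.0) / 6.0


def hard_swish(x):
    """h-swish(x) = x * ReLU6(x + 3) / 6."""
    return x * hard_sigmoid(x)


for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(x, hard_swish(x))
```

Inputs at or below -3 map to zero and inputs at or above 3 pass through unchanged, so the function needs only a clamp and a multiplication, which is cheaper on mobile hardware than an exponential.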
class SELayer(nn.Module):
    def __init__(self, channel, reduction=4):
        super(SELayer, self).__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)  # squeeze: one value per channel
        self.fc = nn.Sequential(
            nn.Linear(channel, channel // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channel // reduction, channel),
            h_sigmoid()
        )

    def forward(self, x):
        b, c, _, _ = x.size()
        y = self.avg_pool(x)
        y = y.view(b, c)
        y = self.fc(y).view(b, c, 1, 1)  # excite: one gate per channel
        return x * y                     # rescale each channel by its gate


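The squeeze, excite and scale steps of the SE layer can be followed by hand on a tiny input. The sketch below is plain Python on nested lists, with made-up toy weights, and mirrors only the data flow of `SELayer`, not its PyTorch implementation.

```python
def hard_sigmoid(v):
    """ReLU6(v + 3) / 6, as used at the end of the SE gate."""
    return min(max(v + 3.0, 0.0), 6.0) / 6.0


def linear(vec, weight, bias):
    """y = W @ vec + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weight, bias)]


def se_forward(x, w1, b1, w2, b2):
    """x is a list of channels, each an H x W list of lists."""
    # Squeeze: global average pooling gives one scalar per channel
    squeezed = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in x]
    # Excite: FC -> ReLU -> FC -> hard sigmoid gives one gate in [0, 1] per channel
    hidden = [max(h, 0.0) for h in linear(squeezed, w1, b1)]
    gates = [hard_sigmoid(g) for g in linear(hidden, w2, b2)]
    # Scale: multiply every pixel of a channel by that channel's gate
    return [[[p * g for p in row] for row in ch] for ch, g in zip(x, gates)]


# Two 2 x 2 channels and a bottleneck of width 1 (hypothetical toy weights)
x = [[[1.0, 2.0], [3.0, 4.0]], [[2.0, 2.0], [2.0, 2.0]]]
w1, b1 = [[0.0, 0.0]], [0.0]
w2, b2 = [[0.0], [0.0]], [0.0, 3.0]
out = se_forward(x, w1, b1, w2, b2)
print(out)
```

With these weights the first channel receives a gate of 0.5 and the second a gate of 1.0, so the first channel is halved and the second is left unchanged.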
class conv_bn_hswish(nn.Module):
    """
    This is equivalent to:
    def conv_3x3_bn(inp, oup, stride):
        return nn.Sequential(
            nn.Conv2d(inp, oup, 3, stride, 1, bias=False),
            nn.BatchNorm2d(oup),
            h_swish()
        )
    """

    def __init__(self, c1, c2, stride):
        super(conv_bn_hswish, self).__init__()
        self.conv = nn.Conv2d(c1, c2, 3, stride, 1, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        self.act = h_swish()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

    def fuseforward(self, x):
        # Skips self.bn, so it is only valid once BatchNorm has been folded into self.conv
        return self.act(self.conv(x))


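`fuseforward` skips the BatchNorm layer, which only gives the same result if the BatchNorm statistics have first been folded into the convolution weights. The scalar sketch below shows that folding for a single output channel. All numbers are made up for illustration. Stock YOLOv5 does the equivalent for its own `Conv` layers with `fuse_conv_and_bn`; whether a given fork also applies it to `conv_bn_hswish` is an assumption to check.

```python
import math


def batchnorm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-time batch normalization of one value."""
    return gamma * (y - mean) / math.sqrt(var + eps) + beta


def fold_bn(weights, gamma, beta, mean, var, eps=1e-5):
    """Fold BatchNorm into a bias-free filter for one output channel."""
    scale = gamma / math.sqrt(var + eps)
    fused_weights = [w * scale for w in weights]
    fused_bias = beta - mean * scale  # the fused conv gains a bias term
    return fused_weights, fused_bias


def dot(weights, patch, bias=0.0):
    """One output value of a convolution: filter applied to one input patch."""
    return sum(w * p for w, p in zip(weights, patch)) + bias


weights = [0.2, -0.5, 0.1]  # one flattened filter (hypothetical values)
patch = [1.0, 2.0, 3.0]     # the input values under that filter
bn = dict(gamma=1.5, beta=0.3, mean=0.4, var=2.0)

separate = batchnorm(dot(weights, patch), **bn)  # conv followed by BN
fused_w, fused_b = fold_bn(weights, **bn)
fused = dot(fused_w, patch, fused_b)             # single fused conv
print(abs(separate - fused) < 1e-9)
```

Note that the fused convolution needs a bias even though the original layer was created with `bias=False`.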
class MobileNetV3_InvertedResidual(nn.Module):
    def __init__(self, inp, oup, hidden_dim, kernel_size, stride, use_se, use_hs):
        super(MobileNetV3_InvertedResidual, self).__init__()
        assert stride in [1, 2]
        # Residual shortcut only when the spatial size and channel count are unchanged
        self.identity = stride == 1 and inp == oup
        if inp == hidden_dim:
            # No expansion needed: depthwise conv followed by a linear pointwise conv
            self.conv = nn.Sequential(
                # dw
                nn.Conv2d(hidden_dim, hidden_dim, kernel_size, stride, (kernel_size - 1) // 2, groups=hidden_dim,
                          bias=False),
                nn.BatchNorm2d(hidden_dim),
                h_swish() if use_hs else nn.ReLU(inplace=True),
                # Squeeze-and-Excite
                SELayer(hidden_dim) if use_se else nn.Sequential(),
                # Eca_layer(hidden_dim) if use_se else nn.Sequential(),#1.13.2022
                # pw-linear
                nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False),
                nn.BatchNorm2d(oup),
            )
        else:
            # Expand with a pointwise conv, filter with a depthwise conv, project back linearly
            self.conv = nn.Sequential(
                # pw
                nn.Conv2d(inp, hidden_dim, 1, 1, 0, bias=False),
                nn.BatchNorm2d(hidden_dim),
                h_swish() if use_hs else nn.ReLU(inplace=True),
                # dw
                nn.Conv2d(hidden_dim, hidden_dim, kernel_size, stride, (kernel_size - 1) // 2, groups=hidden_dim,
                          bias=False),
                nn.BatchNorm2d(hidden_dim),
                # Squeeze-and-Excite
                SELayer(hidden_dim) if use_se else nn.Sequential(),
                # Eca_layer(hidden_dim) if use_se else nn.Sequential(), # 1.13.2022
                h_swish() if use_hs else nn.ReLU(inplace=True),
                # pw-linear
                nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False),
                nn.BatchNorm2d(oup),
            )

    def forward(self, x):
        y = self.conv(x)
        if self.identity:
            return x + y
        else:
            return y

Step 2: Register the modules in yolo.py.
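# In parse_model(), add the new modules to the existing list of modules whose
# constructors start with (c1, c2, ...). Existing entries other than Conv are omitted here.
# conv_bn_hswish is listed on the assumption that it also needs c1 prepended,
# because the YAML in Step 3 passes it only [c2, stride].
# ShuffleNetV2_InvertedResidual belongs to a different modification; keep it only
# if that class is defined in common.py.
if m in [Conv, conv_bn_hswish, MobileNetV3_InvertedResidual, ShuffleNetV2_InvertedResidual, ]: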
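Registration matters because, for modules in this list, the model parser prepends the previous layer's output channel count to the arguments written in the YAML file. The sketch below imitates that step in plain Python. It is a simplified stand-in, not the real `parse_model`: `CHANNEL_AWARE` and `expand_args` are hypothetical names, only layers fed by the previous layer (`from = -1`) are handled, and width and depth scaling are ignored.

```python
# Hypothetical stand-in for the registration list in yolo.py
CHANNEL_AWARE = {"Conv", "conv_bn_hswish", "MobileNetV3_InvertedResidual"}


def expand_args(rows, in_channels=3):
    """Return (module_name, constructor_args) for each [from, number, module, args] row."""
    prev = in_channels  # output channels of the previous layer
    expanded = []
    for f, _number, module, args in rows:
        assert f == -1, "this sketch only handles layers fed by the previous layer"
        if module in CHANNEL_AWARE:
            args = [prev, *args]  # c1 is prepended, args[0] from the YAML becomes c2
            prev = args[1]
        expanded.append((module, tuple(args)))
    return expanded


rows = [
    [-1, 1, "conv_bn_hswish", [16, 2]],
    [-1, 1, "MobileNetV3_InvertedResidual", [16, 16, 3, 1, 0, 0]],
    [-1, 1, "MobileNetV3_InvertedResidual", [24, 64, 3, 2, 0, 0]],
]
for module, args in expand_args(rows):
    print(module, args)
```

The YAML arguments `[24, 64, 3, 2, 0, 0]` therefore reach the constructor as `(16, 24, 64, 3, 2, 0, 0)`, matching the signature `(inp, oup, hidden_dim, kernel_size, stride, use_se, use_hs)`. A module left out of the list would receive the YAML arguments unchanged and fail with a missing-argument error.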
Step 3: Modify the YAML file.
backbone:
  # MobileNetV3-large
  # [from, number, module, args]
  # args for MobileNetV3_InvertedResidual: [out_channels, hidden_dim, kernel_size, stride, use_se, use_hs]
  [[-1, 1, conv_bn_hswish, [16, 2]],                              # 0-p1/2
   [-1, 1, MobileNetV3_InvertedResidual, [ 16,  16, 3, 1, 0, 0]],  # 1-p1/2
   [-1, 1, MobileNetV3_InvertedResidual, [ 24,  64, 3, 2, 0, 0]],  # 2-p2/4
   [-1, 1, MobileNetV3_InvertedResidual, [ 24,  72, 3, 1, 0, 0]],  # 3-p2/4
   [-1, 1, MobileNetV3_InvertedResidual, [ 40,  72, 5, 2, 1, 0]],  # 4-p3/8
   [-1, 1, MobileNetV3_InvertedResidual, [ 40, 120, 5, 1, 1, 0]],  # 5-p3/8
   [-1, 1, MobileNetV3_InvertedResidual, [ 40, 120, 5, 1, 1, 0]],  # 6-p3/8
   [-1, 1, MobileNetV3_InvertedResidual, [ 80, 240, 3, 2, 0, 1]],  # 7-p4/16
   [-1, 1, MobileNetV3_InvertedResidual, [ 80, 200, 3, 1, 0, 1]],  # 8-p4/16
   [-1, 1, MobileNetV3_InvertedResidual, [ 80, 184, 3, 1, 0, 1]],  # 9-p4/16
   [-1, 1, MobileNetV3_InvertedResidual, [ 80, 184, 3, 1, 0, 1]],  # 10-p4/16
   [-1, 1, MobileNetV3_InvertedResidual, [112, 480, 3, 1, 1, 1]],  # 11-p4/16
   [-1, 1, MobileNetV3_InvertedResidual, [112, 672, 3, 1, 1, 1]],  # 12-p4/16
   [-1, 1, MobileNetV3_InvertedResidual, [160, 672, 5, 1, 1, 1]],  # 13-p4/16
   [-1, 1, MobileNetV3_InvertedResidual, [160, 960, 5, 2, 1, 1]],  # 14-p5/32, originally 672, changed to 960 as in the original algorithm
   [-1, 1, MobileNetV3_InvertedResidual, [160, 960, 5, 1, 1, 1]],  # 15-p5/32
  ]
Results: I ran many experiments on several datasets. The effect differs from dataset to dataset: the mAP value drops, but the size of the weight file and the parameter count both decrease.
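As a sanity check on the backbone table in Step 3, the sketch below multiplies the per-layer strides and reports the last layer at each of the three detection scales. The `(out_channels, stride)` pairs are copied from the table; the helper name is illustrative. The post does not show the head section of the YAML file, so treating these three layers as the features the head should read from is an assumption based on the usual YOLOv5 P3/P4/P5 layout.

```python
# (out_channels, stride) for backbone layers 0..15, copied from the YAML table
LAYERS = [
    (16, 2), (16, 1), (24, 2), (24, 1), (40, 2), (40, 1), (40, 1), (80, 2),
    (80, 1), (80, 1), (80, 1), (112, 1), (112, 1), (160, 1), (160, 2), (160, 1),
]


def last_layer_per_stride(layers):
    """Map cumulative stride -> (index, channels) of the last layer at that stride."""
    total, result = 1, {}
    for index, (channels, stride) in enumerate(layers):
        total *= stride
        result[total] = (index, channels)  # later layers overwrite earlier ones
    return result


taps = last_layer_per_stride(LAYERS)
for stride in (8, 16, 32):
    index, channels = taps[stride]
    level = stride.bit_length() - 1  # 8 -> P3, 16 -> P4, 32 -> P5
    print(f"P{level} (stride {stride}): layer {index}, {channels} channels")
```

The cumulative strides agree with the `p1/2` to `p5/32` comments in the table. When the backbone is swapped, any layer indices referenced by the head (for example in its Concat entries) would need to be updated to match the new layer numbering.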
Preview: the next article will continue with network lightweighting methods. Interested readers can follow me; if you have questions, leave a comment or send me a private message.
PS: Replacing the backbone network is not only applicable to improving YOLOv5. It can also be used to improve other YOLO networks and object detection networks, such as YOLOv4 and YOLOv3.
Finally, I hope we can follow each other, become friends, and learn and exchange ideas together.