当前位置:网站首页>Improvement 11 of yolov5: replace backbone network C3 with lightweight network mobilenetv3
Improvement 11 of yolov5: replace backbone network C3 with lightweight network mobilenetv3
2022-07-28 22:49:00 【Artificial Intelligence Algorithm Research Institute】
front said : As the current advanced deep learning target detection algorithm YOLOv5, A large number of trick, But there is still room for improvement , For the detection difficulties in specific application scenarios , There are different ways to improve . Subsequent articles , Focus on YOLOv5 How to improve is introduced in detail , The purpose is to provide their own meager help and reference for those who need innovation in scientific research or friends who need to achieve better results in engineering projects .
solve the problem :YOLOv5 The backbone feature extraction network adopts C3 structure , Bring a large number of parameters , The detection speed is slow , Limited application , In some real application scenarios, such as mobile or embedded devices , Such a large and complex model is difficult to be applied . The first is that the model is too large , Facing the problem of insufficient memory , Second, these scenarios require low latency , In other words, the response speed should be fast , Imagine the pedestrian detection system of self driving cars. What terrible things will happen if the speed is slow? . therefore , Research small and efficient CNN Models are crucial in these scenarios , At least for now , Although the hardware will be faster and faster in the future . This paper attempts to replace the backbone feature extraction network with a lighter MobileNet The Internet , To realize the lightweight of the network model , Balance speed and accuracy .
principle :
Address of thesis :https://arxiv.org/abs/1905.02244.pdf
generation code :https://github.com/LeBron-Jian/DeepLearningNote
MobileNet V3 The related technologies are as follows :
1, use MnasNet Search network structure
2, use V1 The depth is separable
3, use V2 The inverted residual linear bottleneck structure of
4, introduce SE modular
5, New activation function h-swish(x)
6, Two strategies are used in Web Search : Resource constrained NAS and NetAdapt
7, modify V2 The last part reduces the calculation

Fang Law :
Step 1 modify common.py, increase MobileNetV3 modular .
class StemBlock(nn.Module):
def __init__(self, c1, c2, k=3, s=2, p=None, g=1, act=True):
super(StemBlock, self).__init__()
self.stem_1 = Conv(c1, c2, k, s, p, g, act)
self.stem_2a = Conv(c2, c2 // 2, 1, 1, 0)
self.stem_2b = Conv(c2 // 2, c2, 3, 2, 1)
self.stem_2p = nn.MaxPool2d(kernel_size=2, stride=2, ceil_mode=True)
self.stem_3 = Conv(c2 * 2, c2, 1, 1, 0)
def forward(self, x):
stem_1_out = self.stem_1(x)
stem_2a_out = self.stem_2a(stem_1_out)
stem_2b_out = self.stem_2b(stem_2a_out)
stem_2p_out = self.stem_2p(stem_1_out)
out = self.stem_3(torch.cat((stem_2b_out, stem_2p_out), 1))
return out
class h_swish(nn.Module):
def __init__(self, inplace=True):
super(h_swish, self).__init__()
self.sigmoid = h_sigmoid(inplace=inplace)
def forward(self, x):
y = self.sigmoid(x)
return x * y
class SELayer(nn.Module):
def __init__(self, channel, reduction=4):
super(SELayer, self).__init__()
self.avg_pool = nn.AdaptiveAvgPool2d(1)
self.fc = nn.Sequential(
nn.Linear(channel, channel // reduction),
nn.ReLU(inplace=True),
nn.Linear(channel // reduction, channel),
h_sigmoid()
)
def forward(self, x):
b, c, _, _ = x.size()
y = self.avg_pool(x)
y = y.view(b, c)
y = self.fc(y).view(b, c, 1, 1)
return x * y
class conv_bn_hswish(nn.Module):
"""
This equals to
def conv_3x3_bn(inp, oup, stride):
return nn.Sequential(
nn.Conv2d(inp, oup, 3, stride, 1, bias=False),
nn.BatchNorm2d(oup),
h_swish()
)
"""
def __init__(self, c1, c2, stride):
super(conv_bn_hswish, self).__init__()
self.conv = nn.Conv2d(c1, c2, 3, stride, 1, bias=False)
self.bn = nn.BatchNorm2d(c2)
self.act = h_swish()
def forward(self, x):
return self.act(self.bn(self.conv(x)))
def fuseforward(self, x):
return self.act(self.conv(x))
class MobileNetV3_InvertedResidual(nn.Module):
def __init__(self, inp, oup, hidden_dim, kernel_size, stride, use_se, use_hs):
super(MobileNetV3_InvertedResidual, self).__init__()
assert stride in [1, 2]
self.identity = stride == 1 and inp == oup
if inp == hidden_dim:
self.conv = nn.Sequential(
# dw
nn.Conv2d(hidden_dim, hidden_dim, kernel_size, stride, (kernel_size - 1) // 2, groups=hidden_dim,
bias=False),
nn.BatchNorm2d(hidden_dim),
h_swish() if use_hs else nn.ReLU(inplace=True),
# Squeeze-and-Excite
SELayer(hidden_dim) if use_se else nn.Sequential(),
# Eca_layer(hidden_dim) if use_se else nn.Sequential(),#1.13.2022
# pw-linear
nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False),
nn.BatchNorm2d(oup),
)
else:
self.conv = nn.Sequential(
# pw
nn.Conv2d(inp, hidden_dim, 1, 1, 0, bias=False),
nn.BatchNorm2d(hidden_dim),
h_swish() if use_hs else nn.ReLU(inplace=True),
# dw
nn.Conv2d(hidden_dim, hidden_dim, kernel_size, stride, (kernel_size - 1) // 2, groups=hidden_dim,
bias=False),
nn.BatchNorm2d(hidden_dim),
# Squeeze-and-Excite
SELayer(hidden_dim) if use_se else nn.Sequential(),
# Eca_layer(hidden_dim) if use_se else nn.Sequential(), # 1.13.2022
h_swish() if use_hs else nn.ReLU(inplace=True),
# pw-linear
nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False),
nn.BatchNorm2d(oup),
)
def forward(self, x):
y = self.conv(x)
if self.identity:
return x + y
else:
return yThe second step : take yolo.py Registration module in .
if m in [Conv,MobileNetV3_InvertedResidual,ShuffleNetV2_InvertedResidual, ]:
The third step : modify yaml file
backbone:
# MobileNetV3-large
# [from, number, module, args]
[[-1, 1, conv_bn_hswish, [16, 2]], # 0-p1/2
[-1, 1, MobileNetV3_InvertedResidual, [ 16, 16, 3, 1, 0, 0]], # 1-p1/2
[-1, 1, MobileNetV3_InvertedResidual, [ 24, 64, 3, 2, 0, 0]], # 2-p2/4
[-1, 1, MobileNetV3_InvertedResidual, [ 24, 72, 3, 1, 0, 0]], # 3-p2/4
[-1, 1, MobileNetV3_InvertedResidual, [ 40, 72, 5, 2, 1, 0]], # 4-p3/8
[-1, 1, MobileNetV3_InvertedResidual, [ 40, 120, 5, 1, 1, 0]], # 5-p3/8
[-1, 1, MobileNetV3_InvertedResidual, [ 40, 120, 5, 1, 1, 0]], # 6-p3/8
[-1, 1, MobileNetV3_InvertedResidual, [ 80, 240, 3, 2, 0, 1]], # 7-p4/16
[-1, 1, MobileNetV3_InvertedResidual, [ 80, 200, 3, 1, 0, 1]], # 8-p4/16
[-1, 1, MobileNetV3_InvertedResidual, [ 80, 184, 3, 1, 0, 1]], # 9-p4/16
[-1, 1, MobileNetV3_InvertedResidual, [ 80, 184, 3, 1, 0, 1]], # 10-p4/16
[-1, 1, MobileNetV3_InvertedResidual, [112, 480, 3, 1, 1, 1]], # 11-p4/16
[-1, 1, MobileNetV3_InvertedResidual, [112, 672, 3, 1, 1, 1]], # 12-p4/16
[-1, 1, MobileNetV3_InvertedResidual, [160, 672, 5, 1, 1, 1]], # 13-p4/16
[-1, 1, MobileNetV3_InvertedResidual, [160, 960, 5, 2, 1, 1]], # 14-p5/32 primary 672 Change to the original algorithm 960
[-1, 1, MobileNetV3_InvertedResidual, [160, 960, 5, 1, 1, 1]], # 15-p5/32
]junction fruit : I have done a lot of experiments on multiple data sets , For different data sets, the effect is different ,map Value down , But the size of the weight model decreases , The parameter quantity decreases .
Let me know : The next content will continue to share the sharing of network lightweight methods . Interested friends can pay attention to me , If you have questions, you can leave a message or chat with me in private
PS: The replacement of dry network is not only applicable to improvement YOLOv5, You can also improve others YOLO Network and target detection network , such as YOLOv4、v3 etc. .
Last , I hope I can powder each other , Be a friend , Learn and communicate together .
边栏推荐
- 基于Ernie-3.0 CAIL2019法研杯要素识别多标签分类任务
- ES6, deep copy, shallow copy
- Configuration and official document of Freia library [tips]
- Mspba [anomaly detection: representation_based]
- Summary of recent bugs
- Summary of common error types in JS
- STM32 - systick timer (cubemx configures systick)
- Summary of C language learning content
- Stm32+ four pin OLED screen + Chinese character mold taking
- Evaluation index of anomaly detection: rocauc et al. [tips]
猜你喜欢

Simple es highlight practice

Att & CK Threat Intelligence

JVM -- custom class loader
![MKD [anomaly detection: knowledge disruption]](/img/15/10f5c8d6851e94dac764517c488dbc.png)
MKD [anomaly detection: knowledge disruption]

使用PCL批量显示PCD点云数据流

The blueprint of flask complements openpyxl

winform跳转第二个窗体案例

When can I sign up for the 2022 class I constructor examination?

Use PCL to batch convert point cloud.Bin files to.Pcd
Integrating database Ecology: using eventbridge to build CDC applications
随机推荐
OSV-q The size of tensor a (3) must match the size of tensor b (320) at non-singleton dimension 3
轮子七:TCP客户端
20-09-27 the project is migrated to Alibaba toss record (the network card order makes the service unable to connect to DB through haproxy)
770. 单词替换
Vscode ROS configuration GDB debugging error record
C语言学习内容总结
使用PCL批量显示PCD点云数据流
Intelligent control -- fuzzy mathematics and control
《Shortening passengers’ travel time A dynamic metro train scheduling approach using deep reinforcem》
Target segmentation learning
【三维目标检测】3DSSD(二)
STM32 board level support package for keys
二进制的原码、反码、补码
842. Arrange numbers
OSV-q ValueError: axes don‘t match array
CS flow [abnormal detection: normalizing flow]
Install PCL and VTK under the background of ROS installation, and solve VTK and PCL_ ROS conflict problem
When can I sign up for the 2022 class I constructor examination?
[connect your mobile phone wirelessly] - debug your mobile device wirelessly via LAN
Padim [anomaly detection: embedded based]