Analysis of PANet, the Enhanced Feature Extraction Network in YOLOX
2022-07-02 22:13:00 【牧羊女说】
The previous post covered YOLOX's CSPDarknet network; see "YOLOX backbone: the CSPDarknet implementation".
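For a 640x640 input, CSPDarknet produces outputs at three levels: dark5 (20x20x1024), dark4 (40x40x512), and dark3 (80x80x256). These three outputs are fed into PANet, an enhanced feature extraction network, for further feature extraction (the part boxed in red in the network diagram).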
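The three shapes follow directly from the backbone strides. Here is a minimal sketch of that arithmetic, assuming a square input and strides of 8, 16, and 32 for dark3, dark4, and dark5; the function name `backbone_output_shapes` is hypothetical and not part of YOLOX.

```python
# Hypothetical helper: derives the (H, W, C) shape of each backbone output
# from the input size. Strides 8/16/32 and base channels 256/512/1024 are
# the assumptions behind the shapes quoted in this post.
STRIDES = {"dark3": 8, "dark4": 16, "dark5": 32}
BASE_CHANNELS = {"dark3": 256, "dark4": 512, "dark5": 1024}


def backbone_output_shapes(input_size=640, width=1.0):
    """Return {level: (height, width, channels)} for a square input."""
    shapes = {}
    for name, stride in STRIDES.items():
        side = input_size // stride
        shapes[name] = (side, side, int(BASE_CHANNELS[name] * width))
    return shapes


for level, shape in backbone_output_shapes(640).items():
    print(level, shape)
```

With `input_size=640` this gives 80x80x256, 40x40x512, and 20x20x1024, matching the shapes listed above; a different input size changes only the spatial dimensions.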
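The basic idea of PANet is to upsample the deep features and fuse them with the shallow features (steps 1 to 6 in the diagram); the fused shallow features are then downsampled and fused with the deep features again (steps 6 to 10 in the diagram).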
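The two passes can be followed by tracking tensor shapes alone. The sketch below is not the YOLOX implementation: `conv`, `upsample`, `concat`, and `trace_panet` are hypothetical stand-ins that only compute (height, width, channels) tuples, with the CSPLayer treated as a channel-changing operation and width assumed to be 1.0. Variable names are chosen to mirror those used in yolo_pafpn.py.

```python
# Shape-only trace of the ten PANet steps. No real convolution is run;
# every helper just computes the resulting (height, width, channels).

def conv(shape, out_channels, stride=1):
    """A conv (or CSPLayer) sets the channel count; stride divides H and W."""
    h, w, _ = shape
    return (h // stride, w // stride, out_channels)


def upsample(shape, scale=2):
    """Nearest-neighbour upsampling multiplies H and W by the scale."""
    h, w, c = shape
    return (h * scale, w * scale, c)


def concat(a, b):
    """Channel-wise concatenation requires equal spatial sizes."""
    assert a[:2] == b[:2], "spatial sizes must match before concatenation"
    return (a[0], a[1], a[2] + b[2])


def trace_panet(dark3, dark4, dark5):
    # Top-down path: deep features are upsampled and fused with shallow ones.
    fpn_out0 = conv(dark5, 512)                      # step 1
    f_out0 = upsample(fpn_out0)                      # step 2
    f_out0 = conv(concat(f_out0, dark4), 512)        # step 3
    fpn_out1 = conv(f_out0, 256)                     # step 4
    f_out1 = upsample(fpn_out1)                      # step 5
    pan_out2 = conv(concat(f_out1, dark3), 256)      # step 6
    # Bottom-up path: fused shallow features are downsampled and fused again.
    p_out1 = conv(pan_out2, 256, stride=2)           # step 7
    pan_out1 = conv(concat(p_out1, fpn_out1), 512)   # step 8
    p_out0 = conv(pan_out1, 512, stride=2)           # step 9
    pan_out0 = conv(concat(p_out0, fpn_out0), 1024)  # step 10
    return pan_out2, pan_out1, pan_out0


outputs = trace_panet((80, 80, 256), (40, 40, 512), (20, 20, 1024))
print(outputs)
```

The three outputs have the same shapes as the three inputs, which is why PANet can sit between the backbone and the detection head without changing the head's expected input sizes.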
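In the official YOLOX code, PANet is implemented in yolo_pafpn.py. The official code is reproduced here with comments that follow the step numbers above (shape comments assume a 640x640 input and width=1.0):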
import torch
import torch.nn as nn

from .darknet import CSPDarknet
from .network_blocks import BaseConv, CSPLayer, DWConv


class YOLOPAFPN(nn.Module):
    """
    YOLOv3 model. Darknet 53 is the default backbone of this model.
    """
    # The docstring is kept as in the official file; the class itself is the
    # PAFPN neck built on top of the CSPDarknet backbone.

    def __init__(
        self,
        depth=1.0,
        width=1.0,
        in_features=("dark3", "dark4", "dark5"),
        in_channels=[256, 512, 1024],
        depthwise=False,
        act="silu",
    ):
        super().__init__()
        self.backbone = CSPDarknet(depth, width, depthwise=depthwise, act=act)
        self.in_features = in_features
        self.in_channels = in_channels
        Conv = DWConv if depthwise else BaseConv

        self.upsample = nn.Upsample(scale_factor=2, mode="nearest")
        # 20x20x1024 -> 20x20x512
        self.lateral_conv0 = BaseConv(
            int(in_channels[2] * width), int(in_channels[1] * width), 1, 1, act=act
        )
        # 40x40x1024 -> 40x40x512
        self.C3_p4 = CSPLayer(
            int(2 * in_channels[1] * width),
            int(in_channels[1] * width),
            round(3 * depth),
            False,
            depthwise=depthwise,
            act=act,
        )  # cat

        # 40x40x512 -> 40x40x256
        self.reduce_conv1 = BaseConv(
            int(in_channels[1] * width), int(in_channels[0] * width), 1, 1, act=act
        )
        # 80x80x512 -> 80x80x256
        self.C3_p3 = CSPLayer(
            int(2 * in_channels[0] * width),  # 2*256
            int(in_channels[0] * width),  # 256
            round(3 * depth),
            False,
            depthwise=depthwise,
            act=act,
        )

        # bottom-up conv
        # 80x80x256 -> 40x40x256
        self.bu_conv2 = Conv(
            int(in_channels[0] * width), int(in_channels[0] * width), 3, 2, act=act
        )
        # 40x40x512 -> 40x40x512
        self.C3_n3 = CSPLayer(
            int(2 * in_channels[0] * width),  # 2*256
            int(in_channels[1] * width),  # 512
            round(3 * depth),
            False,
            depthwise=depthwise,
            act=act,
        )

        # bottom-up conv
        # 40x40x512 -> 20x20x512
        self.bu_conv1 = Conv(
            int(in_channels[1] * width), int(in_channels[1] * width), 3, 2, act=act
        )
        # 20x20x1024 -> 20x20x1024
        self.C3_n4 = CSPLayer(
            int(2 * in_channels[1] * width),  # 2*512
            int(in_channels[2] * width),  # 1024
            round(3 * depth),
            False,
            depthwise=depthwise,
            act=act,
        )

    def forward(self, input):
        """
        Args:
            inputs: input images.
        Returns:
            Tuple[Tensor]: FPN feature.
        """
        # backbone
        out_features = self.backbone(input)
        features = [out_features[f] for f in self.in_features]
        # x2 = dark3 (80x80x256), x1 = dark4 (40x40x512), x0 = dark5 (20x20x1024)
        [x2, x1, x0] = features

        # Step 1: 1x1 conv on the deepest feature map (dark5)
        # 20x20x1024 -> 20x20x512
        fpn_out0 = self.lateral_conv0(x0)  # 1024->512/32
        # Step 2: upsample the output of step 1
        # 20x20x512 -> 40x40x512
        f_out0 = self.upsample(fpn_out0)  # 512/16
        # Step 3: concat + CSPLayer
        # 40x40x512 + 40x40x512 -> 40x40x1024
        f_out0 = torch.cat([f_out0, x1], 1)  # 512->1024/16
        # 40x40x1024 -> 40x40x512
        f_out0 = self.C3_p4(f_out0)  # 1024->512/16

        # Step 4: 1x1 conv on the output of step 3
        # 40x40x512 -> 40x40x256
        fpn_out1 = self.reduce_conv1(f_out0)  # 512->256/16
        # Step 5: upsample again
        # 40x40x256 -> 80x80x256
        f_out1 = self.upsample(fpn_out1)  # 256/8
        # Step 6: concat + CSPLayer; the result goes to the YOLO head
        # 80x80x256 + 80x80x256 -> 80x80x512
        f_out1 = torch.cat([f_out1, x2], 1)  # 256->512/8
        # 80x80x512 -> 80x80x256
        pan_out2 = self.C3_p3(f_out1)  # 512->256/8

        # Step 7: downsample
        # 80x80x256 -> 40x40x256
        p_out1 = self.bu_conv2(pan_out2)  # 256->256/16
        # Step 8: concat + CSPLayer; the result goes to the YOLO head
        # 40x40x256 + 40x40x256 -> 40x40x512
        p_out1 = torch.cat([p_out1, fpn_out1], 1)  # 256->512/16
        # 40x40x512 -> 40x40x512
        pan_out1 = self.C3_n3(p_out1)  # 512->512/16

        # Step 9: downsample again
        # 40x40x512 -> 20x20x512
        p_out0 = self.bu_conv1(pan_out1)  # 512->512/32
        # Step 10: concat + CSPLayer; the result goes to the YOLO head
        # 20x20x512 + 20x20x512 -> 20x20x1024
        p_out0 = torch.cat([p_out0, fpn_out0], 1)  # 512->1024/32
        # 20x20x1024 -> 20x20x1024
        pan_out0 = self.C3_n4(p_out0)  # 1024->1024/32

        outputs = (pan_out2, pan_out1, pan_out0)
        return outputs
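The shape comments in the code hold for depth=1.0 and width=1.0. For other model sizes, the expressions `int(in_channels[i] * width)` and `round(3 * depth)` scale the channel counts and the number of bottlenecks per CSPLayer. The sketch below evaluates those two expressions for a few (depth, width) pairs; the pairs in `PRESETS` are assumed to correspond to the YOLOX-s/m/l/x configurations and should be checked against the official experiment files, and `scaled_config` is a hypothetical helper.

```python
# Evaluates the two scaling expressions used in YOLOPAFPN.__init__.
IN_CHANNELS = [256, 512, 1024]

# Assumed (depth, width) presets; verify against the official YOLOX exps.
PRESETS = {
    "yolox-s": (0.33, 0.50),
    "yolox-m": (0.67, 0.75),
    "yolox-l": (1.00, 1.00),
    "yolox-x": (1.33, 1.25),
}


def scaled_config(depth, width):
    """Return (scaled channel list, bottlenecks per CSPLayer)."""
    channels = [int(c * width) for c in IN_CHANNELS]
    n_bottlenecks = round(3 * depth)
    return channels, n_bottlenecks


for name, (depth, width) in PRESETS.items():
    channels, n = scaled_config(depth, width)
    print(f"{name}: channels={channels}, bottlenecks per CSPLayer={n}")
```

Under these assumed presets, the smallest variant halves every channel count and uses a single bottleneck per CSPLayer, while the spatial sizes (80, 40, 20) stay the same because they depend only on the input size and strides.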
Reference: "Pytorch 搭建自己的YoloX目标检测平台" (Bubbliiiing deep learning tutorial), bilibili