FCN: Fully Convolutional Networks for Semantic Segmentation
2022-07-05 18:23:00 【00000cj】
paper: Fully Convolutional Networks for Semantic Segmentation
Contributions
A fully convolutional architecture: the fully connected layers at the end of a classification network are replaced with convolutional layers, so the network can process inputs of arbitrary size.
Upsampling by deconvolution (transposed convolution) or interpolation restores the output to the original input resolution.
Because the network is obtained by modifying a classification network (replacing the fully connected layers with convolutions), the weights of the earlier layers can be reused for fine-tuning.
A skip architecture that fuses shallow and deep features, combining shallow spatial detail with deep semantic information to make the final segmentation more refined.
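The first point above, replacing a classifier's fully connected layer with an equivalent 1x1 convolution, can be sketched as follows (a minimal example, not the paper's code; the layer sizes are made up for illustration):

```python
import torch
import torch.nn as nn

# A fully connected classifier over 512 features and 10 classes...
fc = nn.Linear(512, 10)
# ...is equivalent to a 1x1 convolution with the same weights reshaped.
conv = nn.Conv2d(512, 10, kernel_size=1)
conv.weight.data = fc.weight.data.view(10, 512, 1, 1).clone()
conv.bias.data = fc.bias.data.clone()

# On a 1x1 feature map the two produce identical outputs.
x = torch.randn(1, 512, 1, 1)
out_fc = fc(x.flatten(1))          # (1, 10)
out_conv = conv(x).flatten(1)      # (1, 10), same values
assert torch.allclose(out_fc, out_conv, atol=1e-5)

# Unlike the Linear layer, the conv accepts any spatial size and
# produces a spatial map of class scores:
print(conv(torch.randn(1, 512, 15, 15)).shape)  # torch.Size([1, 10, 15, 15])
```

This is what makes the network "fully convolutional": the classifier now slides over the whole feature map instead of requiring a fixed-size input.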
Implementation details
We take MMSegmentation as an example. Compared with the original paper, the backbone is changed from VGG-16 to ResNet-50, and the skip structure is replaced by dilated convolutions; the official PyTorch implementation does the same.
Backbone
- In the original ResNet-50, the four stages use strides=(1, 2, 2, 2) with no dilation, i.e. dilations=(1, 1, 1, 1). In FCN the four stages use strides=(1, 2, 1, 1) and dilations=(1, 1, 2, 4).
- There is also a contract_dilation=True setting: when dilation > 1, the dilation of the stage's first convolution is halved. So the first bottleneck of the third stage uses no dilation (2/2=1), and the first bottleneck of the fourth stage uses dilation=4/2=2.
- In addition, ResNetV1c is used, i.e. the 7x7 convolution in the stem is replaced by three 3x3 convolutions.
- Finally, note the padding. In the original implementation, padding=3 for the 7x7 stem convolution and padding=1 everywhere else. In FCN, because the last two stages use dilated convolutions with stride=1, keeping the output resolution equal to the input requires padding=dilation, which follows from the convolution output-size formula.
- Assuming batch_size=4 and a model input of shape (4, 3, 480, 480), the outputs of the four backbone stages are (4, 256, 120, 120), (4, 512, 60, 60), (4, 1024, 60, 60), and (4, 2048, 60, 60).
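The padding=dilation rule above can be checked directly from the standard convolution output-size formula, out = floor((in + 2*padding - dilation*(kernel-1) - 1) / stride) + 1 (a small standalone check, independent of any framework):

```python
def conv_out(in_size, kernel=3, stride=1, padding=0, dilation=1):
    """Output spatial size of a convolution (the standard formula)."""
    return (in_size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

# With a 3x3 kernel and stride=1, padding=dilation preserves the resolution,
# which is why the last two FCN stages keep the 60x60 feature map.
for d in (1, 2, 4):
    assert conv_out(60, kernel=3, stride=1, padding=d, dilation=d) == 60

# A stride-2 3x3 convolution with padding=1 halves the resolution,
# matching the 480 -> 240 -> 120 -> 60 downsampling listed above:
print(conv_out(480, kernel=3, stride=2, padding=1))  # 240
```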
FCN Head
- The fourth-stage output of ResNet, (4, 2048, 60, 60), passes through two conv-bn-relu blocks, Conv2d(2048, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) and Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False), giving (4, 512, 60, 60).
- The output of the previous step, (4, 512, 60, 60), is concatenated with the head's input (4, 2048, 60, 60) to give (4, 2560, 60, 60).
- One more conv-bn-relu, Conv2d(2560, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False), gives (4, 512, 60, 60).
- Dropout is applied with dropout_ratio=0.1.
- Finally, Conv2d(512, num_classes, kernel_size=(1, 1), stride=(1, 1)) gives the model's final output (4, num_classes, 60, 60). Note that num_classes includes the background.
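The steps above can be sketched as a small module (the class name FCNHead is ours; the layer settings follow the values quoted in the text, with the concatenation of the head's input corresponding to MMSegmentation's concat_input option):

```python
import torch
import torch.nn as nn

def conv_bn_relu(cin, cout):
    """3x3 conv-bn-relu block as used throughout the head."""
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1, bias=False),
        nn.BatchNorm2d(cout),
        nn.ReLU(inplace=True),
    )

class FCNHead(nn.Module):
    def __init__(self, in_channels=2048, channels=512, num_classes=21):
        super().__init__()
        # two conv-bn-relu blocks: 2048 -> 512 -> 512
        self.convs = nn.Sequential(
            conv_bn_relu(in_channels, channels),
            conv_bn_relu(channels, channels),
        )
        # fuse the head's input with the conv output: 2560 -> 512
        self.conv_cat = conv_bn_relu(in_channels + channels, channels)
        self.dropout = nn.Dropout2d(0.1)
        # 1x1 classifier over num_classes (background included)
        self.cls_seg = nn.Conv2d(channels, num_classes, 1)

    def forward(self, x):
        out = self.convs(x)                              # (N, 512, H, W)
        out = self.conv_cat(torch.cat([x, out], dim=1))  # (N, 512, H, W)
        return self.cls_seg(self.dropout(out))           # (N, classes, H, W)

head = FCNHead().eval()
with torch.no_grad():  # small spatial size here just to keep the demo fast
    out = head(torch.randn(2, 2048, 8, 8))
print(out.shape)  # torch.Size([2, 21, 8, 8])
```

With the shapes from the text, the real input (4, 2048, 60, 60) yields (4, num_classes, 60, 60).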
Loss
- The output of the previous step (here with num_classes=2, i.e. (4, 2, 60, 60)) is resized to the input size by bilinear interpolation, giving (4, 2, 480, 480).
- Cross-entropy loss is used.
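The two loss steps can be written out directly (a sketch matching the example shapes in the text, with random tensors standing in for real logits and labels):

```python
import torch
import torch.nn.functional as F

num_classes = 2
logits = torch.randn(4, num_classes, 60, 60)            # FCN head output
# Upsample the logits back to the input resolution with bilinear interpolation.
logits = F.interpolate(logits, size=(480, 480),
                       mode="bilinear", align_corners=False)
# Per-pixel integer labels at input resolution.
labels = torch.randint(0, num_classes, (4, 480, 480))
# Cross-entropy over the class dimension at every pixel.
loss = F.cross_entropy(logits, labels)
print(logits.shape)  # torch.Size([4, 2, 480, 480])
```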
Auxiliary Head
- The third-stage output of ResNet, (4, 1024, 60, 60), passes through one conv-bn-relu, Conv2d(1024, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False), giving (4, 256, 60, 60).
- Dropout is applied with dropout_ratio=0.1.
- Conv2d(256, num_classes, kernel_size=(1, 1), stride=(1, 1)) then gives this auxiliary branch's output, (4, num_classes, 60, 60).
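The auxiliary branch can be sketched the same way (the class name AuxHead is ours; note that in MMSegmentation the auxiliary loss is typically down-weighted relative to the main head, e.g. with loss_weight=0.4):

```python
import torch
import torch.nn as nn

class AuxHead(nn.Module):
    def __init__(self, in_channels=1024, channels=256, num_classes=21):
        super().__init__()
        # a single conv-bn-relu: 1024 -> 256
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        self.dropout = nn.Dropout2d(0.1)
        # 1x1 classifier for the auxiliary prediction
        self.cls_seg = nn.Conv2d(channels, num_classes, 1)

    def forward(self, x):
        return self.cls_seg(self.dropout(self.conv(x)))

aux = AuxHead().eval()
with torch.no_grad():  # small spatial size to keep the demo fast
    aux_out = aux(torch.randn(2, 1024, 8, 8))
print(aux_out.shape)  # torch.Size([2, 21, 8, 8])
```

At the real resolution, the third-stage features (4, 1024, 60, 60) yield (4, num_classes, 60, 60); this output is upsampled and supervised with the same cross-entropy loss as the main head.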