FCN: Fully Convolutional Networks for Semantic Segmentation
2022-07-05 18:23:00 【00000cj】
paper: Fully Convolutional Networks for Semantic Segmentation
Key contributions
A fully convolutional structure is proposed: the final fully connected layers of a classification network are replaced by convolutional layers, so that inputs of any size can be processed.
The output is upsampled by deconvolution or interpolation to restore it to the original input size.
Because the model is a classification network whose fully connected layers have been replaced by convolutional layers, it can reuse the pretrained weights of the earlier layers and be fine-tuned from them.
A skip structure is proposed: by combining shallow and deep features, it accounts for both shallow spatial detail and deep semantic information, which makes the final segmentation result more refined.
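The first point can be checked numerically. The following is a minimal sketch in plain Python, not code from the paper or from any library: the helpers `dense` and `conv2d_valid` and all sizes are hypothetical and chosen only for illustration. It reshapes the weights of a fully connected layer into convolution kernels, confirms that both give the same result on an input of the original size, and shows that the convolutional form also accepts a larger input, for which it returns a spatial score map.

```python
import random

def dense(x_flat, weight, bias):
    """Fully connected layer; weight has shape (out_features, in_features)."""
    return [sum(w * x for w, x in zip(row, x_flat)) + b
            for row, b in zip(weight, bias)]

def conv2d_valid(x, kernels, bias):
    """Naive stride-1 convolution without padding.
    x: [C][H][W], kernels: [O][C][kh][kw], returns [O][H-kh+1][W-kw+1]."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    kh, kw = len(kernels[0][0]), len(kernels[0][0][0])
    out = []
    for k, b in zip(kernels, bias):
        fmap = []
        for i in range(H - kh + 1):
            row = []
            for j in range(W - kw + 1):
                s = b
                for c in range(C):
                    for di in range(kh):
                        for dj in range(kw):
                            s += k[c][di][dj] * x[c][i + di][j + dj]
                row.append(s)
            fmap.append(row)
        out.append(fmap)
    return out

random.seed(0)
C, H, W, O = 2, 3, 3, 4  # illustrative sizes only
weight = [[random.uniform(-1, 1) for _ in range(C * H * W)] for _ in range(O)]
bias = [random.uniform(-1, 1) for _ in range(O)]

# Reshape every FC weight row (length C*H*W) into a C x H x W kernel.
kernels = [[[row[(c * H + i) * W:(c * H + i) * W + W] for i in range(H)]
            for c in range(C)] for row in weight]

x = [[[random.uniform(-1, 1) for _ in range(W)] for _ in range(H)]
     for _ in range(C)]
x_flat = [v for channel in x for r in channel for v in r]

fc_out = dense(x_flat, weight, bias)
conv_out = conv2d_valid(x, kernels, bias)  # one 1x1 map per output unit
same = all(abs(a - m[0][0]) < 1e-9 for a, m in zip(fc_out, conv_out))

# A larger input only works for the convolutional form.
big = [[[random.uniform(-1, 1) for _ in range(5)] for _ in range(5)]
       for _ in range(C)]
big_out = conv2d_valid(big, kernels, bias)
big_shape = (len(big_out), len(big_out[0]), len(big_out[0][0]))
print(same, big_shape)
```

The fully connected layer would reject the 5x5 input because its flattened length no longer matches the weight matrix, whereas the convolution simply slides the same weights over more positions.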
Implementation details
Here we use MMSegmentation as an example. Compared with the original paper, the backbone is ResNet-50 instead of VGG-16, and the skip structure is replaced by dilated convolution. The official PyTorch implementation does the same.
Backbone
- In the original ResNet-50, the four stages use strides=(1, 2, 2, 2) and no dilated convolution, i.e. dilations=(1, 1, 1, 1). In FCN, the four stages use strides=(1, 2, 1, 1) and dilations=(1, 1, 2, 4).
- There is also the setting contract_dilation=True: when dilation > 1, the dilation of the first block in the stage is contracted (halved). This affects the first bottleneck of the third and fourth stages: in the third stage it uses dilation=2/2=1, i.e. no dilated convolution, and in the fourth stage it uses dilation=4/2=2.
- The backbone is ResNetV1c, in which the 7x7 convolution in the stem is replaced by three 3x3 convolutions.
- Finally, note the padding. In the original implementation, the 7x7 stem convolution uses padding=3 and the 3x3 convolutions use padding=1. In FCN, the last two stages use dilated convolution with stride=1, so padding must equal dilation to keep the input and output resolution the same. This follows from the convolution output-size formula out = floor((in + 2*padding - dilation*(kernel_size - 1) - 1) / stride) + 1: with kernel_size=3 and stride=1, out = in + 2*padding - 2*dilation, which equals in only when padding=dilation.
- Assume batch_size=4 and a model input of shape (4, 3, 480, 480). The outputs of the four backbone stages are then (4, 256, 120, 120), (4, 512, 60, 60), (4, 1024, 60, 60), and (4, 2048, 60, 60).
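The stride, dilation, and padding arithmetic in the list above can be sketched in a few lines of plain Python. This is a simplified model, not MMSegmentation code: the helper names are hypothetical, the stem is assumed to reduce resolution by 4 through a stride-2 3x3 convolution followed by a stride-2 3x3 max pool, and each stage is modelled as a single 3x3 convolution with padding=dilation. The block counts passed to `block_dilations` are the usual ResNet-50 values of 6 and 3 bottlenecks for the third and fourth stages.

```python
def conv_out_size(size, kernel, stride=1, padding=0, dilation=1):
    """Output length of a convolution or pooling window along one axis."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

def block_dilations(dilation, num_blocks, contract_dilation=True):
    """Dilation of each bottleneck in a stage; the first one is halved
    when contract_dilation is set and dilation > 1."""
    first = dilation // 2 if contract_dilation and dilation > 1 else dilation
    return [first] + [dilation] * (num_blocks - 1)

def backbone_shapes(batch, size, strides, dilations):
    """Output shape of each of the four stages (simplified model)."""
    # Assumed stem: stride-2 3x3 convolution, then stride-2 3x3 max pool.
    size = conv_out_size(size, 3, stride=2, padding=1)
    size = conv_out_size(size, 3, stride=2, padding=1)
    shapes = []
    for channels, s, d in zip((256, 512, 1024, 2048), strides, dilations):
        # Only the strided or dilated 3x3 convolution changes the size.
        size = conv_out_size(size, 3, stride=s, padding=d, dilation=d)
        shapes.append((batch, channels, size, size))
    return shapes

# padding = dilation keeps a stride-1 3x3 convolution size-preserving.
preserved = [conv_out_size(60, 3, padding=d, dilation=d) for d in (1, 2, 4)]

fcn_shapes = backbone_shapes(4, 480, (1, 2, 1, 1), (1, 1, 2, 4))
cls_shapes = backbone_shapes(4, 480, (1, 2, 2, 2), (1, 1, 1, 1))

print(preserved)
print(fcn_shapes)
print(cls_shapes)
print(block_dilations(2, 6), block_dilations(4, 3))
```

With the classification settings the last stage ends at 1/32 of the input resolution, while the FCN settings stop downsampling at 1/8, which is why the last three stages all report 60x60 for a 480x480 input.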
FCN Head
- Take the output of the fourth ResNet stage, (4, 2048, 60, 60), and pass it through two conv-bn-relu blocks, Conv2d(2048, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) and Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False), to obtain (4, 512, 60, 60).
- Concatenate the output of the previous step, (4, 512, 60, 60), with the head input, (4, 2048, 60, 60), along the channel dimension to obtain (4, 2560, 60, 60).
- Pass the result through one more conv-bn-relu block, Conv2d(2560, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False), to obtain (4, 512, 60, 60).
- Apply dropout with dropout_ratio=0.1.
- Finally, apply Conv2d(512, num_classes, kernel_size=(1, 1), stride=(1, 1)) to get the final model output, (4, num_classes, 60, 60). Note that the number of classes here includes the background.
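The head steps above can be traced as shape arithmetic. This sketch tracks tensor shapes only and performs no actual computation; `conv_shape` and `fcn_head_shapes` are hypothetical helpers, the step names are just labels, and num_classes=2 is an assumed example value.

```python
def conv_shape(shape, out_channels, kernel, padding=0):
    """Shape after a stride-1, dilation-1 Conv2d on an (N, C, H, W) tensor."""
    n, _, h, w = shape
    grow = 2 * padding - (kernel - 1)
    return (n, out_channels, h + grow, w + grow)

def fcn_head_shapes(x, channels=512, num_classes=2):
    """Return (step name, shape) pairs for the head described above."""
    steps = [("input", x)]
    y = conv_shape(x, channels, kernel=3, padding=1)   # first conv-bn-relu
    steps.append(("conv1", y))
    y = conv_shape(y, channels, kernel=3, padding=1)   # second conv-bn-relu
    steps.append(("conv2", y))
    # Concatenate with the head input along the channel axis.
    y = (y[0], y[1] + x[1], y[2], y[3])
    steps.append(("concat", y))
    y = conv_shape(y, channels, kernel=3, padding=1)   # fusion conv-bn-relu
    steps.append(("fuse", y))
    # Dropout leaves the shape unchanged, so it is not listed separately.
    y = conv_shape(y, num_classes, kernel=1)           # 1x1 classifier
    steps.append(("classify", y))
    return steps

head_steps = fcn_head_shapes((4, 2048, 60, 60))
for name, shape in head_steps:
    print(f"{name:9s}{shape}")
```

The concatenation is the only step whose channel count is not set by a convolution: 512 + 2048 = 2560, which is why the fusion convolution takes 2560 input channels.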
Loss
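- The output of the previous step, (4, num_classes, 60, 60), for example (4, 2, 60, 60) with two classes, is resized to the input size by bilinear interpolation, giving (4, 2, 480, 480).
- Cross-entropy loss is then computed on the resized output.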
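The two loss steps can be illustrated on a tiny example in plain Python. This is a sketch under assumptions rather than the library's implementation: the helpers are hypothetical, the resize uses the half-pixel convention (what PyTorch calls align_corners=False, which the text does not specify), and the loss is averaged over all pixels.

```python
import math

def resize_bilinear(grid, out_h, out_w):
    """Bilinear resize of one 2D map, half-pixel (align_corners=False) style."""
    in_h, in_w = len(grid), len(grid[0])

    def source(dst, in_size, out_size):
        # Map an output index to a clamped source coordinate.
        src = max((dst + 0.5) * in_size / out_size - 0.5, 0.0)
        lo = min(int(src), in_size - 1)
        hi = min(lo + 1, in_size - 1)
        return lo, hi, src - lo

    out = []
    for i in range(out_h):
        y0, y1, fy = source(i, in_h, out_h)
        row = []
        for j in range(out_w):
            x0, x1, fx = source(j, in_w, out_w)
            top = grid[y0][x0] * (1 - fx) + grid[y0][x1] * fx
            bottom = grid[y1][x0] * (1 - fx) + grid[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out

def cross_entropy(logits, target):
    """logits: [C][H][W], target: [H][W] class indices; mean over pixels."""
    H, W = len(target), len(target[0])
    total = 0.0
    for i in range(H):
        for j in range(W):
            scores = [channel[i][j] for channel in logits]
            m = max(scores)  # subtract the max for numerical stability
            log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
            total += log_sum - scores[target[i][j]]
    return total / (H * W)

# Two classes on a 2x2 map, upsampled to 4x4 before the loss.
logits = [[[2.0, 0.0], [0.0, 2.0]],   # class 0 scores
          [[0.0, 2.0], [2.0, 0.0]]]   # class 1 scores
upsampled = [resize_bilinear(channel, 4, 4) for channel in logits]
target = [[0, 0, 1, 1], [0, 0, 1, 1], [1, 1, 0, 0], [1, 1, 0, 0]]
loss = cross_entropy(upsampled, target)

ramp = resize_bilinear([[0.0, 1.0]], 1, 4)
print(ramp, loss)
```

Resizing the logits rather than the labels means the loss is evaluated at full input resolution, so every labelled pixel contributes to the gradient.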
Auxiliary Head
- Take the output of the third ResNet stage, (4, 1024, 60, 60), and pass it through one conv-bn-relu block, Conv2d(1024, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False), to obtain (4, 256, 60, 60).
- Apply dropout with dropout_ratio=0.1.
- Apply Conv2d(256, num_classes, kernel_size=(1, 1), stride=(1, 1)) to get the output of this branch, (4, num_classes, 60, 60).
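The settings described in this walkthrough map onto an MMSegmentation-style config dictionary roughly as follows. This is a sketch recalled from the library's base FCN config, not a verified copy: key names may differ between versions. Several values are assumptions not stated in the text above: num_classes=2, the SyncBN norm_cfg, align_corners=False, and the 0.4 loss weight on the auxiliary head.

```python
num_classes = 2  # assumption: two classes including background
norm_cfg = dict(type='SyncBN', requires_grad=True)  # assumption

model = dict(
    backbone=dict(
        type='ResNetV1c',            # stem uses three 3x3 convolutions
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        strides=(1, 2, 1, 1),        # last two stages keep resolution
        dilations=(1, 1, 2, 4),      # dilated convolution replaces downsampling
        contract_dilation=True,      # halve dilation of each stage's first block
        norm_cfg=norm_cfg),
    decode_head=dict(
        type='FCNHead',
        in_channels=2048,
        in_index=3,                  # fourth stage output
        channels=512,
        num_convs=2,                 # two conv-bn-relu blocks
        concat_input=True,           # concat head input, then fuse 2560 -> 512
        dropout_ratio=0.1,
        num_classes=num_classes,
        norm_cfg=norm_cfg,
        align_corners=False,         # assumption
        loss_decode=dict(type='CrossEntropyLoss', loss_weight=1.0)),
    auxiliary_head=dict(
        type='FCNHead',
        in_channels=1024,
        in_index=2,                  # third stage output
        channels=256,
        num_convs=1,                 # one conv-bn-relu block
        concat_input=False,
        dropout_ratio=0.1,
        num_classes=num_classes,
        norm_cfg=norm_cfg,
        align_corners=False,         # assumption
        loss_decode=dict(type='CrossEntropyLoss', loss_weight=0.4)))  # weight assumed
```

Both heads share the same head type and differ only in which stage they read from, how many convolutions they stack, and whether the input is concatenated back before the classifier.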