FCN: Fully Convolutional Networks for Semantic Segmentation
2022-07-05 18:23:00 【00000cj】
paper: Fully Convolutional Networks for Semantic Segmentation
Key contributions
A fully convolutional architecture is proposed: the final fully connected layers of a classification network are replaced with convolutional layers, so the network can process inputs of any size.
The output is upsampled by deconvolution or interpolation to restore it to the original input size.
Because the model is a classification network whose fully connected layers have been replaced with convolutional layers, the pretrained weights of the earlier layers can be reused for fine-tuning.
A skip architecture is proposed that combines shallow and deep features. It draws on both shallow spatial detail and deep semantic information, which makes the final segmentation result more refined.
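To see why replacing a fully connected layer with a convolution lets the network accept any input size, here is a minimal single-channel sketch in plain Python. The helper names `dense` and `conv2d_valid` and the toy integer weights are illustrative assumptions; they are not code from the paper or from MMSegmentation.

```python
# Toy illustration: a fully connected layer over a flattened k x k patch is the
# same computation as a k x k convolution that uses the reshaped weights.
# All names and values are hypothetical (single channel, integers only).

def dense(x_flat, w_flat, bias):
    """Fully connected layer with a single output unit."""
    return sum(w * x for w, x in zip(w_flat, x_flat)) + bias


def conv2d_valid(x, kernel, bias):
    """Single-channel 2D convolution (cross-correlation), stride 1, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(x) - kh + 1):
        row = []
        for j in range(len(x[0]) - kw + 1):
            acc = bias
            for di in range(kh):
                for dj in range(kw):
                    acc += kernel[di][dj] * x[i + di][j + dj]
            row.append(acc)
        out.append(row)
    return out


kernel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]  # weights viewed as a 3x3 kernel
w_flat = [w for row in kernel for w in row]     # same weights viewed as an FC layer
bias = 1

# Input whose size matches the FC layer: both views give the same single score.
small = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
fc_score = dense([v for row in small for v in row], w_flat, bias)
conv_score = conv2d_valid(small, kernel, bias)

# Larger input: the FC view no longer fits, but the conv view yields a score map.
large = [[(3 * r + 5 * c) % 7 for c in range(5)] for r in range(5)]
score_map = conv2d_valid(large, kernel, bias)

print(fc_score, conv_score)
print(len(score_map), "x", len(score_map[0]))
```

On the 3x3 input the two views agree, while the 5x5 input produces a 3x3 map of scores, one per window position. That spatial map is what a segmentation network upsamples back to the input resolution.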
Implementation details
Here we use MMSegmentation as an example. Compared with the original paper, the backbone is changed from VGG-16 to ResNet-50 and the skip architecture is replaced by dilated convolution. The official PyTorch implementation does the same.
Backbone
- In the original ResNet-50, the 4 stages have strides=(1, 2, 2, 2) and use no dilated convolution, i.e. dilations=(1, 1, 1, 1). In FCN, the 4 stages have strides=(1, 2, 1, 1) and dilations=(1, 1, 2, 4).
- There is also the setting contract_dilation=True: when the dilation is greater than 1, the dilation of the first block is contracted. Here the first bottleneck of the third and fourth stages halves the dilation rate. The first bottleneck of the third stage therefore uses no dilated convolution (dilation=2/2=1), and the first bottleneck of the fourth stage uses dilation=4/2=2.
- In addition, ResNetV1c is used here, i.e. the 7x7 convolution in the stem is replaced by three 3x3 convolutions.
- Finally, note the padding. In the original implementation, the 7x7 stem convolution uses padding=3 and the 3x3 convolutions use padding=1. In FCN, the last two stages use dilated convolution with stride=1, so keeping the input and output resolution the same requires padding=dilation. This follows from the convolution output-size formula out = floor((in + 2*padding - dilation*(kernel_size - 1) - 1) / stride) + 1 with kernel_size=3 and stride=1.
- Assuming batch_size=4 and a model input of shape (4, 3, 480, 480), the outputs of the four backbone stages are (4, 256, 120, 120), (4, 512, 60, 60), (4, 1024, 60, 60), and (4, 2048, 60, 60).
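The stage shapes and the effect of contract_dilation listed above can be checked with a little shape bookkeeping. This is a rough sketch under stated assumptions: each stage is modeled as a single 3x3 convolution carrying the stage stride with padding equal to dilation, the stem is modeled as three 3x3 convolutions followed by a stride-2 max-pool, and the helper names are hypothetical rather than MMSegmentation functions.

```python
# Shape bookkeeping for the dilated ResNet-50 backbone (no real tensors).
# Assumption: each stage is reduced to one 3x3 conv with the stage stride
# and padding == dilation; helper names are hypothetical.

def conv_out(size, kernel, stride=1, padding=0, dilation=1):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def block_dilations(num_blocks, dilation, contract_dilation=True):
    """Per-bottleneck dilation in one stage; the first block is halved if dilation > 1."""
    first = dilation // 2 if (contract_dilation and dilation > 1) else dilation
    return [first] + [dilation] * (num_blocks - 1)


def backbone_shapes(batch, size, strides, dilations,
                    channels=(256, 512, 1024, 2048)):
    # Stem: three 3x3 convs (the first with stride 2), then a 3x3 max-pool with stride 2.
    size = conv_out(size, kernel=3, stride=2, padding=1)
    size = conv_out(size, kernel=3, stride=1, padding=1)
    size = conv_out(size, kernel=3, stride=1, padding=1)
    size = conv_out(size, kernel=3, stride=2, padding=1)
    shapes = []
    for ch, s, d in zip(channels, strides, dilations):
        size = conv_out(size, kernel=3, stride=s, padding=d, dilation=d)
        shapes.append((batch, ch, size, size))
    return shapes


fcn_shapes = backbone_shapes(4, 480, strides=(1, 2, 1, 1), dilations=(1, 1, 2, 4))
plain_shapes = backbone_shapes(4, 480, strides=(1, 2, 2, 2), dilations=(1, 1, 1, 1))

# ResNet-50 has (3, 4, 6, 3) bottlenecks per stage.
stage3 = block_dilations(6, 2)
stage4 = block_dilations(3, 4)

print(fcn_shapes)
print(plain_shapes)
print(stage3, stage4)
```

Comparing `fcn_shapes` with `plain_shapes` shows the point of the modification: the dilated variant keeps the last two stages at 1/8 of the input resolution instead of shrinking to 1/32.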
FCN Head
- Take the output of the fourth ResNet stage, (4, 2048, 60, 60), and pass it through two conv-bn-relu blocks, Conv2d(2048, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) and Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False), to obtain (4, 512, 60, 60).
- Concatenate the output of the previous step, (4, 512, 60, 60), with the head input, (4, 2048, 60, 60), to obtain (4, 2560, 60, 60).
- Apply one more conv-bn-relu block, Conv2d(2560, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False), to obtain (4, 512, 60, 60).
- Apply dropout with dropout_ratio=0.1.
- Finally, Conv2d(512, num_classes, kernel_size=(1, 1), stride=(1, 1)) produces the final model output, (4, num_classes, 60, 60). Note that the number of classes here includes the background.
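The head steps above can be traced as a sequence of NCHW shapes. The sketch below only tracks shapes, not real tensors, and its function names (`conv_shape`, `concat_channels`, `fcn_head_shapes`) are hypothetical helpers, not MMSegmentation APIs.

```python
# Shape trace of the FCN head: two 3x3 convs, concat with the input,
# one fusing 3x3 conv, dropout, then a 1x1 classifier.
# Helper names are hypothetical; no real tensors are involved.

def conv_shape(shape, out_channels, kernel, stride=1, padding=0):
    """NCHW output shape of a Conv2d with dilation 1."""
    n, _, h, w = shape

    def out(size):
        return (size + 2 * padding - (kernel - 1) - 1) // stride + 1

    return (n, out_channels, out(h), out(w))


def concat_channels(a, b):
    """Concatenate two NCHW shapes along the channel axis."""
    assert a[0] == b[0] and a[2:] == b[2:], "batch and spatial sizes must match"
    return (a[0], a[1] + b[1], a[2], a[3])


def fcn_head_shapes(x, num_classes, channels=512):
    trace = [x]
    y = conv_shape(x, channels, kernel=3, padding=1)  # conv-bn-relu 1
    y = conv_shape(y, channels, kernel=3, padding=1)  # conv-bn-relu 2
    trace.append(y)
    y = concat_channels(y, x)                         # concat with the head input
    trace.append(y)
    y = conv_shape(y, channels, kernel=3, padding=1)  # fusing conv-bn-relu
    trace.append(y)                                   # dropout keeps the shape
    y = conv_shape(y, num_classes, kernel=1)          # 1x1 classifier
    trace.append(y)
    return trace


head_trace = fcn_head_shapes((4, 2048, 60, 60), num_classes=2)
for shape in head_trace:
    print(shape)
```

Every layer in the head uses stride 1 with padding chosen to preserve the size, so only the channel count changes and the prediction stays at 60x60 until it is resized for the loss.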
Loss
- The output of the previous step, (4, 2, 60, 60) when num_classes=2, is resized to the input size by bilinear interpolation, giving (4, 2, 480, 480).
- Cross-entropy loss is then computed on the resized output.
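As a rough illustration of this step, the sketch below upsamples per-class logit maps with bilinear interpolation and averages the per-pixel cross-entropy. It is plain Python on nested lists, not the MMSegmentation code path. It assumes the half-pixel sampling convention (what PyTorch calls align_corners=False), which the text above does not specify, and all function names are hypothetical.

```python
import math

# Assumption: half-pixel sampling convention (align_corners=False style).
# Function names are hypothetical helpers for illustration only.

def bilinear_resize(grid, out_h, out_w):
    """Bilinear resize of a 2D list of floats."""
    in_h, in_w = len(grid), len(grid[0])

    def source(dst, in_size, out_size):
        # Map an output index to a (low, high, fraction) triple in the input.
        src = max((dst + 0.5) * in_size / out_size - 0.5, 0.0)
        lo = min(int(src), in_size - 1)
        hi = min(lo + 1, in_size - 1)
        return lo, hi, src - lo

    out = []
    for i in range(out_h):
        y0, y1, fy = source(i, in_h, out_h)
        row = []
        for j in range(out_w):
            x0, x1, fx = source(j, in_w, out_w)
            top = grid[y0][x0] * (1 - fx) + grid[y0][x1] * fx
            bottom = grid[y1][x0] * (1 - fx) + grid[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


def cross_entropy(logits, target):
    """Cross-entropy for one pixel: -log softmax(logits)[target]."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[target]


def segmentation_loss(class_maps, labels):
    """Upsample each class map to the label size, then average per-pixel cross-entropy."""
    h, w = len(labels), len(labels[0])
    upsampled = [bilinear_resize(m, h, w) for m in class_maps]
    total = 0.0
    for i in range(h):
        for j in range(w):
            total += cross_entropy([m[i][j] for m in upsampled], labels[i][j])
    return total / (h * w)


# Tiny stand-in for the (num_classes, 60, 60) -> (num_classes, 480, 480) case.
class_maps = [[[2.0, 0.0], [0.0, 2.0]],   # class 0 logits on a 2x2 grid
              [[0.0, 2.0], [2.0, 0.0]]]   # class 1 logits
labels = [[0, 0, 1, 1], [0, 0, 1, 1], [1, 1, 0, 0], [1, 1, 0, 0]]
loss = segmentation_loss(class_maps, labels)
print(round(loss, 4))
```

Resizing the logits rather than the labels means the loss is evaluated at the full label resolution, so every annotated pixel contributes.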
Auxiliary Head
- Take the output of the third ResNet stage, (4, 1024, 60, 60), and pass it through one conv-bn-relu block, Conv2d(1024, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False), to obtain (4, 256, 60, 60).
- Apply dropout with dropout_ratio=0.1.
- Conv2d(256, num_classes, kernel_size=(1, 1), stride=(1, 1)) then produces the output of this branch, (4, num_classes, 60, 60).