[Semantic Segmentation] Fully Attentional Network for Semantic Segmentation
2022-07-29 06:03:00 【Dull cat】

This paper was accepted at AAAI 2022.
1. Background and motivation
In semantic segmentation, non-local (NL) methods play an important role in capturing long-range dependencies, and they can be roughly divided into two variants: channel non-local and spatial non-local. Both, however, share a problem the authors call attention missing.
Take channel attention as an example: it can find the association between each channel and every other channel, but in the process the spatial features are integrated away, so the connections between different positions are missing.
Take spatial attention as an example: it can find the relationship between every pair of positions, but the features of all channels are integrated away, so the differences between channels are missing.
The authors argue that this attention missing problem weakens the preservation of 3D context information, and that each of the two attention variants has its own drawback.
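To make attention missing concrete, here is a minimal PyTorch sketch of my own (shapes only, no learned projections) showing where each variant integrates the other dimension away:

```python
import torch

B, C, H, W = 1, 4, 8, 8
x = torch.randn(B, C, H, W)
flat = x.view(B, C, H * W)  # (B, C, HW)

# Channel non-local: a C x C affinity map. The HW spatial positions are
# summed away inside the matmul, so position-specific detail is lost.
channel_attn = torch.bmm(flat, flat.transpose(1, 2))  # (B, C, C)

# Spatial non-local: an HW x HW affinity map. The C channels are summed
# away inside the matmul, so channel-specific detail is lost.
spatial_attn = torch.bmm(flat.transpose(1, 2), flat)  # (B, HW, HW)

print(channel_attn.shape)  # torch.Size([1, 4, 4])
print(spatial_attn.shape)  # torch.Size([1, 64, 64])
```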
To test this conjecture, the authors plot in Figure 2 the accuracy of each category on the Cityscapes validation set.
(1) The effect of a single attention:
channel NL performs well on large objects, such as truck, bus, and train.
spatial NL performs well on small or slender objects, such as pole, rider, and mbike.
There are also some categories on which both perform poorly.
(2) The effect of combined attention:
To verify the effect of using the two attentions at the same time, the authors also compare the parallel DANet (Dual NL) and the serial channel-spatial NL (CS NL).
When two NLs are used, the accuracy of most categories grows beyond that of a single NL.
However, Dual NL still loses a lot of performance on large objects (truck, train), and CS NL gets poor IoU on slender categories (pole, mbike).
The authors therefore conclude that a composite attention structure can only trade off between channel and spatial attention to improve performance: the attention missing problem impairs the representational power of the features and cannot be solved by simply stacking different NL modules.

Inspired by this, the authors propose a new non-local module, the Fully Attentional block (FLA), which retains full attention efficiently.
2. Method
The basic idea of the paper is:
When computing the channel attention map, use global context features to preserve the spatial responses; this achieves full attention and high computational efficiency within a single attention computation. Figure 1c shows the overall structure.
- First, let each spatial position capture the feature response of the global context.
- Then, use a self-attention mechanism to capture the fully attentional similarities between any two channels and their corresponding spatial positions.
- Finally, use the fully attentional similarities to re-weight the channel maps.
① channel NL, shown in Figure 1a: the generated attention map has size $C \times C$, i.e. the attention weight between each channel and every other channel.
② spatial NL, shown in Figure 1b: the generated attention map has size $HW \times HW$, i.e. the attention weight between each pixel and every other pixel.
③ FLA, shown in Figure 1c: the generated attention map has size $(H+W) \times C \times C$, where $C \times C$ is the channel attention weight and $H+W$ covers every row ($H$ rows in total) and every column ($W$ columns in total).
Fully Attentional Block (FLA) structure:
The FLA structure is shown in Figure 3. The input is a feature map $F_{in} \in R^{C \times H \times W}$.
Generate Q:
- First, the input feature map is fed into the two parallel construction paths shown in the figure, both of which consist of global average pooling + Linear; to obtain different outputs, the authors choose different pooling kernels.
- To obtain rich global context information, the authors use kernels whose heights and widths differ, i.e. rectangular kernels.
- To keep the information interaction between each spatial position and the corresponding positions on the same horizontal or vertical axis, i.e. to maintain spatial continuity while computing channel relationships, the two parallel paths generating Q use pooling windows of size $H \times 1$ and $1 \times W$ respectively.
- After obtaining $Q_W$ and $Q_H$ with different dimensions, they are expanded horizontally and vertically to get features of the same dimension. It is precisely this dimension-specific feature extraction that preserves the spatial characteristics of the corresponding dimension.
- Finally, the two features are sliced and fused, as in the sketch below.
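A minimal sketch of the Q branch, assuming PyTorch; the variable names and the exact placement of the Linear layer are my reading of the description above, not the authors' released code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

B, C, H, W = 1, 64, 32, 48
x = torch.randn(B, C, H, W)
linear_h, linear_w = nn.Linear(C, C), nn.Linear(C, C)  # the Linear in each path

# Path 1: a 1 x W pooling window -> one global context vector per row.
q_h = F.avg_pool2d(x, kernel_size=(1, W)).view(B, C, H)  # (B, C, H)
q_h = linear_h(q_h.permute(0, 2, 1))                     # (B, H, C)

# Path 2: an H x 1 pooling window -> one global context vector per column.
q_w = F.avg_pool2d(x, kernel_size=(H, 1)).view(B, C, W)  # (B, C, W)
q_w = linear_w(q_w.permute(0, 2, 1))                     # (B, W, C)

# Expand each pooled context so that every column slice, resp. row slice,
# of the feature map can attend to it in the next step.
q_h = q_h.repeat_interleave(W, dim=0)  # (B*W, H, C)
q_w = q_w.repeat_interleave(H, dim=0)  # (B*H, W, C)
```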
Generate K:
The input feature is split along the H dimension into H slices, each of size $R^{C \times W}$, and each slice is then merged with Q by matrix multiplication.
This yields, for each position, the attention weights with respect to the pixels in the same row and the same column, i.e. the full attention map $A \in R^{(H+W) \times C \times C}$.

- $A_{i,j}$ is the degree of relevance between the $i^{th}$ and the $j^{th}$ channel at a specific position.
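A sketch of how K and the full attention map could be formed, continuing in the same spirit (my reconstruction; the `q_h`/`q_w` placeholders stand in for the expanded Q-branch outputs above, and the softmax is the usual self-attention normalization):

```python
import torch

B, C, H, W = 1, 64, 32, 48
x = torch.randn(B, C, H, W)
q_h = torch.randn(B * W, H, C)  # placeholder for the expanded Q-branch output
q_w = torch.randn(B * H, W, C)  # placeholder for the expanded Q-branch output

# K is the input feature itself, cut into slices along each spatial axis:
# W slices of size (C, H) and H slices of size (C, W).
k_h = x.permute(0, 3, 1, 2).reshape(B * W, C, H)  # (B*W, C, H)
k_w = x.permute(0, 2, 1, 3).reshape(B * H, C, W)  # (B*H, C, W)

# Merging each slice with Q gives one C x C channel-affinity map per row
# and per column: (H + W) maps in total, i.e. A in R^{(H+W) x C x C}.
attn_h = torch.bmm(k_h, q_h).softmax(dim=-1)  # (B*W, C, C)
attn_w = torch.bmm(k_w, q_w).softmax(dim=-1)  # (B*H, C, C)
```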
Generate V:

The input feature is split along the W dimension into W slices, each of size $R^{C \times H}$, and matrix-multiplied with $A$ to obtain the attended features. A sketch of the complete block follows below.
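Putting the three parts together, here is a minimal PyTorch reconstruction of the whole FLA block based on the description above (a sketch, not the authors' released code; the learnable residual scale `gamma` and the symmetric treatment of the row and column directions are my assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FullyAttentionalBlock(nn.Module):
    """Sketch of FLA: channel attention that preserves spatial responses."""

    def __init__(self, channels: int):
        super().__init__()
        self.linear_h = nn.Linear(channels, channels)
        self.linear_w = nn.Linear(channels, channels)
        self.gamma = nn.Parameter(torch.zeros(1))  # learnable residual scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.size()
        # Q: global contexts from rectangular pooling windows (1xW and Hx1).
        q_h = self.linear_h(F.avg_pool2d(x, (1, w)).view(b, c, h).permute(0, 2, 1))
        q_w = self.linear_w(F.avg_pool2d(x, (h, 1)).view(b, c, w).permute(0, 2, 1))
        # K (and V): the input itself, sliced along each spatial axis.
        k_h = x.permute(0, 3, 1, 2).reshape(b * w, c, h)  # W slices of (C, H)
        k_w = x.permute(0, 2, 1, 3).reshape(b * h, c, w)  # H slices of (C, W)
        # A: one C x C channel-affinity map per row and per column.
        attn_h = torch.bmm(k_h, q_h.repeat_interleave(w, dim=0)).softmax(dim=-1)
        attn_w = torch.bmm(k_w, q_w.repeat_interleave(h, dim=0)).softmax(dim=-1)
        # Re-weight V with the full attention and fold the slices back.
        out_h = torch.bmm(attn_h, k_h).view(b, w, c, h).permute(0, 2, 3, 1)
        out_w = torch.bmm(attn_w, k_w).view(b, h, c, w).permute(0, 2, 1, 3)
        return self.gamma * (out_h + out_w) + x

# Quick shape check:
block = FullyAttentionalBlock(64)
print(block(torch.randn(2, 64, 32, 48)).shape)  # torch.Size([2, 64, 32, 48])
```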


3. Results



Visualization:
Both channel NL and FLA highlight the semantic regions, and they are relatively continuous inside large objects.
The attention feature maps of FLA are neater and finer than those of channel NL, e.g. on the distant poles and on object boundaries.
FLA also distinguishes different categories better, such as the bus and the car in the third row.
This supports the claim that the proposed FLA can capture and exploit spatial similarities inside the channel attention feature maps, achieving full attention.
