[Semantic Segmentation] Fully Attentional Network for Semantic Segmentation
2022-07-29 06:03:00 【Dull cat】

This paper was accepted to AAAI 2022.
1. Background and Motivation
In semantic segmentation, non-local (NL) methods are effective at capturing long-range dependencies. They fall roughly into two variants, channel non-local and spatial non-local, but both share a problem: attention missing.
Take channel attention as an example: it captures the affinity between each channel and every other channel, but the spatial features are integrated away in the computation, so the relationships between different positions are lost.
Take spatial attention as an example: it captures the relationship between every pair of positions, but all the channel features are likewise integrated away, so the differences between channels are lost.
The authors argue that this attention-missing problem weakens the encoding of 3D context information, leaving each kind of attention with its own blind spots.
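To make "attention missing" concrete, here is a minimal shape-level sketch of the two NL variants (my own illustration, not the paper's code): in each case the affinity is computed by summing over the other dimension, which is exactly where the attention goes missing.

```python
import torch

# Toy sizes, for illustration only.
b, c, h, w = 2, 64, 32, 48
x = torch.randn(b, c, h, w).flatten(2)  # (b, c, h*w)

# Channel NL: a C x C map; the dot product sums over all HW positions,
# so positional distinctions are integrated away.
channel_map = torch.bmm(x, x.transpose(1, 2)).softmax(dim=-1)  # (b, c, c)

# Spatial NL: an HW x HW map; the dot product sums over all C channels,
# so channel distinctions are integrated away.
spatial_map = torch.bmm(x.transpose(1, 2), x).softmax(dim=-1)  # (b, h*w, h*w)
```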
To test this conjecture, the authors plot the per-category accuracy on the Cityscapes validation set in Figure 2.
1. The effect of a single attention:
- channel NL performs well on large objects such as truck, bus, and train
- spatial NL performs well on small or slender objects such as pole, rider, and mbike
- on some other categories, both perform poorly
2. The effect of combined attention:
To examine what happens when the two attentions are used together, the authors also compare the parallel DANet (Dual NL) and a serial channel-spatial NL (CS NL).
- When two NLs are used, the accuracy of most categories is no worse than with a single NL
- However, Dual NL degrades considerably on large categories (truck, train), and CS NL has poor IoU on slender categories (pole, mbike).
The authors therefore argue that a composite attention structure can only pick up performance from either channel or spatial attention; the attention-missing problem impairs the feature representation and cannot be fixed by simply stacking different NL modules.

Motivated by this, the authors propose a new non-local module, the Fully Attentional block (FLA), which retains both kinds of attention effectively.
2. Method
The basic idea of the paper:
When computing the channel attention map, use global context features to preserve the spatial responses; this achieves full attention within a single attention map while staying computationally efficient. Figure 1c shows the overall structure.
- First, let each spatial position capture the feature response of the global context
- Then, use a self-attention mechanism to capture the fully attentional similarity between any two channels at the corresponding spatial positions
- Finally, use the fully attentional similarities to re-weight the channel maps.
① The channel NL pipeline is shown in Figure 1a: the generated attention map has size $C \times C$, i.e., an attention weight between each channel and every other channel.
② The spatial NL pipeline is shown in Figure 1b: the generated attention map has size $HW \times HW$, i.e., an attention weight between each pixel and every other pixel.
③ The FLA pipeline is shown in Figure 1c: the generated attention map has size $(C \times C)(H+W)$, where $C \times C$ is the channel attention weight and $H+W$ covers every row ($H$ in total) and every column ($W$ in total).
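For a rough sense of scale, assume $C = 512$ and a 97×97 feature map (a 769×769 Cityscapes crop at output stride 8; these numbers are my assumption, not taken from the paper):

```python
C, H, W = 512, 97, 97
print(f"channel NL: {C * C:,}")            #    262,144 weights
print(f"spatial NL: {(H * W) ** 2:,}")     # 88,529,281 weights
print(f"FLA:        {(H + W) * C * C:,}")  # 50,855,936 weights
```

Under these assumed sizes, FLA keeps per-row and per-column spatial cues at well under the cost of a full spatial map.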
Fully Attentional Block (FLA) structure:
The FLA structure is shown in Figure 3. The input is a feature map $F_{in} \in R^{C \times H \times W}$
Generate Q:
- First, the input feature map is fed into the bottom parallel paths in the figure (the construction step); both parallel paths consist of global average pooling + Linear, and the authors choose different pooling kernels so that the two paths produce different outputs.
- To obtain rich global context information, the kernels differ in height and width, i.e., they are rectangular kernels
- To keep the information interaction between each spatial position and the positions on the same row or the same column, that is, to maintain spatial continuity while computing the channel relationships, the two parallel paths that generate Q use pooling windows of size $H \times 1$ and $1 \times W$ respectively
- After obtaining $Q_W$ and $Q_H$ of different dimensions, they are expanded horizontally and vertically into features of the same dimension; it is precisely this dimension-specific feature extraction that preserves the spatial features of the corresponding dimension
- Finally, the two features are sliced and fused (a sketch follows below)
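Below is a minimal PyTorch sketch of the Q path as I read it; the variable names, toy sizes, and exact squeeze/transpose layout are my assumptions rather than the authors' reference code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

b, c, h, w = 2, 64, 32, 48                   # toy sizes
x = torch.randn(b, c, h, w)                  # F_in
linear_h, linear_w = nn.Linear(c, c), nn.Linear(c, c)

# 1 x W pooling window: one global C-vector per row    -> Q_H, (b, h, c)
q_h = linear_h(F.avg_pool2d(x, (1, w)).squeeze(-1).transpose(1, 2))
# H x 1 pooling window: one global C-vector per column -> Q_W, (b, w, c)
q_w = linear_w(F.avg_pool2d(x, (h, 1)).squeeze(-2).transpose(1, 2))
```

Here I keep $Q_H$ and $Q_W$ as separate tensors instead of explicitly expanding and fusing them; the next step consumes them per row and per column, which has the same effect.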
Generate K:
The input feature is split along the H dimension into H slices, each of size $R^{C \times W}$, and then merged with Q.
This yields, for each position, the attention weights over the positions in the same row and the same column, giving the full attention map $A \in R^{(H+W) \times C \times C}$.

- $A_{i,j}$ is the degree of association between the $i^{th}$ and $j^{th}$ channels at a specific position
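Continuing the sketch above (`x`, `q_h`, `q_w` still in scope): slice the input into row and column slices and compute channel-channel affinities against the pooled context. The expand/reshape layout that keeps batch and slice indices aligned is my assumption.

```python
# H row-slices of size C x W, and W column-slices of size C x H.
k_w = x.permute(0, 2, 1, 3).reshape(b * h, c, w)   # (b*h, c, w)
k_h = x.permute(0, 3, 1, 2).reshape(b * w, c, h)   # (b*w, c, h)

# Tile the pooled context so each slice meets the matching context,
# keeping batch and slice indices aligned.
ctx_w = q_w.unsqueeze(1).expand(b, h, w, c).reshape(b * h, w, c)
ctx_h = q_h.unsqueeze(1).expand(b, w, h, c).reshape(b * w, h, c)

# (H + W) channel-affinity maps, each C x C: A in R^{(H+W) x C x C}.
a_w = torch.bmm(k_w, ctx_w).softmax(dim=-1)        # (b*h, c, c)
a_h = torch.bmm(k_h, ctx_h).softmax(dim=-1)        # (b*w, c, c)
```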
Generate V:

The input feature is split along the W dimension into W slices, each of size $R^{C \times H}$; after matrix multiplication with A, we obtain the attended features
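Finishing the sketch: the same slices serve as V and are re-weighted by A, then folded back to (b, c, h, w). The text above only mentions the W-direction product; folding in both directions (to match the H+W attention maps) and the learnable residual scale are my assumptions.

```python
# Re-weight the channel slices with the full attention maps (V = the slices).
out_w = torch.bmm(a_w, k_w).reshape(b, h, c, w).permute(0, 2, 1, 3)  # (b, c, h, w)
out_h = torch.bmm(a_h, k_h).reshape(b, w, c, h).permute(0, 2, 3, 1)  # (b, c, h, w)

# Fuse both directions and add a residual, scaled by a learnable gamma.
gamma = nn.Parameter(torch.zeros(1))
out = gamma * (out_w + out_h) + x                                    # (b, c, h, w)
```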


3. Results



Visualization:
channel NL and FLA both highlight the semantic regions and are relatively consistent inside large objects.
FLA's attention maps are cleaner and finer than channel NL's, e.g., around the distant pole and the object boundaries.
FLA also distinguishes different categories better, such as the bus and car in the third row.
This demonstrates that the proposed FLA captures and exploits the spatial similarities within the channel attention feature maps, achieving full attention.
