Understanding Attention Mechanisms in CV in Depth: CBAM
2022-07-08 02:18:00 【Strawberry sauce toast】
A Summary of Attention Mechanisms in CV (Part 2): CBAM
CBAM: Convolutional Block Attention Module
Paper link: CBAM (ECCV 2018)
1. Abstract
1.1 CBAM Summary
Given an intermediate feature map, our module sequentially infers attention maps along two separate dimensions, channel and spatial, then the attention maps are multiplied to the input feature map for adaptive feature refinement.
Unlike the SE module, CBAM combines channel attention with spatial attention. The authors argue that channel attention determines "what" is important, while spatial attention determines "where" is important.
1.2 The Role of Attention Mechanisms in CV
In addition, the Introduction concisely states the role of the attention mechanism:
Attention not only tells where to focus, it also improves the representation of interests.
Our goal is to increase representation power by using attention mechanism: focusing on important features and suppressing unnecessary ones.
In short, attention mechanisms improve the feature representation ability of the network.
1.3 Advantages of the CBAM Module
CBAM has the following two advantages:
- Compared with SE, it improves the channel attention module and adds a spatial attention module;
- Compared with BAM, it is not limited to the bottleneck; it can be inserted after any intermediate convolutional block, making it a plug-and-play attention module.
2. Module Details
The structure of the CBAM module is shown in the figure below:
The following, based on Section 3 of the paper, elaborates on the implementation details of the CBAM module.
2.1 Channel Attention Module: Focusing on "What"
The difference from the SE module is that the authors add a max-pooling branch, and the AvgPool and MaxPool branches share the same multi-layer perceptron (MLP) to reduce the number of learnable parameters.
Therefore, the channel attention of CBAM can be expressed by the following formula:
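$$M_c(F) = \sigma\big(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))\big)$$
where $F$ is the input feature map, $\sigma$ denotes the sigmoid function, and the shared MLP consists of a dimensionality-reduction layer followed by a dimensionality-restoration layer.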
2.2 Spatial Attention Module: Focusing on "Where"
First, average pooling and max pooling are applied along the channel dimension, each producing an $H \times W$ feature map. The two maps are concatenated, and a convolutional layer with 2 input channels and 1 output channel extracts the spatial attention. The formula is as follows:
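$$M_s(F) = \sigma\big(f^{7\times 7}([\mathrm{AvgPool}(F);\,\mathrm{MaxPool}(F)])\big)$$
where $f^{7\times 7}$ denotes a convolution with a $7\times 7$ kernel and $[\,\cdot\,;\,\cdot\,]$ denotes concatenation along the channel dimension.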
2.3 Arrangement of attention modules
Channel attention and spatial attention are combined to obtain the refined, re-weighted feature map.
The possible orders and modes of combining channel and spatial attention (as shown in Figure 1) are:
- channel attention first, then spatial attention
- spatial attention first, then channel attention
- sequential arrangement
- parallel arrangement
The authors compare these orders and modes through ablation experiments; the sequential arrangement with channel attention applied first performs best.
2.4 Usage
Combination with a residual network: CBAM is inserted into each residual block and applied to the output of the convolutional branch before the addition with the identity shortcut (see the sketch below).
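Below is a minimal sketch of this placement, not the authors' official code: the class name BasicBlockCBAM and the layer configuration are illustrative assumptions, and the CBAM module it uses is the one implemented in Section 3.

from torch import nn

# Illustrative sketch only: a ResNet-style basic block with CBAM applied to the
# convolutional branch before the identity shortcut is added. `CBAM` is the module
# implemented in Section 3 below; the class name and layer configuration here are
# assumptions for demonstration.
class BasicBlockCBAM(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)
        self.cbam = CBAM(channels, reduction=16, kernel_size=7, padding=3)

    def forward(self, x):
        identity = x
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = self.cbam(out)               # refine features before the residual addition
        out = self.relu(out + identity)
        return out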
3. PyTorch Implementation
import torch
from torch import nn


class ChannelAttentionModule(nn.Module):
    """Channel attention: a shared MLP over average- and max-pooled channel descriptors."""
    def __init__(self, channel, reduction=16):
        super(ChannelAttentionModule, self).__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d((1, 1))
        self.max_pool = nn.AdaptiveMaxPool2d((1, 1))
        # 1x1 convolutions act as the shared MLP (reduce channels, then restore them)
        self.shared_MLP = nn.Sequential(
            nn.Conv2d(channel, channel // reduction, kernel_size=1, stride=1, padding=0, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channel // reduction, channel, kernel_size=1, stride=1, padding=0, bias=False)
        )
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        avg_out = self.shared_MLP(self.avg_pool(x))
        max_out = self.shared_MLP(self.max_pool(x))
        out = avg_out + max_out
        return self.sigmoid(out)  # channel attention map, shape (B, C, 1, 1)


class SpatialAttentionModule(nn.Module):
    """Spatial attention: a convolution over channel-wise average and max maps."""
    def __init__(self, kernel_size=7, padding=3):
        super(SpatialAttentionModule, self).__init__()
        self.conv2d = nn.Conv2d(in_channels=2, out_channels=1,
                                kernel_size=kernel_size, stride=1, padding=padding, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        avg_out = torch.mean(x, dim=1, keepdim=True)    # (B, 1, H, W)
        max_out, _ = torch.max(x, dim=1, keepdim=True)  # torch.max returns (values, indices)
        out = torch.cat([avg_out, max_out], dim=1)      # (B, 2, H, W)
        out = self.conv2d(out)
        return self.sigmoid(out)  # spatial attention map, shape (B, 1, H, W)


class CBAM(nn.Module):
    """CBAM: channel attention followed by spatial attention."""
    def __init__(self, channel, reduction=16, kernel_size=7, padding=3):
        super(CBAM, self).__init__()
        self.channel_attention = ChannelAttentionModule(channel, reduction)
        self.spatial_attention = SpatialAttentionModule(kernel_size, padding)

    def forward(self, x):
        out = self.channel_attention(x) * x
        out = self.spatial_attention(out) * out
        return out
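A minimal usage sketch (the input tensor shape is an arbitrary example):

if __name__ == "__main__":
    x = torch.randn(2, 64, 32, 32)  # (batch, channels, height, width)
    cbam = CBAM(channel=64, reduction=16, kernel_size=7, padding=3)
    y = cbam(x)
    print(y.shape)  # torch.Size([2, 64, 32, 32]): same shape as the input

Because the attention maps are applied multiplicatively, the output keeps the input shape, which is what makes the module plug-and-play.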