
[Channel Attention Mechanism] SENet

2022-07-28 15:46:00 Coke Daniel

Summary

Author's talk: link
The core of SENet is its channel attention mechanism, the SE Block. It explicitly models the interdependencies between feature channels, lets the network automatically learn the importance of each channel, and then uses this importance to enhance useful features and suppress useless ones (a feature recalibration strategy). It is similar to the attention-gate in attention-unet that I studied before, and is likewise a plug-and-play structure.

Details

SE Block

The core of a CNN is convolution, which aggregates information both spatially (spatial) and across feature channels (channel-wise). Previous work mostly tried to improve network performance from the spatial dimension; this paper instead starts from the channel dimension and proposes the SE Block, as shown in the figure below:
[Figure: the SE Block (Squeeze-and-Excitation) structure]
The first step is the Squeeze operation, which compresses the features along the spatial dimensions: each two-dimensional feature map is squeezed into a single real number. This real number has, to some extent, a global receptive field, and the output dimension matches the number of input feature channels.
The second step is the Excitation operation, which generates a weight for each feature map; this weight expresses the importance of the corresponding channel.
The last step is the Reweight operation: the weights output by the Excitation step are multiplied onto the original features channel by channel, completing the recalibration of the original features along the channel dimension.
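In formulas (as given in the SENet paper, where $U \in \mathbb{R}^{H \times W \times C}$ is the input feature map, $u_c$ its $c$-th channel, $r$ the reduction ratio, $\delta$ the ReLU and $\sigma$ the sigmoid):

$z_c = F_{sq}(u_c) = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} u_c(i, j)$  (Squeeze)

$s = F_{ex}(z, W) = \sigma\left(W_2\, \delta(W_1 z)\right), \quad W_1 \in \mathbb{R}^{(C/r) \times C},\ W_2 \in \mathbb{R}^{C \times (C/r)}$  (Excitation)

$\tilde{x}_c = F_{scale}(u_c, s_c) = s_c \cdot u_c$  (Reweight)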

Take embedding the SE Block into ResNet as an example:
[Figure: the SE Block embedded into a ResNet module (SE-ResNet)]
In practice, the Squeeze operation is a global average pooling, and the Excitation operation consists of two fully connected layers: the first reduces the number of channels, and the second restores it to the original number. The final activation function is a sigmoid, which produces the weight coefficients; multiplying these coefficients with the input yields the new feature map that takes channel importance into account.

Two fully connected layers are used instead of one because:
1) they provide more nonlinearity and can better fit the complex correlations between channels;
2) they greatly reduce the number of parameters and the amount of computation.
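As a rough comparison (assuming C = 256 input channels and the default reduction ratio r = 16): a single C → C fully connected layer would need C² = 65,536 weights, while the two-layer bottleneck C → C/r → C needs only 2C²/r = 256·16 + 16·256 = 8,192 weights per SE block.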

Code implementation

Implementation 1

import paddle
import paddle.nn as nn

# Use fully connected layers: since Linear operates on the last dimension, the tensor dimensions need some extra handling
class SELayer1(nn.Layer):
    def __init__(self,in_channels,reduction=16):
        super(SELayer1, self).__init__()
        self.avg_pool=nn.AdaptiveAvgPool2D(1)
        self.fc=nn.Sequential(
            nn.Linear(in_channels,in_channels // reduction),
            nn.ReLU(),
            nn.Linear(in_channels // reduction, in_channels),
            nn.Sigmoid()
        )

    def forward(self,x):
        # x:[n,c,h,w]
        # h and w are not needed: after the pooling operation the spatial size is already 1*1
        n, c, _, _=x.shape
        y=self.avg_pool(x).flatten(1) # y:[n,c*h*w]=[n,c]
        y=self.fc(y).reshape([n,c,1,1]) # y:[n,c,1,1]
        # This step is optional; the framework would broadcast y automatically
        y=y.expand_as(x) # y:[n,c,h,w]
        out=x*y
        return out
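A quick sanity check of SELayer1 (a minimal usage sketch; the shapes are only an example):

x = paddle.randn([2, 64, 32, 32])            # input: [n, c, h, w]
se = SELayer1(in_channels=64, reduction=16)
y = se(x)
print(y.shape)                               # [2, 64, 32, 32]: same shape, each channel rescaled by its weight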

Implementation 2

# Use 1*1 convolutions instead of fully connected layers, which avoids the extra handling of tensor dimensions
class SELayer2(nn.Layer):
    def __init__(self,in_channels,reduction=16):
        super(SELayer2, self).__init__()
        self.squeeze =nn.AdaptiveAvgPool2D(1)
        self.excitation=nn.Sequential(
            nn.Conv2D(in_channels, in_channels // reduction, 1, 1, 0),
            nn.ReLU(),
            nn.Conv2D(in_channels // reduction, in_channels, 1, 1, 0),
            nn.Sigmoid()
        )

    def forward(self,x):
        # x:[n,c,h,w]
        y=self.squeeze(x)    # y:[n,c,1,1]
        y=self.excitation(y) # channel weights, still [n,c,1,1]
        out=x*y
        return out

Adding SE Block to ResNet50
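In SE-ResNet, the SE layer sits in the residual branch of each block, right before the addition with the identity. Below is a minimal sketch of a bottleneck block with SE; the BottleneckWithSE class is a hypothetical illustration built on the SELayer2 defined above, not an official API.

class BottleneckWithSE(nn.Layer):
    # ResNet bottleneck (1*1 -> 3*3 -> 1*1) with channel recalibration by SE
    # applied to the residual branch before the addition, as in SE-ResNet
    expansion = 4

    def __init__(self, in_channels, channels, stride=1, reduction=16):
        super(BottleneckWithSE, self).__init__()
        out_channels = channels * self.expansion
        self.conv1 = nn.Conv2D(in_channels, channels, 1, bias_attr=False)
        self.bn1 = nn.BatchNorm2D(channels)
        self.conv2 = nn.Conv2D(channels, channels, 3, stride=stride, padding=1, bias_attr=False)
        self.bn2 = nn.BatchNorm2D(channels)
        self.conv3 = nn.Conv2D(channels, out_channels, 1, bias_attr=False)
        self.bn3 = nn.BatchNorm2D(out_channels)
        self.relu = nn.ReLU()
        self.se = SELayer2(out_channels, reduction)
        # projection shortcut when the identity shape does not match
        self.downsample = None
        if stride != 1 or in_channels != out_channels:
            self.downsample = nn.Sequential(
                nn.Conv2D(in_channels, out_channels, 1, stride=stride, bias_attr=False),
                nn.BatchNorm2D(out_channels),
            )

    def forward(self, x):
        identity = x
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.relu(self.bn2(self.conv2(out)))
        out = self.bn3(self.conv3(out))
        out = self.se(out)   # recalibrate channels before the residual addition
        if self.downsample is not None:
            identity = self.downsample(x)
        return self.relu(out + identity)

Replacing the bottleneck blocks of a standard ResNet50 (stem plus stages of [3, 4, 6, 3] such blocks) with this block gives an SE-ResNet50-style network.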


Copyright notice
This article was created by [Coke Daniel]. Please include a link to the original article when reprinting. Thanks.
https://yzsam.com/2022/209/202207281422507706.html