
SENet explained in detail, with Keras reproduction code

2022-08-04 07:03:00 hot-blooded chef

Paper: https://arxiv.org/pdf/1709.01507.pdf

Code: https://github.com/hujie-frank/SENet

1、Feature relationships between channels

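In recent years, convolutional neural networks have achieved major breakthroughs in many fields. The convolution kernel, the core of a convolutional neural network, is usually regarded as an aggregator that fuses spatial information and channel-wise information within a local receptive field. A convolutional neural network consists of a series of convolutional layers, nonlinear layers and downsampling layers, which lets it capture image features over a global receptive field in order to describe the image.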

Much work has already gone into improving network performance along the spatial dimension. It is natural to ask whether a network can also improve from another angle, for example by considering the relationships between feature channels. The authors of the paper built on this idea and proposed Squeeze-and-Excitation Networks (SENet for short). They do not introduce a new dimension for fusing feature channels; instead they use a brand-new feature recalibration strategy. Put simply, an added branch automatically learns the importance of each channel, and this importance is then used to boost useful features and suppress features that are of little use for the current task.

[Figure: SE block]

The figure above is a schematic of the SE module. Given an input X with C' channels, a series of convolutions and other transformations produces a feature map U with C channels. The structure that follows is somewhat similar to ResNet, but also differs from it considerably.

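First comes the Squeeze operation. Features are compressed along the spatial dimensions, turning each two-dimensional feature channel into a single real number. This number has, to some extent, a global receptive field, and the output dimension matches the number of input feature channels. It characterizes the global distribution of responses over the feature channels and lets layers close to the input also obtain a global receptive field, which is very useful in many tasks.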

Second comes the Excitation operation, which resembles the gating mechanism in RNNs. A parameter W generates a weight for each feature channel; W is what controls the importance of each channel in U.

Last comes the Reweight operation. The weights output by Excitation are treated as the importance of each feature channel after feature selection, and they are applied to the original features U channel by channel through multiplication, completing the recalibration of the original features along the channel dimension.
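
For reference, the three steps above can be written compactly in the paper's notation. This is a sketch under stated assumptions: H and W denote the spatial size of U, and the two-layer form of the excitation function (weights W_1 and W_2, ReLU written as δ, sigmoid written as σ, reduction ratio r) anticipates the structure described in section 2.

```latex
% Squeeze: global average over the spatial positions of channel c
z_c = F_{sq}(u_c) = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} u_c(i, j)

% Excitation: bottleneck of two fully connected layers
s = F_{ex}(z, W) = \sigma\bigl(W_2 \, \delta(W_1 z)\bigr),
\qquad W_1 \in \mathbb{R}^{\frac{C}{r} \times C}, \quad W_2 \in \mathbb{R}^{C \times \frac{C}{r}}

% Reweight (scale): channel-wise multiplication
\tilde{x}_c = F_{scale}(u_c, s_c) = s_c \, u_c
```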

2、Network structure in detail

Unlike GoogLeNet or ResNet, the SE module does not propose a new network architecture, so it can be flexibly embedded into existing mainstream networks.

[Figure: SE network]

The left side of the figure above shows an example of embedding the SE module into an Inception structure.

Here Global pooling corresponds to the Squeeze operation: it compresses the input feature map to 1 x 1 x C. Two fully connected layers then form a Bottleneck structure that models the correlation between channels, and the final output keeps the same dimension, 1 x 1 x C.

The first fully connected layer uses ReLU as its activation function and the second uses Sigmoid. Sigmoid maps its input into the range (0, 1), so after the SE module the weights of useful features tend toward 1 while the weights of useless features tend toward 0. The final Scale operation multiplies these output weights with each channel of the original features, which strengthens the useful channels and suppresses the useless ones.
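
The data flow described above (Global pooling, FC + ReLU, FC + Sigmoid, Scale) can be sketched in plain Python so that it runs without any framework. This is a minimal illustration, not the authors' Caffe code and not the Keras reproduction: the function names, the nested-list layout `[C][H][W]`, and the toy weights are all assumptions made for the example.

```python
import math


def squeeze(u):
    """Global average pooling: u is [C][H][W] nested lists -> list of C floats."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in u]


def dense(x, weights, bias):
    """Fully connected layer; weights has shape [out][in], bias has shape [out]."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def sigmoid(x):
    return [1.0 / (1.0 + math.exp(-v)) for v in x]


def se_block(u, w1, b1, w2, b2):
    """Return (recalibrated features, channel weights) for one SE block."""
    z = squeeze(u)                        # 1 x 1 x C channel descriptor
    hidden = relu(dense(z, w1, b1))       # first FC: C -> C / r
    s = sigmoid(dense(hidden, w2, b2))    # second FC: C / r -> C, each weight in (0, 1)
    scaled = [[[s_c * v for v in row] for row in ch] for s_c, ch in zip(s, u)]
    return scaled, s


# Toy input (hypothetical values): C = 2 channels, H = W = 2, r = 2.
u = [[[1.0, 2.0], [3.0, 4.0]],
     [[0.0, 0.0], [0.0, 0.0]]]
w1, b1 = [[1.0, 1.0]], [0.0]              # shape [C / r][C]
w2, b2 = [[1.0], [-1.0]], [0.0, 0.0]      # shape [C][C / r]
out, s = se_block(u, w1, b1, w2, b2)
print("channel weights:", s)
```

In Keras the same steps would typically map onto `GlobalAveragePooling2D`, two `Dense` layers and a `Multiply` layer; that mapping is a suggestion and not necessarily how the reproduction mentioned in this article was written.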

There are two reasons for using two fully connected layers instead of one:

  1. The ReLU in between adds more nonlinearity
  2. Introducing the parameter r reduces the number of parameters and the amount of computation
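
The second point can be checked with simple arithmetic. The sketch below compares a single C -> C fully connected layer with the C -> C/r -> C bottleneck; the function names are hypothetical, and C = 256 with r = 8 are illustrative values only.

```python
def single_fc_params(channels, bias=True):
    """Parameters of one fully connected layer mapping C -> C."""
    return channels * channels + (channels if bias else 0)


def se_fc_params(channels, r, bias=True):
    """Parameters of the SE bottleneck: C -> C / r -> C."""
    hidden = channels // r
    weights = channels * hidden + hidden * channels
    biases = (hidden + channels) if bias else 0
    return weights + biases


channels, r = 256, 8  # illustrative values, not taken from the paper's tables
print(single_fc_params(channels, bias=False))  # C * C weights
print(se_fc_params(channels, r, bias=False))   # 2 * C * C / r weights
```

Ignoring biases, the bottleneck needs 2C²/r weights instead of C², so it is smaller than a single layer whenever r > 2.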

In addition, the SE module can also be embedded in networks with skip connections. The right side of the figure above shows an example of embedding SE into a residual (ResNet-style) module. The principle and operations are basically the same as in SE-Inception, except that the features on the Residual branch are recalibrated before the Addition.

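Most mainstream networks today are built by repeatedly stacking one of these two kinds of units. It follows that the SE module can be embedded in almost any current network structure. By embedding the SE module into the building block of an existing network, we obtain different kinds of SENet, such as SE-BN-Inception, SE-ResNet, SE-ResNeXt, SE-Inception-ResNet-v2 and so on.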

3、Experimental results

The paper gives comparison results on networks that were common at the time, such as ResNet and ResNeXt. Deep learning had developed over several years and data augmentation had also seen new advances, so for fairness the authors reimplemented the baseline networks and used the same data augmentation method. The results are shown in the figure below:

[Figure: SE result]

Combining the test results with the introduction above, we can see that SENet is very simple to build and needs no new functions or layers. Compared with the original networks, it adds only 2%-10% more parameters and reduces the error by roughly 0.4-1.1.

4、Further experiments

Tuning the parameter r

[Figure: SE ratio]

The parameter r is introduced in the first fully connected layer. It reduces the number of channels in that layer, so that the two layers together form a bottleneck-like structure. The authors' tests found that r=8 gives relatively good results.

Pooling method

[Figure: SE pooling]

For compressing the spatial dimensions, the authors tried both Global Max Pooling and Global Average Pooling. For both top-1 and top-5, the results show that AvgPool works better.
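
The two squeeze variants differ only in how each channel is reduced to one number. A small sketch, using a hypothetical `[C][H][W]` nested-list layout and made-up values, shows the difference:

```python
def global_avg_pool(u):
    """Mean over all spatial positions of each channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in u]


def global_max_pool(u):
    """Largest single response of each channel."""
    return [max(max(row) for row in ch) for ch in u]


# Hypothetical feature maps: one isolated strong response vs. a uniform response.
u = [
    [[0.0, 0.0], [0.0, 8.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
print(global_avg_pool(u))  # both channels have the same mean
print(global_max_pool(u))  # max pooling is dominated by the single peak
```

Average pooling summarizes the whole map, while max pooling only reports the strongest position. Whether this is why AvgPool scored better in the paper is not something the example can show; it only illustrates what each variant computes.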

Choice of activation function

[Figure: SE activation]

Next is the choice of activation function for the last fully connected layer. Replacing sigmoid with tanh slightly degrades performance, while using ReLU degrades it significantly and in fact makes SE-ResNet-50 perform worse than the ResNet-50 baseline. This shows that the choice of activation function matters for the SE block to be effective.
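
One way to look at this result is to compare the ranges of the three candidate gates. The sketch below only prints the gate values for a few hypothetical pre-activation inputs; the interpretation in the note after the code is an assumption, not a claim made in the paper.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def relu(v):
    return max(0.0, v)


logits = [-3.0, 0.0, 3.0]  # hypothetical outputs of the last fully connected layer
gates = {
    "sigmoid": [sigmoid(v) for v in logits],    # always inside (0, 1)
    "tanh": [math.tanh(v) for v in logits],     # inside (-1, 1), can flip the sign
    "relu": [relu(v) for v in logits],          # zero or unbounded above
}
for name, values in gates.items():
    print(name, values)
```

Only sigmoid yields a bounded, non-negative scale for every channel. A tanh gate can negate a channel, and a ReLU gate can zero it out or amplify it without limit, which may be related to the drop in accuracy reported above.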

Where to add the SE block

[Figure: SE position]

The authors also compared where the SE module is added. In terms of top-5, adding the SE module at later stages works better than adding it at earlier stages, and adding it at all stages works best.

Four SE structures

[Figure: SE structure]

Finally, the authors compared different structures of the SE module:

  (a) Plain residual block
  (b) Standard SE module
  (c) SE module first, then the residual computation
  (d) Residual computation first, then the SE module
  (e) SE module applied on the cross-layer (identity) connection
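
The five variants listed above can be summarized as formulas. This is a simplified sketch: x is the block input, F is the residual branch, SE is the recalibration applied to its argument, and the final ReLU of each block is omitted. The names SE-PRE, SE-POST and SE-Identity for (c), (d) and (e) follow the paper.

```latex
\begin{aligned}
\text{(a) Residual:}    \quad & y = x + F(x) \\
\text{(b) Standard SE:} \quad & y = x + \mathrm{SE}\bigl(F(x)\bigr) \\
\text{(c) SE-PRE:}      \quad & y = x + F\bigl(\mathrm{SE}(x)\bigr) \\
\text{(d) SE-POST:}     \quad & y = \mathrm{SE}\bigl(x + F(x)\bigr) \\
\text{(e) SE-Identity:} \quad & y = \mathrm{SE}(x) + F(x)
\end{aligned}
```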

[Figure: SE structure results]

Judging from the results, the SE-PRE structure is slightly better, but personally I think a comparison on ResNet-50 alone is not convincing. Still, whichever structure is used, it improves ResNet's accuracy, which shows that the SE module works.

5、Summary

SENet won the image recognition task of the final ImageNet competition. The paper's authors invested a great deal of time and ran many experiments on the network architecture, and the paper contains details that this article does not cover, so readers may want to download it and read it closely. The authors' open-source code is based on Caffe; I also tried reproducing it in Keras and TF2. The most visible effect is that the classification confidence is much higher than with the original ResNet, possibly because SE-ResNet applies the sigmoid activation many times.


Copyright notice
This article was written by [hot-blooded chef]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/216/202208040527316680.html