How many convolution methods does deep learning have? (including drawings)
2022-07-03 18:21:00 【ZRX_ GIS】
1 Focus
How to improve an existing network architecture through the choice of convolution type.
2 Why do modern networks use small convolution kernels instead of large ones? (VGGNet)

| kernel size | advantages | examples | disadvantages |
|---|---|---|---|
| large | large receptive field per layer | AlexNet and LeNet use relatively large kernels such as 5×5 and 11×11 | many parameters; heavy computation |
| small | fewer parameters; less computation; stacking three small layers inserts three nonlinear activations instead of one, increasing the model's discriminative power | VGG and later networks | a single layer has an insufficient receptive field; deep stacks of convolutions are harder to control |
3 Can a fixed-size convolution kernel see a larger area? (Dilated convolution)
A standard 3×3 kernel only sees a 3×3 region, but dilated convolution lets the same kernel cover a larger range. The information lost by pooling-based downsampling is irreversible, which hurts pixel-level tasks; replacing pooling with dilated convolution enlarges the receptive field (multiplying it per layer) without discarding resolution, which makes it well suited to semantic segmentation.
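The receptive-field growth described above follows a simple formula: a k×k kernel with dilation rate d behaves like a kernel of size k + (k−1)(d−1). A minimal sketch:

```python
def effective_kernel(k, d):
    """Effective receptive field of a k x k kernel with dilation rate d."""
    return k + (k - 1) * (d - 1)

# A 3x3 kernel with dilation 2 covers a 5x5 region using the same 9 weights.
print(effective_kernel(3, 1))  # 3
print(effective_kernel(3, 2))  # 5
print(effective_kernel(3, 4))  # 9
```

So stacking dilated layers with growing rates expands the receptive field rapidly while the parameter count per layer stays fixed.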
4 Must the convolution kernel be square? (Asymmetric convolution)
A standard 3×3 convolution can be split into a 1×3 convolution followed by a 3×1 convolution, reducing the amount of computation without changing the receptive field.
5 Must convolution mix all input channels? (Group convolution & depthwise separable convolution)
Group convolution splits the input feature maps into groups and convolves each group independently. Suppose the input feature map is C×H×W (12×5×5), the number of output feature maps is N (6), and the channels are split into G (3) groups. Then each group has C/G (4) input feature maps and N/G (2) output feature maps, each kernel has size (C/G)×K×K (4×5×5), and the total number of kernels is still N (6), with N/G (2) kernels per group. Each kernel convolves only with the input feature maps of its own group, so the total parameter count is N×(C/G)×K×K, i.e. reduced to 1/G of a standard convolution.
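Plugging in the numbers from the example above confirms the 1/G reduction:

```python
# Numbers from the example: C = 12 input channels, N = 6 output maps,
# G = 3 groups, K = 5 kernel size (bias ignored).
C, N, G, K = 12, 6, 3, 5

standard = N * C * K * K           # every kernel sees all C channels
grouped = N * (C // G) * K * K     # each kernel sees only C/G channels

print(standard, grouped)           # 1800 600
assert grouped == standard // G    # parameters shrink to 1/G
```

Depthwise separable convolution is the extreme case G = C, with one kernel per input channel, usually followed by a 1×1 pointwise convolution to mix channels.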
6 Can group convolution shuffle channels across groups? (ShuffleNet)
To let information flow between groups, besides a dense pointwise convolution you can use a channel shuffle: after the group convolution, the output feature maps are "reorganized" so that each group fed into the next convolution contains channels drawn from every previous group. Figure (c) in the original paper illustrates the process, which amounts to an even interleaving.
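The "even interleaving" is just a reshape-transpose-flatten on the channel axis. A minimal sketch on a flat list of channel indices:

```python
def channel_shuffle(channels, groups):
    """Interleave channels: reshape to (groups, c_per_group), transpose, flatten."""
    per_group = len(channels) // groups
    grid = [channels[g * per_group:(g + 1) * per_group] for g in range(groups)]
    # read column by column: position i of every group, for each i
    return [grid[g][i] for i in range(per_group) for g in range(groups)]

# 6 channels in 3 groups: after the shuffle, each consecutive block of
# channels contains one channel from every original group.
print(channel_shuffle([0, 1, 2, 3, 4, 5], 3))  # [0, 2, 4, 1, 3, 5]
```

In a real network the same permutation is applied to the channel dimension of the feature tensor; it costs no parameters and no FLOPs beyond a memory reorder.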
7 Must each convolution layer use a single kernel size? (Inception)
Traditional networks are basically stacks of convolutions with a single kernel size per layer, for example VGG's long runs of 3×3 convolutions. In fact, one layer can apply several kernels of different sizes to the same feature map to capture features at different scales, and concatenating those features usually works better than using a single kernel size. To keep the parameter count down, a 1×1 convolution is usually applied first to project the feature map into a lower-dimensional hidden space, and the larger convolutions are done there.
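The 1×1 bottleneck pays for itself in the weight count. A sketch with hypothetical channel counts (256-channel input, reduced to 64 before a 5×5 branch, biases ignored):

```python
def conv_params(k, c_in, c_out):
    # weights of a k x k convolution (bias ignored)
    return k * k * c_in * c_out

# direct 5x5 convolution from 256 to 256 channels
direct = conv_params(5, 256, 256)
# versus a 1x1 reduction to 64 channels, then the 5x5 back to 256
bottleneck = conv_params(1, 256, 64) + conv_params(5, 64, 256)

print(direct)      # 1638400
print(bottleneck)  # 425984
```

Here the bottleneck version uses roughly a quarter of the weights, which is what makes the multi-branch Inception block affordable.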
8 Are features across channels equally important? (SENet)
Whether in Inception, DenseNet, or ShuffleNet, the features produced for all channels are combined without weighting. But why should every channel contribute equally to the model? A convolution layer can have thousands of kernels, each producing one feature, so how do we distinguish among so many features? SENet learns the importance of each feature channel automatically, then uses the computed importance to enhance useful features and suppress those that are irrelevant to the current task.
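The "enhance and suppress" step is just a per-channel scaling by the learned weights. A minimal sketch with flattened feature maps and illustrative (not learned) weights:

```python
def se_reweight(feature_maps, channel_weights):
    """Scale each channel's (flattened) feature map by its learned weight."""
    return [[v * w for v in fmap] for fmap, w in zip(feature_maps, channel_weights)]

# Two channels; the sigmoid of the excitation step has judged the second
# channel half as useful. The weights here are illustrative values.
fmaps = [[1.0, 2.0], [3.0, 4.0]]
weights = [1.0, 0.5]
print(se_reweight(fmaps, weights))  # [[1.0, 2.0], [1.5, 2.0]]
```

In the full SE block the weights come from a global average pool over each channel, a small two-layer MLP, and a sigmoid; only the final scaling is shown here.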
9 Must the convolution kernel be rectangular? (Deformable convolution)
A kernel of regular shape (for example the usual square 3×3 kernel) can limit feature extraction. If the kernel is given the ability to deform, the network can use the error back-propagated from the labels to adjust the kernel's shape automatically, adapting to the regions it cares about and extracting better features. For example, starting from the regular sampling grid (a), the network learns an offset for each sampling point, producing new sampling patterns (b), (c), (d); some special cases then become instances of this more general model: pattern (c) corresponds to recognizing objects at different scales, and pattern (d) to recognizing rotated objects.
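The core idea is that each of the 9 sampling positions of a 3×3 kernel gets its own learned 2-D offset. A minimal sketch of where such a kernel samples (offsets here are hand-picked, not learned; real implementations also bilinearly interpolate at fractional positions):

```python
def deformable_samples(center, offsets):
    """Positions a deformable 3x3 kernel samples: regular grid plus offsets."""
    base = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    cy, cx = center
    return [(cy + dy + oy, cx + dx + ox) for (dy, dx), (oy, ox) in zip(base, offsets)]

# With zero offsets this is the ordinary 3x3 grid around (5, 5);
# a nonzero offset lets a sampling point reach outside the square.
print(deformable_samples((5, 5), [(0, 0)] * 9)[0])             # (4, 4)
print(deformable_samples((5, 5), [(0, 2)] + [(0, 0)] * 8)[0])  # (4, 6)
```

Because the offsets are themselves produced by a small convolution over the input, they vary per location, which is what lets the kernel stretch, scale, or rotate as in patterns (b), (c), (d).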
10 Ideas for rewriting a network
(1) kernel:
First, replace a large convolution kernel with several small ones.
Second, replace a single kernel size with multiple kernel sizes.
Also, replace fixed-shape kernels with deformable convolution.
Or, add 1×1 convolutions to the network.
(2) channels:
First, introduce depthwise separable convolution.
Second, introduce group convolution.
Also, introduce channel shuffle.
Or, weight the feature maps (as in SENet).
(3) connections:
First, introduce skip connections.
Second, introduce dense connections, so that each layer is fused with the other layers (DenseNet).
11 Summary