Deep learning - networks in networks and 1x1 convolution

2022-06-30 07:43:00 【Hair will grow again without it】

1x1 Convolution

In terms of architecture design, one of the more helpful ideas is the 1×1 convolution.

What does a 1×1 convolution do?

Take a 1×1 filter that holds the number 2, and a 6×6×1 input image. Convolving the image with this 1×1×1 filter is equivalent to multiplying the whole image by 2, so the first three output cells are 2, 4, 6, and so on. Used this way, a 1×1 convolution does not seem very useful: it just multiplies the input matrix by a number. But that is only true for a single-channel 6×6×1 image, where a 1×1 convolution has little to offer.
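The single-channel case can be checked with a few lines of NumPy. This is a minimal sketch: the input values (1 to 36) are made up for illustration, and the names `image` and `filt` are not from the original post.

```python
import numpy as np

# Assumed 6x6x1 input; the values 1..36 are illustrative only.
image = np.arange(1, 37, dtype=float).reshape(6, 6, 1)
# A single 1x1x1 filter holding the number 2.
filt = np.full((1, 1, 1), 2.0)

# Slide the 1x1 window over every position and sum the products.
out = np.zeros((6, 6))
for i in range(6):
    for j in range(6):
        out[i, j] = np.sum(image[i:i + 1, j:j + 1, :] * filt)

print(out[0, :3])                             # [2. 4. 6.]
print(np.array_equal(out, 2 * image[..., 0])) # True
```

With one channel, the convolution really is just a scalar multiplication of the image.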

With a 6×6×32 input, a 1×1 filter does something more useful. Specifically, a 1×1 convolution visits each of the 36 positions, multiplies the 32 numbers at that position (one per channel) element-wise by the 32 numbers in the filter, sums the products, and then applies a ReLU nonlinearity. The 32 numbers in the 1×1×32 filter can be read as the weights of a single neuron. Its input is the 32 numbers at one height-and-width position (for example, the lower-left corner of the image), one from each channel. It multiplies them by the 32 weights, applies ReLU, and writes the result to the corresponding output position. More generally, if there are several filters, it is like having several such units, each taking all the numbers in one slice as input, and the output is 6×6×(number of filters).
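The per-position computation for one filter can be sketched as follows. The random input and weights, and the helper name `conv1x1_single`, are assumptions for illustration. No bias term is included because the text does not mention one.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 6, 32))  # assumed 6x6x32 input
w = rng.standard_normal(32)          # one 1x1x32 filter, stored as 32 weights


def conv1x1_single(x, w):
    """Apply one 1x1 filter followed by ReLU at every spatial position."""
    height, width, _ = x.shape
    out = np.zeros((height, width))
    for i in range(height):
        for j in range(width):
            # Weighted sum of the 32 channel values at position (i, j).
            z = np.sum(x[i, j, :] * w)
            out[i, j] = max(z, 0.0)  # ReLU
    return out


y = conv1x1_single(x, w)
print(y.shape)  # (6, 6)
```

One filter yields one 6×6 output map, and stacking the maps from several filters gives the 6×6×(number of filters) output.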

So a 1×1 convolution can fundamentally be understood as applying a fully connected layer to each of the 36 positions. That layer takes the 32 numbers at a position as input and produces one output per filter (the number of filters is written $n_C^{[l+1]}$). Repeating it at all 36 positions gives an output of size 6×6×#filters, and it performs a non-trivial computation on the input layer. This method is usually called 1×1 convolution, and sometimes Network in Network.
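The equivalence with a fully connected layer can be checked numerically. This is a sketch under stated assumptions: the filter count `n_filters = 8` and the random weights are hypothetical, and the bias is omitted.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal((6, 6, 32))       # assumed 6x6x32 input
n_filters = 8                             # hypothetical number of filters
W = rng.standard_normal((32, n_filters))  # column f holds filter f's 32 weights


def relu(z):
    return np.maximum(z, 0.0)


# View 1: a fully connected layer (32 -> n_filters) applied to each of the
# 36 positions, treating every position as one row of a 36x32 matrix.
dense_out = relu(x.reshape(36, 32) @ W).reshape(6, 6, n_filters)

# View 2: an explicit 1x1 convolution, one filter and one position at a time.
conv_out = np.zeros((6, 6, n_filters))
for f in range(n_filters):
    for i in range(6):
        for j in range(6):
            conv_out[i, j, f] = np.sum(x[i, j, :] * W[:, f])
conv_out = relu(conv_out)

print(conv_out.shape)                    # (6, 6, 8)
print(np.allclose(dense_out, conv_out))  # True
```

Both views share the same weights across all positions, which is what separates this from a fully connected layer over the whole flattened input.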

Example

Suppose you have a 28×28×192 input layer. You can use a pooling layer to shrink its height and width, which is a familiar process. But if the number of channels is too large, how do you compress it into a 28×28×32 layer? You can use 32 filters of size 1×1. Strictly speaking, each filter has size 1×1×192, because the number of channels in a filter must match the number of channels in the input layer. With 32 such filters, the output layer is 28×28×32. This is how to shrink the number of channels ($n_C$), whereas a pooling layer only shrinks the height and width.
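The shapes in this example can be confirmed with a short NumPy sketch. The random input and weights are assumptions. The 2×2 max pooling with stride 2 is only one possible pooling choice, used here for contrast, and the ReLU follows the earlier description of a 1×1 convolution.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal((28, 28, 192))  # assumed 28x28x192 input layer
W = rng.standard_normal((192, 32))      # 32 filters, each of size 1x1x192

# 1x1 convolution + ReLU: mixes the channels, keeps height and width.
reduced = np.maximum(x @ W, 0.0)

# 2x2 max pooling with stride 2 (an assumed choice): shrinks height and
# width, leaves the number of channels unchanged.
pooled = x.reshape(14, 2, 14, 2, 192).max(axis=(1, 3))

print(reduced.shape)  # (28, 28, 32)
print(pooled.shape)   # (14, 14, 192)
print(W.size)         # 6144 weights (192 * 32), bias not counted
```

The two operations are complementary: the 1×1 convolution changes only the channel dimension, and pooling changes only the spatial dimensions.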