
[PyTorch] 2.4 The convolution function nn.Conv2d

2022-07-01 09:08:00 【Enzo tried to smash the computer】

Two-dimensional convolution is probably the most commonly used form of convolution. PyTorch's nn module wraps it in the nn.Conv2d class. It is used like any other class: instantiate it first, then call the instance. The following network, which contains a single two-dimensional convolution layer, serves as an introduction to nn.Conv2d:

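import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # A single 2-D convolution: 3 input channels, 64 output channels, 4x4 kernel
        self.conv2d = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=4, stride=2, padding=1)

    def forward(self, x):
        print(x.requires_grad)  # whether the input tensor tracks gradients
        x = self.conv2d(x)
        return x

net = Net()  # instantiate the network before inspecting its parameters
print(net.conv2d.weight)  # learnable kernels, shape [64, 3, 4, 4]
print(net.conv2d.bias)    # learnable bias, shape [64]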
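As a rough illustration of "instantiate first, then use", the sketch below builds the same layer configuration and passes a batch through it. The batch size of 8 and the 32x32 image size are assumptions made for the example, not values from the original article.

```python
import torch
import torch.nn as nn

# Same layer configuration as in the Net class above.
conv = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=4, stride=2, padding=1)

# Hypothetical input: 8 RGB images of 32x32 pixels, in [N, C, H, W] layout.
x = torch.randn(8, 3, 32, 32)
print(x.requires_grad)  # False: a freshly created tensor does not track gradients

# Calling the instance runs the convolution.
y = conv(x)
print(tuple(y.shape))  # (8, 64, 16, 16)
```

The spatial size halves from 32 to 16 because of stride=2, while the channel count goes from 3 to 64.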

Its parameters are listed in the PyTorch documentation. The first three (in_channels, out_channels, kernel_size) must be supplied explicitly; the rest have default values. Each is described below:

in_channels

This one is straightforward: it is the C in the four-dimensional input tensor [N, C, H, W], that is, the number of channels of the input tensor. It is needed to determine the shape of the layer's learnable weight.

out_channels

Also straightforward: the number of channels wanted in the four-dimensional output tensor.
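A small sketch of how the two channel arguments fix the parameter shapes may help here. The extra 4-channel input is a hypothetical example, included only to show what happens when the input does not match in_channels.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=4)

# The weight holds one [in_channels, kH, kW] filter per output channel.
print(tuple(conv.weight.shape))  # (64, 3, 4, 4)
print(tuple(conv.bias.shape))    # (64,)

# Total number of learnable parameters: 64 * 3 * 4 * 4 weights + 64 biases.
n_params = sum(p.numel() for p in conv.parameters())
print(n_params)  # 3136

# Hypothetical mismatch: a 4-channel input does not fit in_channels=3.
try:
    conv(torch.randn(1, 4, 32, 32))
    mismatch_raised = False
except RuntimeError:
    mismatch_raised = True
print(mismatch_raised)  # True
```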

kernel_size

The size of the convolution kernel. Square kernels such as 5x5 or 3x3 are the most common, and for these it is enough to write kernel_size = 5. For a kernel whose height and width differ, such as 3x5, write kernel_size = (3, 5). Note that this should be a tuple rather than a list.
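The two ways of writing kernel_size can be compared with a short sketch. The channel counts and the 32x32 input are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn

square = nn.Conv2d(3, 8, kernel_size=5)     # shorthand for a 5x5 kernel
rect = nn.Conv2d(3, 8, kernel_size=(3, 5))  # 3 rows (height) by 5 columns (width)

# An int is expanded to a pair internally.
print(square.kernel_size)  # (5, 5)
print(rect.kernel_size)    # (3, 5)

x = torch.randn(1, 3, 32, 32)
print(tuple(square(x).shape))  # (1, 8, 28, 28)
print(tuple(rect(x).shape))    # (1, 8, 30, 28)
```

With the rectangular kernel the height shrinks by 2 and the width by 4, which shows that the tuple is ordered as (height, width).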

stride = 1

The distance the convolution kernel moves across the image at each step, that is, the stride. The concept is the same as in TensorFlow and other frameworks.

padding = 0

The biggest difference between the PyTorch and TensorFlow convolution layers lies in padding.
Padding means filling in around the image. An int value gives the number of rows and columns added on each side; the default is 0. Note that the padding is applied to the top, bottom, left, and right of the image. With padding = 1, for example, a 32x32 image becomes 34x34 after padding, not 33x33.
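The 32x32 to 34x34 example can be checked with an explicit zero-padding step. This is only a sketch: it uses torch.nn.functional.pad to make the padding visible, and the channel counts are assumptions for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(1, 3, 32, 32)

# Zero-pad 1 pixel on the left, right, top and bottom: 32x32 -> 34x34.
padded = F.pad(x, (1, 1, 1, 1))
print(tuple(padded.shape))  # (1, 3, 34, 34)

# padding=1 inside the layer gives the same result as convolving
# the manually padded tensor with no padding.
conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
manual = F.conv2d(padded, conv.weight, conv.bias)
same_result = torch.allclose(conv(x), manual, atol=1e-5)
print(same_result)
```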

Where PyTorch differs from TensorFlow is that TensorFlow offers padding modes, such as same and valid, and each mode has its own formula for the output image size. In PyTorch you specify the amount of padding yourself (recent versions also accept the strings 'valid' and 'same' for padding, with 'same' supported only for stride 1). The advantage of PyTorch's approach is that there is a single formula for the output image size, namely

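H_out = floor((H_in + 2 * padding[0] - dilation[0] * (kernel_size[0] - 1) - 1) / stride[0] + 1)
W_out = floor((W_in + 2 * padding[1] - dilation[1] * (kernel_size[1] - 1) - 1) / stride[1] + 1)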
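The output-size formula given in the PyTorch documentation for nn.Conv2d can be written as a small helper. The function name conv2d_output_size is hypothetical and not part of PyTorch; it handles one spatial dimension at a time.

```python
def conv2d_output_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension of a 2-D convolution.

    Implements floor((size + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1).
    """
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# The layer from the beginning of the article: 4x4 kernel, stride 2, padding 1.
print(conv2d_output_size(32, kernel_size=4, stride=2, padding=1))  # 16

# A 3x3 kernel with padding 1 keeps the size at stride 1.
print(conv2d_output_size(32, kernel_size=3, padding=1))  # 32

# No padding: a 5x5 kernel shrinks 32 to 28.
print(conv2d_output_size(32, kernel_size=5))  # 28
```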

dilation = 1

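This parameter controls dilated (atrous) convolution, in which gaps are left between the elements of the kernel so that it covers a larger area of the input without adding parameters. With dilation = d, a kernel of size k spans d * (k - 1) + 1 pixels along each dimension. The default is 1 (no dilation).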
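The effect of dilation on the area a kernel covers can be sketched as follows. The single-channel 32x32 input is an assumption for the example; the point is that the weight shape stays 3x3 while the covered span, and therefore the output size, changes.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 1, 32, 32)
results = {}

for d in (1, 2, 3):
    conv = nn.Conv2d(1, 1, kernel_size=3, dilation=d)
    span = d * (3 - 1) + 1  # pixels covered by the 3x3 kernel along one axis
    results[d] = (span, tuple(conv.weight.shape), tuple(conv(x).shape))
    print(d, results[d])

# d=1: span 3, output 30x30
# d=2: span 5, output 28x28
# d=3: span 7, output 26x26
```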

groups = 1

Controls the connections between input and output channels. With groups=1, every output channel is computed from all input channels. With groups=2, the layer is equivalent to two convolution layers side by side, each of which takes half of the input channels and produces half of the output channels, after which the two outputs are concatenated. Both in_channels and out_channels must be divisible by groups.
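The "two convolution layers side by side" description can be checked numerically. The sketch below is one way to do it, with channel counts (4 in, 6 out) chosen only for illustration: it copies the weights of a groups=2 layer into two separate layers and compares the outputs.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(2, 4, 16, 16)

grouped = nn.Conv2d(4, 6, kernel_size=3, padding=1, groups=2)
# Each filter only sees in_channels / groups = 2 input channels.
print(tuple(grouped.weight.shape))  # (6, 2, 3, 3)

# Two independent layers, each handling half the input and half the output channels.
conv_a = nn.Conv2d(2, 3, kernel_size=3, padding=1)
conv_b = nn.Conv2d(2, 3, kernel_size=3, padding=1)
with torch.no_grad():
    conv_a.weight.copy_(grouped.weight[:3])
    conv_a.bias.copy_(grouped.bias[:3])
    conv_b.weight.copy_(grouped.weight[3:])
    conv_b.bias.copy_(grouped.bias[3:])

# Split the input channels, convolve each half, then concatenate along channels.
side_by_side = torch.cat([conv_a(x[:, :2]), conv_b(x[:, 2:])], dim=1)
matches = torch.allclose(grouped(x), side_by_side, atol=1e-5)
print(matches)
```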