Detailed explanation of conv2d -- use in arrays and images

2022-06-30 23:16:00 【Philo`】

1、 Environment requirements

1、 Install the PyTorch dependency
2、 Refer to the official conv2d documentation
3、 The image example needs the CIFAR10 dataset

2、 How it works

Convolution slides a convolution kernel over the original 2D data and produces one output value at each position. The steps are:

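Place the convolution kernel over a region of the input data, multiply each covered input value by the kernel value at the same position, and add the products. The sum is one element of the output data.

Move the kernel to the next position and repeat until the end of the input is reached. The collected sums form the output.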
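
The multiply-and-add steps above can be sketched with two plain loops. This is only an illustrative sketch, not how PyTorch implements convolution internally: the helper name manual_conv2d is made up for this example, and the tensors are created as float32 (an assumption, the array example in this article uses integer tensors). The 5*5 data and 3*3 kernel are the ones used in the array example of this article. Note that F.conv2d does not flip the kernel, so the loop does not flip it either.

```python
import torch
import torch.nn.functional as F


def manual_conv2d(image, kernel):
    """Hypothetical helper: slide a 2-D kernel over a 2-D image, multiply and sum."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = torch.zeros(out_h, out_w)
    for i in range(out_h):
        for j in range(out_w):
            # The part of the image currently covered by the kernel
            window = image[i:i + kh, j:j + kw]
            out[i, j] = (window * kernel).sum()
    return out


image = torch.tensor([[1, 2, 0, 3, 1],
                      [0, 1, 2, 3, 1],
                      [1, 2, 1, 0, 0],
                      [5, 2, 3, 1, 1],
                      [2, 1, 0, 1, 1]], dtype=torch.float32)
kernel = torch.tensor([[1, 2, 1],
                       [0, 1, 0],
                       [2, 1, 0]], dtype=torch.float32)

manual = manual_conv2d(image, kernel)
builtin = F.conv2d(image.reshape(1, 1, 5, 5), kernel.reshape(1, 1, 3, 3))

# Top-left element: 1*1 + 2*2 + 0*1 + 0*0 + 1*1 + 2*0 + 1*2 + 2*1 + 1*0 = 10
print(manual)
print(torch.allclose(manual, builtin[0, 0]))  # compares the loop with F.conv2d
```

The loop version is slow and only handles one channel, but it makes each step of the sliding window visible.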

The steps above only compute a value where the convolution kernel lies completely inside the original image. The kernel can also keep moving so that it covers the image only partially: as long as there is some overlap a value can be computed, with the positions outside the image filled with 0;
The steps above also move the kernel one cell at a time, left to right and top to bottom. It can instead move two cells at a time;
These two cases correspond to setting the padding and stride parameters of conv2d to values other than their defaults
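
To make the effect of padding and stride concrete, here is a small sketch. The helper conv_output_size is a made-up name for this example; it applies the usual output-size rule floor((n + 2*padding - k) / stride) + 1 for one dimension, assuming dilation is left at 1. The input values and the all-ones kernel are arbitrary placeholders.

```python
import torch
import torch.nn.functional as F


def conv_output_size(n, k, padding=0, stride=1):
    """Hypothetical helper: output length along one dimension (dilation = 1)."""
    return (n + 2 * padding - k) // stride + 1


x = torch.arange(25, dtype=torch.float32).reshape(1, 1, 5, 5)  # placeholder input
w = torch.ones(1, 1, 3, 3)                                     # placeholder kernel

for padding, stride in [(0, 1), (1, 1), (1, 2)]:
    out = F.conv2d(x, w, padding=padding, stride=stride)
    expected = conv_output_size(5, 3, padding, stride)
    print(padding, stride, tuple(out.shape), expected)

# padding=1 behaves like surrounding the input with one ring of zeros first
padded = F.pad(x, (1, 1, 1, 1), value=0.0)
same = torch.allclose(F.conv2d(padded, w), F.conv2d(x, w, padding=1))
print(same)
```

For a 5*5 input and a 3*3 kernel this gives side lengths 3, 5 and 3 for the three settings.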

3、 Function requirements

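Function prototype of torch.nn.functional.conv2d in the official documentation:
torch.nn.functional.conv2d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1)
Parameter requirements: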

According to the latest official documentation, the arguments used for image data are plain int values, as in the Conv2d layer of the image example. Array data passed to F.conv2d must instead be of tensor data type. See the examples below for the detailed differences

The input must be a tensor that includes the minibatch and in_channels dimensions. The original two-dimensional array has neither, so it must be transformed with reshape
The convolution kernel has the same requirement
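
As a rough illustration of the shape requirement, the sketch below adds the two leading dimensions in three equivalent ways and removes them again afterwards. The all-ones tensors are placeholders chosen for this example and are not taken from the article.

```python
import torch
import torch.nn.functional as F

data = torch.ones(5, 5)      # plain 2-D array, shape (5, 5)
weight = torch.ones(3, 3)    # plain 2-D kernel, shape (3, 3)

# Three equivalent ways to add the minibatch and channel dimensions
a = torch.reshape(data, (1, 1, 5, 5))
b = data.unsqueeze(0).unsqueeze(0)
c = data[None, None]
print(a.shape, b.shape, c.shape)

# The kernel is laid out as (out_channels, in_channels, kH, kW)
weight4d = weight.reshape(1, 1, 3, 3)
out = F.conv2d(a, weight4d)
print(out.shape)

# Drop the two added dimensions to get a 2-D result back
result2d = out.squeeze()
print(result2d.shape)
```

reshape is what this article uses; unsqueeze and indexing with None are shown only as alternatives that produce the same shape.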

3、 Example usage

3.1、 Array

Code:


import torch
import torch.nn.functional as F


# Input data
input = torch.tensor([[1,2,0,3,1],
                      [0,1,2,3,1],
                      [1,2,1,0,0],
                      [5,2,3,1,1],
                      [2,1,0,1,1]])

print("original input shape",input.shape)   # torch.Size([5, 5])
# Add the two leading dimensions: batch size = 1, channels = 1, data is 5*5
input = torch.reshape(input,(1,1,5,5))
print("shape after torch.reshape",input.shape)   # torch.Size([1, 1, 5, 5])

# Convolution kernel
kernel = torch.tensor([[1,2,1],
                       [0,1,0],
                       [2,1,0]])

# The kernel needs the same four-dimensional layout
kernel = torch.reshape(kernel,(1,1,3,3))

# Default convolution: padding=0, stride=1
output1 = F.conv2d(input,kernel)
print("default convolution",output1)

# padding = 1, stride = 1
output2 = F.conv2d(input,kernel,padding = 1,stride = 1)
print("padding = 1, stride = 1",output2)

# padding = 1, stride = 2
output3 = F.conv2d(input,kernel,padding = 1,stride = 2)
print("padding = 1, stride = 2",output3)

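Output: the default convolution gives a 3*3 result, padding = 1 with stride = 1 gives a 5*5 result, and padding = 1 with stride = 2 gives a 3*3 result.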
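
The three results of the array example are related to each other, which the following sketch checks. It rebuilds the same data as float32 tensors (an assumption made for this example) and uses shorter variable names that are not part of the code above.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([[1, 2, 0, 3, 1],
                  [0, 1, 2, 3, 1],
                  [1, 2, 1, 0, 0],
                  [5, 2, 3, 1, 1],
                  [2, 1, 0, 1, 1]], dtype=torch.float32).reshape(1, 1, 5, 5)
k = torch.tensor([[1, 2, 1],
                  [0, 1, 0],
                  [2, 1, 0]], dtype=torch.float32).reshape(1, 1, 3, 3)

out_default = F.conv2d(x, k)                   # padding=0, stride=1
out_s1 = F.conv2d(x, k, padding=1, stride=1)   # padding=1, stride=1
out_s2 = F.conv2d(x, k, padding=1, stride=2)   # padding=1, stride=2
print(out_default.shape, out_s1.shape, out_s2.shape)

# stride=2 keeps every second row and column of the stride=1 result
stride_is_subsample = torch.allclose(out_s2, out_s1[:, :, ::2, ::2])
# the inner 3*3 part of the padded result is the default result
inner_is_default = torch.allclose(out_s1[:, :, 1:-1, 1:-1], out_default)
print(stride_is_subsample, inner_is_default)
```

In other words, padding only adds new values around the border, and a larger stride only skips positions; neither changes the values computed at the positions that remain.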

3.2、 Image

Code:

import torch
from torch import nn
from torch.nn import Conv2d
import torchvision
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter


# download=False because the dataset has already been downloaded here, so it is not
# downloaded again on every run. Change it to True for the first run.
# "./datasetvision" is the storage path
# transform=torchvision.transforms.ToTensor() converts each image to a tensor
dataset = torchvision.datasets.CIFAR10("./datasetvision",train=False,
                                       transform=torchvision.transforms.ToTensor(),download=False)

# Load the data in batches; batch_size=64 means 64 images are fetched each time
dataloader = DataLoader(dataset,batch_size=64)


# A simple neural network definition
class ConNet(nn.Module):
    def __init__(self):
        super(ConNet, self).__init__()
        # The input is an RGB color image, so the input has 3 channels; the output has 6 channels and the convolution kernel is 3*3
        self.conv2d = Conv2d(in_channels=3,out_channels=6,kernel_size=3,stride=1,padding=0)

    # Forward pass: apply the convolution layer to the input
    def forward(self,x):
        result = self.conv2d(x)
        return result


Work = ConNet()
print(Work)  # Prints the network structure: ConNet((conv2d): Conv2d(3, 6, kernel_size=(3, 3), stride=(1, 1)))

# Create the TensorBoard writer; "logsConv2d" is the name of the log folder
write = SummaryWriter("logsConv2d")

# Each item yielded by the dataloader is an (images, targets) tuple
step = 0
for data in dataloader:
    imgs,target = data

    # print("original image",imgs.shape)  # uncomment to see the shape before the convolution
    write.add_images("input",imgs,step)  # log the original images to TensorBoard for comparison

    output = Work(imgs)  # apply the convolution to the images
    # print(output.shape)  # uncomment to see the shape after the convolution
    # The convolution outputs 6 channels, which TensorBoard does not know how to display,
    # so reshape to 3 channels; -1 lets the batch dimension be inferred
    output = torch.reshape(output,(-1,3,30,30))
    write.add_images("output",output,step)
    step = step+1

write.close()
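
The shapes in the image example can be checked without downloading CIFAR10. The sketch below uses a random tensor as a stand-in for one batch of 64 RGB images of 32*32 pixels (the stand-in data is an assumption of this example, not part of the code above) and follows it through the same Conv2d layer and reshape.

```python
import torch
from torch.nn import Conv2d

conv = Conv2d(in_channels=3, out_channels=6, kernel_size=3, stride=1, padding=0)
print(conv.weight.shape)   # one 3*3 kernel per (output channel, input channel) pair
print(conv.bias.shape)     # one bias value per output channel

# Random stand-in for one CIFAR10 batch: 64 images, 3 channels, 32*32 pixels
imgs = torch.rand(64, 3, 32, 32)
with torch.no_grad():
    output = conv(imgs)
print(output.shape)        # 32 - 3 + 1 = 30 along each side

# Folding 6 channels into 3 doubles the batch dimension
reshaped = torch.reshape(output, (-1, 3, 30, 30))
print(reshaped.shape)

# Each input image becomes two 3-channel images after the reshape
first_half = torch.equal(reshaped[0], output[0, :3])
second_half = torch.equal(reshaped[1], output[0, 3:])
print(first_half, second_half)
```

This also shows why the "output" panel in TensorBoard contains twice as many images as the "input" panel: every convolved image is split into two 3-channel pictures.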