Neural network convolution layer

2022-07-01 04:45:00 【booze-J】

References: the official PyTorch documentation for convolution layers and for Conv2d.

The example code is as follows:

import torch
import torchvision
from torch import nn
from torch.nn import Conv2d
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

dataset = torchvision.datasets.CIFAR10("CIFAR10", train=False, transform=torchvision.transforms.ToTensor(), download=True)
# Note that the transform parameter of the dataset takes an object, so ToTensor needs the parentheses.
# The network operates on tensors, so this transform is needed to convert each image to a tensor.
dataloader = DataLoader(dataset, batch_size=64)

# Build a simple network
class Booze(nn.Module):

    # Run the initialization inherited from nn.Module
    def __init__(self):
        super().__init__()
        # conv1 is stored on self, so it is an attribute of this instance (not a global variable)
        # and nn.Module registers it as a submodule.
        # padding controls the output height and width: with padding=0 a 3x3 kernel shrinks 32x32 to 30x30.
        # If the output should keep the input size, padding cannot be 0 and is computed from the formula.
        self.conv1 = Conv2d(in_channels=3, out_channels=6, kernel_size=3, stride=1, padding=0)

    # Override the forward function
    def forward(self, x):
        x = self.conv1(x)
        return x


# Instantiate the network
obj = Booze()
# Print the network structure
print(obj)
# Output:
# Booze(
#   (conv1): Conv2d(3, 6, kernel_size=(3, 3), stride=(1, 1))
# )

writer = SummaryWriter("logs")
step = 0

for data in dataloader:
    imgs, targets = data
    output = obj(imgs)
    # torch.Size([64, 3, 32, 32]): 64 images, 3 channels, 32x32 pixels
    print(imgs.shape)
    # torch.Size([64, 6, 30, 30]): 64 images, 6 channels, 30x30 pixels
    print(output.shape)

    # Visualize with tensorboard. For a batch of images, use add_images rather than add_image.
    writer.add_images("input", imgs, step)

    # output has 6 channels, which add_images cannot display, so passing it directly raises an error.
    # Reshape it to 3 channels first. The -1 lets PyTorch compute the batch dimension: it cannot stay
    # at 64, because the extra channels are moved into additional batch entries.
    # torch.reshape returns a new tensor, so the result is assigned back to output.
    output = torch.reshape(output, (-1, 3, 30, 30))
    writer.add_images("output", output, step)
    step += 1

writer.close()

Points in the code that need attention and explanation:

dataset = torchvision.datasets.CIFAR10("CIFAR10", train=False, transform=torchvision.transforms.ToTensor(), download=True)

Note that the transform parameter of the dataset takes an object, so ToTensor has to be written with parentheses to create one. In addition, the neural network operates on tensors, so the transform parameter is needed to convert each image to a tensor.

# Build a simple network
class Booze(nn.Module):

    # Run the initialization inherited from nn.Module
    def __init__(self):
        super().__init__()
        # conv1 is stored on self so that it belongs to this instance
        self.conv1 = Conv2d(in_channels=3, out_channels=6, kernel_size=3, stride=1, padding=0)

    # Override the forward function
    def forward(self, x):
        x = self.conv1(x)
        return x

When building a network that inherits from nn.Module, a layer created during initialization has to be stored as an attribute of the instance (not a global variable), so the variable name needs a self. in front of it. Only then can forward use the layer, and nn.Module registers it as a submodule.
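The effect of the self. prefix can be checked with a small sketch. The class name Tiny and the extra local variable conv2 are hypothetical, added only to contrast the two cases.

```python
from torch import nn


class Tiny(nn.Module):  # hypothetical class, for illustration only
    def __init__(self):
        super().__init__()
        # Stored on self: nn.Module registers this layer as a submodule.
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=3)
        # Plain local variable: it is thrown away when __init__ returns.
        conv2 = nn.Conv2d(in_channels=6, out_channels=6, kernel_size=3)

    def forward(self, x):
        return self.conv1(x)


net = Tiny()
registered = [name for name, _ in net.named_children()]
param_count = sum(p.numel() for p in net.parameters())

print(registered)   # ['conv1']
print(param_count)  # 168 = 6*3*3*3 weights + 6 biases, all from conv1
```

Only conv1 shows up, so only its weights would be seen by an optimizer or saved with the model.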

self.conv1 = Conv2d(in_channels=3, out_channels=6, kernel_size=3, stride=1, padding=0)

Note: padding affects the output height and width, not the number of channels. With padding=0, a 3x3 kernel shrinks each 32x32 image to 30x30. In networks where out_channels is much larger than in_channels and the output is supposed to keep the height and width of the input, padding cannot stay at 0. It has to be calculated from the output-shape formula in the Conv2d documentation:
[Image: output-shape formula from the Conv2d documentation]
The input in the picture above is a tuple of four elements, (N, C_in, H_in, W_in), with the following meanings:

The first element represents batch_size
The second element represents the number of image channels
The third element represents the height of the image matrix
The fourth element represents the width of the image matrix
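Since the picture with the formula did not survive, here is a small sketch of it as I understand it from the Conv2d documentation: H_out = floor((H_in + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1), and the same for the width. The helper name conv2d_output_size is my own, not part of PyTorch.

```python
def conv2d_output_size(size_in, kernel_size, stride=1, padding=0, dilation=1):
    """Output height (or width) of a Conv2d layer for one spatial dimension.

    Hypothetical helper; it evaluates the shape formula with integer floor division.
    """
    return (size_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# The layer used in this post: 32x32 input, 3x3 kernel, stride 1, padding 0.
print(conv2d_output_size(32, kernel_size=3, stride=1, padding=0))  # 30

# Keeping the 32x32 size with the same kernel needs padding=1.
print(conv2d_output_size(32, kernel_size=3, stride=1, padding=1))  # 32
```

The first result matches the torch.Size([64, 6, 30, 30]) printed by the example code.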

for data in dataloader:
    imgs, targets = data
    output = obj(imgs)
    # torch.Size([64, 3, 32, 32]): 64 images, 3 channels, 32x32 pixels
    print(imgs.shape)
    # torch.Size([64, 6, 30, 30]): 64 images, 6 channels, 30x30 pixels
    print(output.shape)

    # Visualize with tensorboard. For a batch of images, use add_images rather than add_image.
    writer.add_images("input", imgs, step)

    # torch.reshape returns a new tensor, so the result is assigned back to output
    output = torch.reshape(output, (-1, 3, 30, 30))
    writer.add_images("output", output, step)
    step += 1

writer.close()

In the code above, output has to be reshaped to 3 channels before writer.add_images("output", output, step) runs. output has 6 channels, which add_images cannot display as images, so passing it directly raises an error. Note that torch.reshape returns a new tensor and does not modify its argument, so the result must be assigned back to output.
In torch.reshape(output, (-1, 3, 30, 30)), one dimension of the target shape may be left unknown by writing -1, and PyTorch calculates it from the total number of elements. Here the unknown dimension is the batch size: it cannot stay at 64, because going from 6 channels to 3 moves the extra channels into additional batch entries.
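A quick sketch of the reshape step on its own, using a zero tensor with the same shape as the network output (the tensor contents are made up; only the shapes matter here):

```python
import torch

# Stand-in for the network output: 64 images, 6 channels, 30x30 pixels.
output = torch.zeros(64, 6, 30, 30)

# Calling reshape without using the result leaves the original tensor unchanged.
torch.reshape(output, (-1, 3, 30, 30))
print(output.shape)    # torch.Size([64, 6, 30, 30])

# Assigning the result gives the 3-channel tensor; -1 is resolved to 64 * 6 / 3 = 128.
reshaped = torch.reshape(output, (-1, 3, 30, 30))
print(reshaped.shape)  # torch.Size([128, 3, 30, 30])
```

So every batch of 64 six-channel outputs becomes 128 three-channel pictures.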

After running the code above, open tensorboard to see the result:
[Image: tensorboard screenshot of the input and output pictures]
As the picture above shows, each step of output contains more pictures than each step of input. The reason is that reshaping the 6-channel pictures into 3-channel pictures increases the batch size (from 64 to 128).