[Deep Learning] PyTorch: From Introduction to Project Practice (11): Convolution Layers
2022-06-10 04:36:00 【JOJO's data analysis Adventure】
[Deep Learning]: "PyTorch: From Introduction to Project Practice" (11): Convolution Layers
- This article is part of the [Deep Learning]: "PyTorch: From Introduction to Project Practice" column, which records my notes on implementing deep learning with PyTorch. I try to update weekly; subscriptions are welcome!
- Personal homepage: JoJo's Data Analysis Adventure
- About me: I am a senior majoring in statistics, recommended for postgraduate study at a top-3 statistics program
- If this article helps you, please follow, like, bookmark, and subscribe to the column
- References: this column uses Mu Li's "Dive into Deep Learning" as its main study material; these are my study notes. My ability is limited, so corrections are welcome. Mu Li has also uploaded teaching videos and materials that you can study.
- Video: Dive into Deep Learning
- Textbook: Dive into Deep Learning

Convolutional Neural Networks (CNN): Implementing the Convolution Layer
We have already covered basic neural networks and some techniques for dealing with overfitting and underfitting. Now we formally begin studying convolutional neural networks. CNNs are a powerful class of neural networks designed for processing image data, and models built on convolutional architectures have dominated computer vision: today nearly all academic competitions and commercial applications involving image recognition, object detection, or semantic segmentation are based on them. One major challenge in computer vision is that the input can be very large. For example, a $64 \times 64$ image with 3 channels already has $64 \times 64 \times 3 = 12288$ features. To handle larger images we need convolution, which is the core operation of a convolutional neural network.
1. Introduction
Let's start with an example. Suppose we are given the following picture:
To let the computer figure out what is in this picture, the first thing we can do is detect the vertical edges in the image. For instance, the railing in this picture corresponds to vertical lines, and the silhouettes of the pedestrians are also, to some extent, vertical; these lines are what a vertical edge detector outputs. Likewise, we may want to detect horizontal edges, such as the clearly horizontal rails. So how do we detect these edges in an image?
We can construct a $3 \times 3$ matrix, called a filter or kernel. Let's walk through an example from Andrew Ng's course to see why this performs edge detection.

The figure above shows a simple $6 \times 6$ image whose left half has pixel value 10 and whose right half has pixel value 0. Viewed as a picture, the left part looks bright (10 is a brighter pixel value) and the right part looks dark; here 0 is drawn as gray, although it could also be drawn as black. There is a prominent vertical edge in the middle of the image, where it transitions from bright to dark. When we convolve this image with a $3 \times 3$ filter that has bright (positive) values in its left column, zeros in the middle, and dark (negative) values on the right, we obtain the matrix on the right, which lights up exactly at that edge. There are many other edge detection methods, which we will introduce later alongside specific computer vision tasks. Next, let's look at how convolution is computed.
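To make the figure concrete, here is a small reconstruction of the example (the pixel values and the [1, 0, -1] filter are the standard ones from Andrew Ng's course; the loop is the naive cross-correlation that we implement properly in section 3 below):

```python
import torch

# A 6x6 image: bright (10) on the left half, dark (0) on the right half
image = torch.tensor([[10.0, 10.0, 10.0, 0.0, 0.0, 0.0]] * 6)

# A classic 3x3 vertical-edge filter: positive column, zeros, negative column
kernel = torch.tensor([[1.0, 0.0, -1.0],
                       [1.0, 0.0, -1.0],
                       [1.0, 0.0, -1.0]])

# Naive cross-correlation: slide the filter and take inner products
h, w = kernel.shape
out = torch.zeros((image.shape[0] - h + 1, image.shape[1] - w + 1))
for i in range(out.shape[0]):
    for j in range(out.shape[1]):
        out[i, j] = (image[i:i+h, j:j+w] * kernel).sum()

print(out)  # every row is [0., 30., 30., 0.]
```

The large values (30) in the middle columns mark exactly where the bright-to-dark transition sits.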
2. The Convolution Operation
Let's look at how convolution is computed. Given an input matrix and a kernel, we start at the top-left corner of the input, take the inner product of that window with the kernel, then slide the window and compute the next inner product. The results form our output. The concrete calculation is:
$$0\times0+1\times1+3\times2+4\times3 = 19$$
$$1\times0+2\times1+4\times2+5\times3 = 25$$
$$3\times0+4\times1+6\times2+7\times3 = 37$$
$$4\times0+5\times1+7\times2+8\times3 = 43$$

As we can see, after the convolution the output is smaller than the raw input. Suppose the input matrix is $n \times n$ and the kernel is $f \times f$ (kernels are usually square); then the output is $(n-f+1) \times (n-f+1)$.
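As a quick sanity check of the output-size formula (a trivial helper of my own, not from the text):

```python
def conv_output_size(n: int, f: int) -> int:
    """Output side length of a valid convolution: n - f + 1."""
    return n - f + 1

print(conv_output_size(6, 3))  # 4: a 6x6 input with a 3x3 kernel gives a 4x4 output
print(conv_output_size(3, 2))  # 2: matches the 3x3-input, 2x2-kernel example above
```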
3. Code Implementation
3.1 Implementing Convolution from Scratch
""" Import related libraries """
import torch
from torch import nn
def corr2d(X,K):
""" Define convolution operations """
h, w = K.shape# Nuclear shape, Usually here h and w They are equal.
Y = torch.zeros((X.shape[0]-h+1, X.shape[1]-w+1))# Initialization output result
for i in range(Y.shape[0]):# The second part of the output matrix i That's ok
for j in range(Y.shape[1]):# The second part of the output matrix j Column
Y[i,j] = (X[i:i+h,j:j+w] * K).sum()# Calculate the corresponding inner product
return Y
Let's test whether it is correct:
X = torch.Tensor([[0.0,1.0,2.0],[3.0,4.0,5.0],[6.0,7.0,8.0]])
K = torch.Tensor([[0.0,1.0],[2.0,3.0]])
corr2d(X,K)
As you can see, this matches the result we computed by hand earlier.
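As an extra cross-check (my own addition), PyTorch's built-in F.conv2d, which also computes cross-correlation rather than a flipped mathematical convolution, gives the same numbers on this example:

```python
import torch
import torch.nn.functional as F

X = torch.tensor([[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]])
K = torch.tensor([[0.0, 1.0], [2.0, 3.0]])

# F.conv2d expects 4D tensors: (batch, channels, height, width)
out = F.conv2d(X.reshape(1, 1, 3, 3), K.reshape(1, 1, 2, 2))
print(out.reshape(2, 2))  # tensor([[19., 25.], [37., 43.]])
```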
3.2 Building a Convolution Layer
Earlier we saw how to construct linear layers, activation functions, dropout, and the like. Similarly, we define the convolution layer as a class:
class Conv2D(nn.Module):
    """A two-dimensional convolution layer"""
    def __init__(self, kernel_size):  # the kernel size is a hyperparameter
        super().__init__()
        self.weight = nn.Parameter(torch.rand(kernel_size))  # learnable weights
        self.bias = nn.Parameter(torch.zeros(1))  # learnable bias
    def forward(self, x):
        return corr2d(x, self.weight) + self.bias  # compute the convolution result
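A quick usage sketch (my own, with a hypothetical (2, 2) kernel size), showing that the layer produces the expected output shape; the weights here are random, so the output values themselves are not meaningful:

```python
import torch
from torch import nn

def corr2d(X, K):
    """2D cross-correlation, as defined in section 3.1"""
    h, w = K.shape
    Y = torch.zeros((X.shape[0] - h + 1, X.shape[1] - w + 1))
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            Y[i, j] = (X[i:i+h, j:j+w] * K).sum()
    return Y

class Conv2D(nn.Module):
    def __init__(self, kernel_size):
        super().__init__()
        self.weight = nn.Parameter(torch.rand(kernel_size))
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        return corr2d(x, self.weight) + self.bias

conv = Conv2D(kernel_size=(2, 2))
X = torch.ones(6, 6)
Y = conv(X)
print(Y.shape)  # torch.Size([5, 5]), i.e. (6 - 2 + 1) x (6 - 2 + 1)
```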
3.3 Detecting Edges in an Image
First we define an X, where 1 stands for white pixels and 0 for black:
X = torch.ones(6, 6)
X[:, 2:4] = 0.0
X
Now let's perform edge detection. Here we construct a $1 \times 2$ kernel:
K = torch.tensor([[1.0,-1.0]])
K
Y = corr2d(X,K)
Y
Here, 1 indicates a vertical edge going from white to black, and -1 a vertical edge going from black to white.
The kernel defined above can only detect vertical edges. Now suppose we transpose X and want to detect the (now horizontal) edges:
Y = corr2d(X.t(),K)
Y
As you can see, this kernel cannot detect horizontal edges; we need to transpose the kernel as well:
Y = corr2d(X.t(),K.t())
Y
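A compact end-to-end check of this section (my own sketch, using PyTorch's built-in F.conv2d so the snippet is self-contained):

```python
import torch
import torch.nn.functional as F

X = torch.ones(6, 6)
X[:, 2:4] = 0.0
K = torch.tensor([[1.0, -1.0]])

def cc(x, k):
    """Cross-correlate a 2D image with a 2D kernel via F.conv2d."""
    return F.conv2d(x.contiguous()[None, None], k.contiguous()[None, None])[0, 0]

# Along the rows of X.t() every 1x2 window is constant, so the output is all zeros
print(cc(X.t(), K).abs().sum())  # tensor(0.)
# Transposing the kernel as well recovers the edges: a row of 1s and a row of -1s
print(cc(X.t(), K.t()))
```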
3.4 Learning a Convolution Kernel
So far we have defined the convolution kernel by hand. But for more complicated tasks it is hard to specify a kernel directly, so let's see whether we can learn the kernel from the input and output matrices. Here the loss function is the squared error between Y and the convolution output.
# Construct a 2D convolution layer with 1 output channel and a kernel of shape (1, 2)
conv2d = nn.Conv2d(1, 1, kernel_size=(1, 2), bias=False)

# This layer uses a four-dimensional input/output format
# (batch size, channels, height, width); here batch size and channel count are both 1.
# Note that Y was last overwritten with the transposed result above,
# so we first recompute the 6x5 vertical-edge target.
Y = corr2d(X, K)
X = X.reshape((1, 1, 6, 6))
Y = Y.reshape((1, 1, 6, 5))
lr = 3e-2  # learning rate

for i in range(10):
    Y_hat = conv2d(X)  # compute the convolution
    l = (Y_hat - Y) ** 2  # squared-error loss
    conv2d.zero_grad()
    l.sum().backward()  # backpropagate to compute the gradient
    # update the kernel by gradient descent
    conv2d.weight.data[:] -= lr * conv2d.weight.grad
    if (i + 1) % 2 == 0:
        print(f'epoch {i+1}, loss {l.sum():.3f}')
As you can see, after 10 iterations the loss is already low. Let's look at the learned kernel parameters:
conv2d.weight
The result is very close to the kernel with entries 1 and -1 that we defined earlier.
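To double-check this claim end to end, here is a self-contained rerun of the training loop (my own condensed sketch: the target Y is rebuilt with F.conv2d, and a seed is fixed for reproducibility):

```python
import torch
from torch import nn
import torch.nn.functional as F

torch.manual_seed(0)  # fixed seed so the sketch is reproducible

# Rebuild the data from section 3.3: 6x6 input, 6x5 vertical-edge target
img = torch.ones(6, 6)
img[:, 2:4] = 0.0
K = torch.tensor([[1.0, -1.0]])
X = img.reshape(1, 1, 6, 6)
Y = F.conv2d(X, K.reshape(1, 1, 1, 2))  # target, shape (1, 1, 6, 5)

conv2d = nn.Conv2d(1, 1, kernel_size=(1, 2), bias=False)
lr = 3e-2
for i in range(10):
    l = ((conv2d(X) - Y) ** 2).sum()  # squared-error loss
    conv2d.zero_grad()
    l.backward()
    with torch.no_grad():  # plain gradient-descent step on the kernel
        conv2d.weight -= lr * conv2d.weight.grad

w = conv2d.weight.detach().reshape(2)
print(w)  # close to tensor([ 1., -1.])
```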
This chapter introduced how to compute a convolution and how to define a convolution layer. Later chapters will cover padding, multi-channel convolutions, pooling layers, and more.
That's all for this chapter. If it helped you, please like, bookmark, comment, and follow!