Introduction to Convolutional Neural Networks
2022-06-13 01:07:00 【dddd_ jj】
If you are interested in convolutional neural networks, you can watch Professor Li Hongyi's videos on Bilibili; every time I watch them I come away with some new understanding.
What follows is my own understanding of convolutional neural networks.
First, you need to understand what the input to your task is. A convolutional neural network can then be viewed as a function that processes that input and produces the output you want. Viewed as a function, it is built from convolution operations, pooling operations, flattening operations, fully connected operations, and so on.
Convolution operation:
As shown in the figure, the 6×6 matrix is our input data, and the 3×3 Filter 1 is our convolution kernel. We first place the kernel over the corresponding region of the input matrix: the top-left 3×3 patch of the input is multiplied element-wise with the kernel and the products are summed. That is the so-called convolution operation.
As in the figure below, the first result is 3, namely 1*1 + (-1)*0 + (-1)*0 + (-1)*0 + 1*1 + (-1)*0 + 0*(-1) + 0*(-1) + 1*1 = 3. The kernel is then convolved with the next part of the input matrix to output another number. In the figure the stride is 1, meaning the window over the input matrix moves only one element to the right at a time.
The figure below shows the result of applying the convolution kernel above to the input matrix. You can compute the result yourself and check whether it matches the figure; if it does, you have understood the convolution operation.
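To make the multiply-and-add step concrete, here is a minimal NumPy sketch of the operation described above (strictly speaking this is cross-correlation, which is what deep learning libraries usually call convolution). Only the 3×3 Filter 1 and the top-left 3×3 patch of the input come from the calculation above; the rest of the 6×6 matrix is made up for illustration, since the figure itself is not reproduced here.

```python
import numpy as np

def conv2d(x, kernel, stride=1):
    """Valid 2D convolution as described in the post: slide the kernel over
    the input, multiply element-wise, and sum the products."""
    kh, kw = kernel.shape
    oh = (x.shape[0] - kh) // stride + 1
    ow = (x.shape[1] - kw) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = x[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * kernel)
    return out

# Filter 1 from the worked example above (a diagonal detector).
filter1 = np.array([[ 1, -1, -1],
                    [-1,  1, -1],
                    [-1, -1,  1]])

# Hypothetical 6x6 input: only the top-left 3x3 patch matches the hand
# calculation in the post; the remaining values are made up.
x = np.array([[1, 0, 0, 0, 0, 1],
              [0, 1, 0, 0, 1, 0],
              [0, 0, 1, 1, 0, 0],
              [1, 0, 0, 0, 1, 0],
              [0, 1, 0, 0, 1, 0],
              [0, 0, 1, 0, 1, 0]])

out = conv2d(x, filter1, stride=1)
print(out.shape)   # (4, 4): a 3x3 kernel over a 6x6 input with stride 1
print(out[0, 0])   # 3.0, matching the hand computation above
```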
Pooling operation:
There are pooling operations such as max pooling and average pooling.
Max pooling takes the current matrix, splits it into blocks, takes the maximum value within each block, and then puts those maxima together to form the final result.
Figure 2 below shows how max pooling is done.
The difference between average pooling and max pooling is that max pooling keeps the maximum value in each block, while average pooling takes the average value of each block.
Figure 1
Figure 2
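As a sketch of what Figures 1 and 2 describe, the snippet below pools a hypothetical 4×4 feature map over non-overlapping 2×2 blocks, keeping either the maximum or the average of each block. The input values and the block size of 2 are assumptions made for illustration.

```python
import numpy as np

def pool2d(x, block=2, mode="max"):
    """Split the matrix into non-overlapping block x block tiles and keep
    either the maximum ("max") or the average (anything else) of each tile."""
    h, w = x.shape
    out = np.zeros((h // block, w // block))
    for i in range(0, h - block + 1, block):
        for j in range(0, w - block + 1, block):
            tile = x[i:i + block, j:j + block]
            out[i // block, j // block] = tile.max() if mode == "max" else tile.mean()
    return out

# Hypothetical 4x4 feature map (e.g. the result of a convolution).
feat = np.array([[ 3., -1., -3.,  1.],
                 [-3.,  1.,  0., -3.],
                 [-3., -3.,  0.,  1.],
                 [ 3., -2., -2., -1.]])

print(pool2d(feat, 2, "max"))   # keeps the max of each 2x2 block -> [[3. 1.] [3. 1.]]
print(pool2d(feat, 2, "avg"))   # keeps the average of each 2x2 block
```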
Flattening operation:
That is, the matrix obtained after convolution, pooling, and similar operations (convolution and pooling can be applied many times) is flattened into a one-dimensional sequence. For example, after the matrix above is flattened, it looks like the figure below.
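A minimal sketch of flattening, assuming a hypothetical 2×2 pooled feature map (the values carry over from the pooling sketch above):

```python
import numpy as np

# Hypothetical 2x2 pooled feature map; flattening turns it into a 1-D vector
# so it can be fed into the fully connected layer.
pooled = np.array([[3., 1.],
                   [3., 1.]])

flat = pooled.flatten()   # equivalently: pooled.reshape(-1)
print(flat)               # [3. 1. 3. 1.]
```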
Fully connected operation:

The figure above shows the sequence of elements obtained after the flattening step. You can then choose the number of output elements, but "fully connected" means that every input element is connected to every output element (I remember the concept of a full connection being taught in discrete mathematics or some other course, which is the sense meant here). The formula is as follows:
output element = activation function(weights × input elements + bias)
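Here is a minimal sketch of that formula. The layer sizes, the random weights, and the choice of ReLU as the activation function are all assumptions made for illustration; the post does not fix a particular activation.

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def fully_connected(x, W, b):
    """output = activation(weights @ input + bias); every output element
    depends on every input element, which is what "fully connected" means."""
    return relu(W @ x + b)

# Hypothetical sizes: 4 flattened inputs -> 2 outputs.
rng = np.random.default_rng(0)
x = np.array([3., 1., 3., 1.])   # flattened vector from the previous step
W = rng.normal(size=(2, 4))      # one weight per (output, input) pair
b = np.zeros(2)                  # bias

print(fully_connected(x, W, b))
```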
Second, you have to set up the model's loss function.
The loss function measures how good your model is; you can look up common loss functions yourself. A common one is MSE (mean squared error).
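As an example, a minimal MSE implementation could look like this (the prediction and target values are made up):

```python
import numpy as np

def mse(y_pred, y_true):
    """Mean squared error: the average of the squared differences between
    the model's outputs and the target values."""
    return np.mean((y_pred - y_true) ** 2)

print(mse(np.array([0.9, 0.2]), np.array([1.0, 0.0])))  # 0.025
```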
Third, find the optimal parameters of the model.
What we really need in a convolutional neural network is to find the optimal parameters, so let's first look at which parameters need to be optimized.
1. Convolution layer: the specific values of the convolution kernels and the bias (after the convolution, the bias is added)
2. Pooling layer
3. Fully connected layer: the weights
The parameters above are first randomly initialized. Then, after training on a batch of data, we adjust them so that the loss function is minimized. The crudest approach would be to enumerate every possible parameter combination, but that would take forever, so we need methods that find good parameters quickly. Commonly used methods are gradient descent, Adam, and so on, which make the parameters converge to good values faster.
For the details of gradient descent, check other blogs or watch Professor Li Hongyi's videos.
Those are the basic concepts of CNNs. All of this is based on my personal understanding; if there are mistakes, please point them out in the comments.
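As a sketch of what "adjust the parameters to minimize the loss" means, the snippet below runs plain gradient descent on a toy one-parameter model. The model, data, and learning rate are made up for illustration and are far simpler than a real CNN, but the update rule is the same idea.

```python
import numpy as np

# Toy model y = w * x, trained with gradient descent on an MSE loss,
# just to show the update rule: w <- w - lr * dL/dw.
x = np.array([1., 2., 3.])
y_true = np.array([2., 4., 6.])   # the underlying relationship is y = 2x

w = np.random.randn()             # random initialization, as described above
lr = 0.05                         # learning rate

for step in range(100):
    y_pred = w * x
    loss = np.mean((y_pred - y_true) ** 2)        # MSE loss
    grad = np.mean(2 * (y_pred - y_true) * x)     # dL/dw
    w -= lr * grad                                # gradient descent update

print(w)  # converges towards 2.0
```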