Superficial understanding of CNN

2022-07-30 00:17:00 【OPTree412】

1. How a CNN works

CNNs were originally designed to imitate human vision. A convolution identifies the features of one part of an image at a time. For example, you tell birds from cats by their characteristic features: a bird's beak and wings, a cat's ears and paws. Recognizing objects by such features is very effective.

1.1 Basic understanding of calculation methods

First, set up a convolution kernel (the smaller square in the middle of the figure below). The kernel picks a patch of the image, multiplies each element of the patch by the corresponding kernel weight, and sums the products. It then repeats this over the whole image, moving from left to right and from top to bottom. With a 3x3 kernel, 9 elements become 1 element, so a large image is turned into a smaller one.
[Figure: a small convolution kernel placed over a patch of a larger image]
For the computation in motion, see the animation below.
[Animation: CNN computation]
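
The sliding-window computation described above can be sketched in plain Python. This is a minimal illustration, not how PyTorch implements it: the function name `conv2d_valid` and the sample values are assumptions made for the example, no padding is applied, and the kernel is not flipped (deep learning frameworks compute this cross-correlation and call it convolution).

```python
def conv2d_valid(image, kernel, stride=1):
    """Slide `kernel` over `image` (no padding) and return the output map."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = (len(image) - k_h) // stride + 1
    out_w = (len(image[0]) - k_w) // stride + 1
    output = []
    for i in range(out_h):              # top to bottom
        row = []
        for j in range(out_w):          # left to right
            total = 0
            for di in range(k_h):
                for dj in range(k_w):
                    # multiply each patch element by the matching kernel weight
                    total += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(total)           # 9 elements collapse into 1
        output.append(row)
    return output


# Hypothetical 4x4 image and 3x3 kernel, chosen only for illustration.
image = [
    [1, 2, 3, 0],
    [0, 1, 2, 3],
    [3, 0, 1, 2],
    [2, 3, 0, 1],
]
kernel = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
feature_map = conv2d_valid(image, kernel)
print(feature_map)  # [[3, 6], [0, 3]]
```

The 4x4 input shrinks to a 2x2 output because the 3x3 kernel fits in only two positions along each axis.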

The following figure shows what feature extraction by convolution looks like: features in the picture are gradually abstracted into a form the computer can work with.
[Figure: features extracted by successive convolutions]

1.2 CNN details (based on PyTorch)

1.2.1 Main structure of nn.Conv2d

This section covers the commonly used parameters. For other details, see the official PyTorch documentation.

Parameter	Type	Meaning
in_channels	int	The Channels dimension of the 4D input tensor [Batch_size, Channels, H, W], i.e. the number of image channels. This argument is needed to determine the shape of the learnable parameters such as the weights.
out_channels	int	The number of output channels. Put simply, feeding one image into the convolution produces out_channels smaller feature maps.
kernel_size	int or tuple	Size of the convolution kernel. For a square kernel such as 5x5 or 3x3, a single number is enough. For a non-square kernel such as 3x5, pass (3, 5).
stride	int or tuple, optional	Convolution stride, default 1. The number of cells the kernel moves across the image at each step.
padding	int or tuple, optional	Padding, default 0. Padding means filling in around the image, and the int value gives how many rows and columns are added. Note that padding is applied to the top, bottom, left, and right of the image. With padding = 1, a 5x5 image becomes 7x7 after padding, not 6x6. (See the animation in section 1.1.)
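
The padding row can be checked with a small pure-Python sketch. The helper `zero_pad` is a hypothetical name introduced for this example and only mimics zero padding on a 2D list; it is not part of PyTorch.

```python
def zero_pad(image, padding):
    """Add `padding` rows and columns of zeros on every side of a 2D list."""
    width = len(image[0]) + 2 * padding
    padded = [[0] * width for _ in range(padding)]              # top rows
    for row in image:
        padded.append([0] * padding + list(row) + [0] * padding)
    padded.extend([0] * width for _ in range(padding))          # bottom rows
    return padded


image = [[1] * 5 for _ in range(5)]     # a 5x5 image of ones
padded = zero_pad(image, 1)
print(len(padded), len(padded[0]))      # 7 7
```

Because one row or column is added on each of the four sides, both dimensions grow by 2 * padding, which is why 5x5 becomes 7x7 rather than 6x6.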

1.2.2 The image size calculation formula after convolution
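
No formula is given under this heading. The relation commonly used for a convolution without dilation is out = floor((in + 2 * padding - kernel_size) / stride) + 1, applied separately to height and width. The sketch below assumes that relation; `conv_output_size` is a hypothetical helper name, and dilation is assumed to be 1.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Output length along one axis, assuming dilation = 1."""
    return (size + 2 * padding - kernel_size) // stride + 1


# A 5x5 image with a 3x3 kernel, stride 1, padding 1 keeps its size.
print(conv_output_size(5, 3, stride=1, padding=1))  # 5
# A 6x6 image with a 3x3 kernel, stride 1, no padding gives 4x4.
print(conv_output_size(6, 3))                       # 4
```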

2. Convolutional neural networks

2.1 Neurons of Convolutional Neural Networks

When a convolution is actually computed, a single kernel is not moved step by step as in the animation in section 1.1; that would be too slow.

In practice, each neuron (that is, one application of the convolution) is responsible for one region. The next region (the area reached after one stride in the animation) is computed by another neuron.

In the figure below, the input image is 6x6x3, no other settings such as padding are used, and the stride is 1. There are then 16 neurons, and each kernel is 3x3x3.

Note: whatever kernel size you set, the kernel's depth always equals the depth of the input. For example, for a color image the kernel has a depth of 3, and for a grayscale image a depth of 1.
[Figure: a 6x6x3 input covered by 16 neurons, each with a 3x3x3 kernel]
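
The count of 16 neurons can be reproduced by listing every region a 3x3 kernel can cover on a 6x6 input. This is a sketch under the stated settings (stride 1, no padding); `receptive_fields` is a hypothetical helper that returns the top-left corner of each region.

```python
def receptive_fields(height, width, kernel_size, stride=1):
    """Top-left corner of every region covered by the kernel (no padding)."""
    fields = []
    for top in range(0, height - kernel_size + 1, stride):
        for left in range(0, width - kernel_size + 1, stride):
            fields.append((top, left))
    return fields


fields = receptive_fields(6, 6, 3)
weights_per_neuron = 3 * 3 * 3   # kernel height x width x input depth
print(len(fields))               # 16 neurons, arranged as a 4x4 grid
print(weights_per_neuron)        # 27 weights seen by each neuron
```

Each entry corresponds to one neuron in the sense used above: the neuron at (0, 0) handles the top-left 3x3x3 block, and the neuron at (3, 3) handles the bottom-right one.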

2.2 Characteristics of convolutional neural networks

1. Local perception
This property is illustrated by the third figure in section 1.1.
Advantages:
For a 1000 × 1000 input image, if the next hidden layer has 10^6 neurons, a fully connected layer needs 1000 × 1000 × 10^6 = 10^12 weight parameters. A parameter count that large is very hard to train.
With local connections instead, each neuron in the hidden layer is connected only to a 10 × 10 local patch of the image, so the number of weight parameters is 10 × 10 × 10^6 = 10^8, a reduction of 4 orders of magnitude. Compared with a fully connected network, this greatly reduces the amount of computation.
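
The arithmetic above can be verified directly. The variable names are chosen for this example only.

```python
image_pixels = 1000 * 1000        # 1000 x 1000 input image
hidden_neurons = 10 ** 6          # neurons in the next hidden layer

# Fully connected: every neuron sees every pixel.
fully_connected = image_pixels * hidden_neurons

# Locally connected: every neuron sees only a 10 x 10 patch.
local_patch = 10 * 10
locally_connected = local_patch * hidden_neurons

ratio = fully_connected // locally_connected
print(fully_connected)            # 1000000000000  (10^12)
print(locally_connected)          # 100000000      (10^8)
print(ratio)                      # 10000          (4 orders of magnitude)
```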

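2. Weight sharing
If a feature is useful when computed at one spatial position (x1, y1), it is also useful when computed at a different position (x2, y2). To keep the network from needing many neurons that all perform the same function at different positions, the neurons share their parameters.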
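
A small sketch of why sharing works: one set of weights responds identically to the same pattern wherever it appears. The 2x2 kernel, the 6x6 image, and the helper `correlate_at` are assumptions made up for this illustration.

```python
def correlate_at(image, kernel, top, left):
    """Response of `kernel` applied to the patch whose corner is (top, left)."""
    return sum(
        image[top + di][left + dj] * kernel[di][dj]
        for di in range(len(kernel))
        for dj in range(len(kernel[0]))
    )


kernel = [[1, -1], [-1, 1]]                 # hypothetical 2x2 feature detector
image = [[0] * 6 for _ in range(6)]
for top, left in [(0, 0), (3, 4)]:          # paint the same pattern at two positions
    for di in range(2):
        for dj in range(2):
            image[top + di][left + dj] = kernel[di][dj]

response_a = correlate_at(image, kernel, 0, 0)      # 4
response_b = correlate_at(image, kernel, 3, 4)      # 4
response_empty = correlate_at(image, kernel, 2, 1)  # 0

positions = (6 - 2 + 1) ** 2                # 25 places the kernel can sit
shared_weights = 2 * 2                      # one kernel reused everywhere
unshared_weights = positions * 2 * 2        # a separate kernel per position
print(response_a, response_b, response_empty)
print(shared_weights, unshared_weights)     # 4 100
```

The same four weights detect the pattern at both positions, so there is no need to learn a separate detector for each location.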


3. Activation functions and pooling layers

3.1 Activation functions

ReLU is generally used. Its advantages are fast convergence and a gradient that is simple to compute.
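
ReLU is defined as f(x) = max(0, x), which is why its gradient is cheap to compute: it is 1 for positive inputs and 0 for negative ones. A minimal sketch follows; treating the gradient at exactly 0 as 0 is a convention assumed here.

```python
def relu(x):
    """ReLU activation: pass positive values through, clamp the rest to 0."""
    return max(0.0, x)


def relu_grad(x):
    """Gradient of ReLU; the value at x == 0 is set to 0 by convention."""
    return 1.0 if x > 0 else 0.0


inputs = [-2.0, 0.0, 3.5]
outputs = [relu(v) for v in inputs]
grads = [relu_grad(v) for v in inputs]
print(outputs)  # [0.0, 0.0, 3.5]
print(grads)    # [0.0, 0.0, 1.0]
```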

3.2 Pooling layers

A pooling layer takes the average or the maximum (or minimum) value of a region. Its main purpose is to reduce the amount of computation, but as computing power keeps growing, pooling layers matter less for that purpose.
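
Max and average pooling over non-overlapping windows can be sketched as follows. The function `pool2d` and the sample 4x4 input are assumptions for illustration; the stride is taken to be equal to the window size.

```python
def pool2d(image, size, mode="max"):
    """Pool non-overlapping size x size windows with max or average."""
    if mode == "max":
        combine = max
    else:
        combine = lambda values: sum(values) / len(values)
    output = []
    for top in range(0, len(image) - size + 1, size):
        row = []
        for left in range(0, len(image[0]) - size + 1, size):
            window = [
                image[top + di][left + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(combine(window))
        output.append(row)
    return output


image = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [9, 2, 1, 0],
    [3, 4, 5, 6],
]
max_pooled = pool2d(image, 2, "max")
avg_pooled = pool2d(image, 2, "avg")
print(max_pooled)  # [[7, 8], [9, 6]]
print(avg_pooled)  # [[4.0, 5.0], [4.5, 3.0]]
```

Either way, the 4x4 input becomes 2x2, so later layers process a quarter as many values.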

4. About convolutional neural networks

In natural language processing, the input to a CNN is usually a word or sentence represented as a matrix of vectors.

The convolutional layer is an important component of a CNN. Each node in a convolutional layer takes only part of the previous layer as its input, and its purpose is to extract different features from the input image or text. For text sequence problems, convolutional layers usually use filters of different sizes to extract different features from the sequence.
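
For text, filters of different sizes can be pictured as windows spanning 2, 3, or 4 consecutive word vectors. The sketch below is a toy illustration: the 2-dimensional embeddings, the all-ones filters, the helper `conv1d_over_tokens`, and the final max over positions are all assumptions, not values or steps taken from this post.

```python
def conv1d_over_tokens(embeddings, filt):
    """Slide a filter spanning len(filt) tokens along the sentence matrix."""
    width = len(filt)
    outputs = []
    for start in range(len(embeddings) - width + 1):
        total = 0.0
        for offset in range(width):
            for dim, value in enumerate(embeddings[start + offset]):
                total += value * filt[offset][dim]
        outputs.append(total)
    return outputs


# Hypothetical sentence: 5 tokens, each a 2-dimensional vector.
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0], [1.0, 0.0]]

# One all-ones filter per window width (2, 3, and 4 tokens).
filters = {width: [[1.0, 1.0]] * width for width in (2, 3, 4)}

features = {w: conv1d_over_tokens(sentence, f) for w, f in filters.items()}
lengths = {w: len(v) for w, v in features.items()}
pooled = [max(v) for v in features.values()]   # strongest response per filter
print(lengths)  # {2: 4, 3: 3, 4: 2}
print(pooled)   # [3.0, 4.0, 4.0]
```

Wider filters look at longer phrases and produce shorter feature sequences; taking the maximum of each sequence yields one value per filter regardless of sentence length.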

The pooling layer reduces the input dimension of the network model, which lowers the model's complexity and the number of parameters needed in later layers, makes the model more robust, and to some extent helps prevent overfitting. The most common pooling methods are max pooling and average pooling.

A fully connected layer is usually added after the convolutional and pooling layers. It maps the high-dimensional features to a lower dimension while keeping the useful information. The convolutional and pooling layers are generally regarded as performing automatic feature extraction; once feature extraction is complete, an output layer is needed for the classification or prediction task.

The learned high-dimensional feature representation is generally fed to the output layer, where the Softmax function computes the probability that the current sample belongs to each class.
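
Softmax turns the output layer's raw scores into probabilities that sum to 1. A minimal sketch, assuming three hypothetical class scores; subtracting the maximum before exponentiating is a common numerical-stability step and does not change the result.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)                       # avoids overflow in exp
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])              # hypothetical scores for 3 classes
predicted_class = probs.index(max(probs))
print(probs)
print(predicted_class)                        # 0
```

The class with the highest score receives the highest probability, and the ordering of the scores is preserved.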

