Superficial understanding of CNN
2022-07-30 00:17:00 【OPTree412】
1. How a CNN works
CNNs were originally designed as models that imitate human vision. A convolution kernel identifies the features of one local part of an image. For example, you tell birds from cats by their characteristic features: a bird's beak and wings, a cat's ears and paws. Recognizing objects by such features is very effective.
1.1 Basic understanding of the computation
First, set up a convolution kernel (the smaller square in the middle of the image below). Place the kernel on a patch of the image, multiply each element of the patch by the corresponding kernel weight, and sum the products. Then slide the kernel over the image from left to right and from top to bottom, repeating the calculation at each position. With a 3x3 kernel, 9 elements become 1 element, so a large image is turned into a smaller one.
For an animated view of the computation, see the figure "CNN computation animation".
The figure below shows how convolution extracts features: features in the picture are gradually abstracted into a representation the computer can work with.
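The sliding-window computation described in this section can be sketched in plain Python. This is a minimal illustration, not how PyTorch implements convolution: the function name `conv2d_single`, the 4x4 image, and the 3x3 kernel values are made up for the example, and it assumes a square kernel, a single channel, and no padding.

```python
def conv2d_single(image, kernel, stride=1):
    """Slide a square kernel over a 2-D image (lists of lists), no padding."""
    k = len(kernel)  # assumption: the kernel is k x k
    out_h = (len(image) - k) // stride + 1
    out_w = (len(image[0]) - k) // stride + 1
    output = []
    for i in range(out_h):          # top to bottom
        row = []
        for j in range(out_w):      # left to right
            total = 0
            # Multiply each patch element by the matching kernel weight and sum.
            for di in range(k):
                for dj in range(k):
                    total += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Made-up example data: a 4x4 image and a 3x3 kernel.
image = [
    [1, 2, 3, 0],
    [0, 1, 2, 3],
    [3, 0, 1, 2],
    [2, 3, 0, 1],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]

result = conv2d_single(image, kernel)
print(result)  # [[9, 6], [4, 9]]
```

Each output value comes from 9 input values, and the 4x4 image shrinks to 2x2.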
1.2 CNN details (based on PyTorch)
1.2.1 Main structure of nn.Conv2d
This section introduces the commonly used parameters. For the remaining ones, see the official documentation.
| Parameter | Type | Meaning |
|---|---|---|
| in_channels | int | Number of channels of the input, i.e. Channels in the 4-D input tensor [Batch_size, Channels, H, W] (the number of image channels). This argument is required to determine the shape of the learnable parameters such as the weights. |
| out_channels | int | Number of output channels. Put simply, feeding one image into the convolution produces out_channels smaller images (feature maps). |
| kernel_size | int or tuple | Size of the convolution kernel. For a square kernel such as 5x5 or 3x3, a single integer is enough. For a kernel such as 3x5, pass (3, 5). |
| stride | int or tuple, optional | Stride of the convolution, default 1. It is the number of cells the kernel moves across the image at each step. |
| padding | int or tuple, optional | Padding, default 0. The integer gives how many rows and columns are added on each side of the image. Note that padding is applied at the top, bottom, left, and right: with padding = 1, a 5x5 image becomes 7x7 after padding, not 6x6. (See the animation referenced above.) |
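As a short illustration of the parameters in the table, the sketch below builds one `nn.Conv2d` layer and inspects the shapes. The concrete values (3 input channels, 16 output channels, a 3x5 kernel, a batch of 8 images of size 32x32) are assumptions chosen for the example, not values from this article.

```python
import torch
import torch.nn as nn

# Illustrative values only: RGB input, 16 output feature maps, a 3x5 kernel.
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=(3, 5), stride=1, padding=1)

# Input tensor laid out as [Batch_size, Channels, H, W].
x = torch.randn(8, 3, 32, 32)
y = conv(x)

# The weight shape is [out_channels, in_channels, kernel_h, kernel_w].
print(conv.weight.shape)  # torch.Size([16, 3, 3, 5])
# padding = 1 adds one row/column on every side before the kernel slides.
print(y.shape)            # torch.Size([8, 16, 32, 30])
```

The height stays at 32 because a kernel height of 3 with padding 1 preserves the size, while the kernel width of 5 with the same padding shrinks the width to 30.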
1.2.2 Formula for the image size after convolution
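The formula itself is not reproduced in the text here. The PyTorch documentation for nn.Conv2d gives, for dilation = 1, an output size along one dimension of floor((input + 2 * padding - kernel_size) / stride) + 1. The helper below is a small sketch of that formula. The name `conv_output_size` is made up for this example.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Output length along one dimension of a convolution, assuming dilation = 1."""
    return (size + 2 * padding - kernel_size) // stride + 1


print(conv_output_size(5, 3))                        # 3: a 5x5 image, 3x3 kernel
print(conv_output_size(5, 3, padding=1))             # 5: padded to 7x7 first
print(conv_output_size(6, 3))                        # 4
print(conv_output_size(32, 5, stride=2, padding=2))  # 16
```

Apply the function once for the height and once for the width when the kernel, stride, or padding differ between the two dimensions.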
2. Convolutional neural networks
2.1 Neurons of a convolutional neural network
In practice, a convolution is not computed by moving a single kernel step by step as in the animation above. That would be too slow.
Instead, each neuron (that is, one application of the kernel) is responsible for one region. The next region (in the animation, the region reached after one stride step) is computed by a different neuron.
As shown in the figure below, the input image is 6x6x3, there are no other requirements, and the stride is 1. There are then 16 neurons, and each kernel has size 3x3x3.
Note: whatever kernel size you set, the kernel's depth is the same as that of the input. For example, for a color input image the kernel has depth 3, and for a grayscale input image the kernel has depth 1.
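One way to read the numbers in this example, assuming no padding and one neuron per output position, is the arithmetic below. The variable names are chosen for the sketch only.

```python
in_h, in_w, in_depth = 6, 6, 3  # input image from the example: 6x6x3
k = 3                           # kernel height and width
stride = 1

# Output size without padding.
out_h = (in_h - k) // stride + 1
out_w = (in_w - k) // stride + 1

neurons = out_h * out_w                # one neuron per output position
weights_per_neuron = k * k * in_depth  # the kernel spans the full input depth

print(out_h, out_w)        # 4 4
print(neurons)             # 16
print(weights_per_neuron)  # 27
```

Under this reading, the 16 neurons are the 4x4 output positions of a single kernel, and each one looks at a 3x3x3 block of the input.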
2.2 Characteristics of convolutional neural networks
(1) Local perception
This property can be seen in the third image of Section 1.1 above.
Advantage:
For a 1000×1000 input image, if the next hidden layer has 10^6 neurons, full connection requires 1000×1000×10^6 = 10^12 weight parameters. A parameter count this large is very hard to train.
With local connections instead, each neuron in the hidden layer is connected only to a 10×10 local patch of the image. The number of weight parameters is then 10×10×10^6 = 10^8, a reduction of 4 orders of magnitude. Compared with a fully connected neural network, this greatly reduces the amount of computation.
(2) Weight sharing
If a feature is useful when computed at one spatial position (x1, y1), it is also useful when computed at a different position (x2, y2). To keep the network from containing many neurons that duplicate the same function, the parameters are shared.
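The parameter counts quoted above can be checked with a few lines of arithmetic. The last figure is an added illustration rather than a number from the article: it assumes that all hidden neurons share one 10×10 kernel and that biases are ignored.

```python
image_pixels = 1000 * 1000  # 1000 x 1000 input image
hidden_neurons = 10 ** 6    # neurons in the next hidden layer
patch_pixels = 10 * 10      # local 10 x 10 patch seen by each neuron

fully_connected = image_pixels * hidden_neurons    # every neuron sees every pixel
locally_connected = patch_pixels * hidden_neurons  # every neuron sees one patch
shared = patch_pixels                              # assumption: one shared kernel, no bias

print(f"{fully_connected:.0e}")    # 1e+12
print(f"{locally_connected:.0e}")  # 1e+08
print(shared)                      # 100
```

Local connections shrink the weight count by 4 orders of magnitude, and sharing one kernel across all positions makes the count independent of the number of neurons.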

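3. Activation functions and pooling layers
3.1 Activation function
ReLU is the usual choice. Its advantages are that training converges quickly and the gradient is simple to compute.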
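To show why the gradient is simple, here is a minimal sketch of ReLU and its derivative in plain Python. The function names are made up for the example, and the derivative at exactly 0 is set to 0 by convention.

```python
def relu(x):
    """ReLU: pass positive values through, clamp negatives to zero."""
    return max(0.0, x)


def relu_grad(x):
    """Derivative of ReLU: 1 for x > 0, otherwise 0 (0 chosen at x == 0)."""
    return 1.0 if x > 0 else 0.0


values = [-2.0, -0.5, 0.0, 1.5, 3.0]
print([relu(v) for v in values])       # [0.0, 0.0, 0.0, 1.5, 3.0]
print([relu_grad(v) for v in values])  # [0.0, 0.0, 0.0, 1.0, 1.0]
```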
3.2 Pooling layer
Take the average, the maximum, or the minimum of a region. The main purpose is to reduce the amount of computation, but as computing power keeps growing, the pooling layer has become less important.
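A minimal sketch of max and average pooling over non-overlapping 2x2 windows follows. The function name `pool2d` and the 4x4 feature map are made up for illustration. The sketch assumes the stride equals the window size.

```python
def pool2d(feature_map, size=2, mode="max"):
    """Pool non-overlapping size x size windows of a 2-D list (stride = size)."""
    combine = max if mode == "max" else (lambda w: sum(w) / len(w))
    out = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            # Collect the values of one window, then reduce them to one number.
            window = [feature_map[i + di][j + dj]
                      for di in range(size) for dj in range(size)]
            row.append(combine(window))
        out.append(row)
    return out


# Made-up 4x4 feature map.
fmap = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [4, 2, 3, 1],
    [0, 6, 5, 9],
]

print(pool2d(fmap, mode="max"))      # [[7, 8], [6, 9]]
print(pool2d(fmap, mode="average"))  # [[4.0, 5.0], [3.0, 4.5]]
```

Either way, the 4x4 map is reduced to 2x2, so later layers have a quarter as many values to process.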
4. About convolutional neural networks
In natural language processing, the input to a CNN is usually a word or sentence represented as a matrix of vectors.
The convolutional layer is an important component of a CNN. Each node in a convolutional layer takes only a part of the previous layer as input, and its purpose is to extract different features of the input image or text. When handling text sequences, convolutional layers usually use filters of different sizes to extract different features from the sequence.
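The idea of using filters of different sizes on text can be sketched as follows. This is an illustration under assumptions, not the article's implementation: the sentence is a made-up matrix of 5 words with 2-dimensional embeddings, each filter spans the full embedding dimension and covers 2 or 3 consecutive words, and all filter weights are set to 1 to keep the arithmetic easy to follow.

```python
def text_conv(sentence, filt):
    """Slide a filter of shape (h, dim) down a sentence matrix of shape (n, dim)."""
    h = len(filt)
    outputs = []
    for start in range(len(sentence) - h + 1):
        total = 0.0
        for r in range(h):
            for c in range(len(filt[0])):
                total += sentence[start + r][c] * filt[r][c]
        outputs.append(total)
    return outputs


# Made-up data: 5 words, each a 2-dimensional embedding.
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [0.0, 2.0]]

# Hypothetical filters covering 2 and 3 consecutive words.
filters = {
    2: [[1.0, 1.0] for _ in range(2)],
    3: [[1.0, 1.0] for _ in range(3)],
}

features = {h: text_conv(sentence, f) for h, f in filters.items()}
print(features[2])  # [2.0, 3.0, 3.0, 3.0]
print(features[3])  # [4.0, 4.0, 5.0]
```

A filter covering h words yields n - h + 1 values for a sentence of n words, so filters of different sizes produce feature sequences of different lengths.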
The pooling layer reduces the input dimension of the network, which lowers the model's complexity and the total number of parameters, makes the model more robust, and to some extent helps prevent overfitting. The most common pooling methods are max pooling and average pooling.
A CNN usually adds a fully connected layer after the convolutional and pooling layers. This layer maps the high-dimensional features to a lower dimension while keeping the useful information. The convolutional and pooling layers are usually regarded as the stage that extracts features automatically. After feature extraction, an output layer is needed for classification or prediction tasks.
The learned high-dimensional feature representation is generally fed to the output layer, where the Softmax function computes the probability that the current sample belongs to each class.
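The Softmax step can be sketched in a few lines of plain Python. The scores below are made-up outputs for three classes, and subtracting the maximum score is a common numerical-stability trick rather than something described in this article.

```python
import math


def softmax(scores):
    """Turn a list of raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up output-layer scores for three classes.
probs = softmax([2.0, 1.0, 0.1])
print(probs)
print(sum(probs))
```

The class with the largest score receives the largest probability, and the predicted class is the index of the maximum.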