Neural Network Learning (IV): A Brief Summary of Each Layer of a Neural Network
2022-06-09 03:10:00 【Red date oats】
**Convolution layer**
Why does the number of channels increase after convolution?
Answer: the number of output channels equals the number of convolution kernels; each kernel produces one output map, and the maps are stacked after convolution, as shown in the figure below.
A convolution layer extracts features, but what features does it actually extract?
I am not entirely clear on this myself, so I can only offer my own understanding; if you know better, I would love to hear from you in the comments so we can learn from each other.
My personal view: feature extraction by convolution is not the same as feature extraction in the usual image-processing sense; it is, first of all, a convolution operation. Convolution extracts features from the image's pixels. Take an RGB image: the human eye can directly recognize its content, but the computer only sees three channels of pixel values, red, green, and blue. Convolution slides kernels over the pixels of the three channels, and by continually adjusting the kernel parameters to produce the final result, the network comes to capture the features contained in the image.
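To make the channel-count point concrete, here is a minimal sketch (the naive loop implementation and all shapes are my own, for illustration only): each kernel spans all input channels and produces one output map, so stacking the maps of K kernels yields K output channels.

```python
import numpy as np

def conv2d(image, kernels):
    """image: (C_in, H, W); kernels: (K, C_in, kh, kw) -> output (K, H-kh+1, W-kw+1)."""
    K, C_in, kh, kw = kernels.shape
    C, H, W = image.shape
    out = np.zeros((K, H - kh + 1, W - kw + 1))
    for k in range(K):
        for i in range(H - kh + 1):
            for j in range(W - kw + 1):
                # sum over ALL input channels -> one scalar per output position
                out[k, i, j] = np.sum(image[:, i:i+kh, j:j+kw] * kernels[k])
    return out

rgb = np.random.rand(3, 8, 8)          # a 3-channel (RGB) input
kernels = np.random.rand(5, 3, 3, 3)   # 5 kernels, each spanning all 3 input channels
print(conv2d(rgb, kernels).shape)      # (5, 6, 6): 5 output channels, one per kernel
```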
There are many kinds of convolution operations; here is one example:
For a 4×4 image, we use two 2×2 convolution kernels with a stride of 1, i.e. the 2×2 window slides one unit to the right at each step.
Formula for the feature_map size: [(original image size − kernel size) / stride] + 1
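The formula can be checked with a small helper (a sketch; the function name is mine):

```python
def feature_map_size(image_size, kernel_size, stride=1):
    # [(original image size - kernel size) / stride] + 1
    return (image_size - kernel_size) // stride + 1

# The example from the text: a 4x4 image with a 2x2 kernel, stride 1 -> 3x3 map
print(feature_map_size(4, 2))            # 3
print(feature_map_size(7, 3, stride=2))  # 3
```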
Back to the question above: why can convolution extract features?
The feature_map computed by the first kernel is a 3×3 array in which the third column has the largest absolute values. This indicates a vertical feature at the corresponding location of the original image, i.e. the pixel values change sharply there. In the feature_map computed by the second kernel, the third column is 0 and the second row has the largest absolute values, indicating a horizontal feature at the corresponding location of the original image.
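Since the original figure with the concrete kernel values is not reproduced here, the following hypothetical example makes the same point: a 4×4 image containing a vertical edge, convolved with two hand-picked 2×2 kernels (the image and kernel values are my own, chosen only to illustrate the effect).

```python
import numpy as np

def valid_conv(img, k):
    kh, kw = k.shape
    H, W = img.shape
    return np.array([[np.sum(img[i:i+kh, j:j+kw] * k)
                      for j in range(W - kw + 1)]
                     for i in range(H - kh + 1)])

img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 0, 1, 1]], dtype=float)  # vertical edge between columns 1 and 2

vertical_k   = np.array([[1., -1.], [1., -1.]])   # responds to left-right change
horizontal_k = np.array([[1., 1.], [-1., -1.]])   # responds to up-down change

v = valid_conv(img, vertical_k)
h = valid_conv(img, horizontal_k)
print(v)  # large |values| only in the column where the edge sits
print(h)  # all zeros: the image contains no horizontal edge
```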
**Receptive field**
In machine vision, deep neural networks have a concept called the receptive field, which describes how much of the original image a neuron at a given position in the network can perceive. A neuron cannot perceive the whole original image because convolution and pooling layers are locally connected (via a sliding filter). The larger a neuron's receptive field, the larger the region of the original image it can reach, which means it may carry more global, higher-level semantic features; the smaller the receptive field, the more local and detailed its features. The receptive field can therefore serve as a rough measure of the abstraction level of each layer.
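The receptive field of a stack of layers can be computed with the standard recurrence RF ← RF + (k − 1)·j, where k is the layer's kernel size and j is the product of the strides of all earlier layers (a minimal sketch; the example layer configuration is hypothetical):

```python
def receptive_field(layers):
    """layers: list of (kernel_size, stride), from input to output.
    Returns the receptive field of one output unit on the original image."""
    rf, jump = 1, 1  # jump = distance on the input between adjacent units at this depth
    for k, s in layers:
        rf += (k - 1) * jump
        jump *= s
    return rf

# hypothetical stack: conv3x3/s1 -> pool2x2/s2 -> conv3x3/s1
print(receptive_field([(3, 1), (2, 2), (3, 1)]))  # 8
```

Note how the stride of the pooling layer doubles the contribution of every layer after it, which is why deeper layers see much larger regions of the image.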
**Network degradation**
An example: suppose the optimal network structure is 18 layers deep. When we design a network we do not know in advance how many layers are optimal, so suppose we build a 34-layer network. The extra 16 layers are then redundant. Ideally, training would drive those 16 layers to an identity mapping, i.e. the input and output of each of them would be exactly the same. In practice, however, the model finds it hard to learn the identity-mapping parameters for these 16 layers correctly, so the deeper network ends up performing no better than the optimal 18-layer one. This is degradation: as the network depth increases, the model degrades. It is not caused by overfitting, but by the redundant layers failing to learn the parameters of the identity mapping.
Solution to network degradation: use a residual network (ResNet). A brief sketch follows (F(x) is the residual):
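A minimal numerical sketch of a residual block (the two-layer form of F(x) and the weight shapes are my own choice): the skip connection adds the input x back onto the residual F(x), so when a redundant layer only needs to be an identity mapping, the network can get one simply by driving F(x) toward zero, which is far easier than learning identity weights directly.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0)

def residual_block(x, w1, w2):
    """y = relu(F(x) + x), where F(x) = w2 @ relu(w1 @ x) is the residual branch."""
    f = w2 @ relu(w1 @ x)
    return relu(f + x)  # skip connection: add the input back

x = np.ones(4)
# With all-zero weights the block computes F(x) = 0, i.e. an identity mapping:
w_zero = np.zeros((4, 4))
print(residual_block(x, w_zero, w_zero))  # [1. 1. 1. 1.] -- the input passes through
```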



Note: in a CNN, the early convolution layers account for a small share of the parameters but a large share of the computation, while the fully connected layers at the end are the opposite; most CNNs share this characteristic. Therefore, when optimizing for computational speed, focus on the convolution layers; when optimizing parameter count or pruning weights, focus on the fully connected layers.
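A back-of-the-envelope check of this point (the layer sizes are illustrative, loosely VGG-like, and the FLOP counting is deliberately rough):

```python
def conv_stats(c_in, c_out, k, out_h, out_w):
    params = c_out * (c_in * k * k + 1)   # weights + biases
    flops = params * out_h * out_w        # the same weights are reused at every position
    return params, flops

def fc_stats(n_in, n_out):
    params = n_out * (n_in + 1)
    return params, params                 # each weight is used exactly once

# an early conv layer: 3 -> 64 channels, 3x3 kernel, 224x224 output
cp, cf = conv_stats(3, 64, 3, 224, 224)
# a fully connected layer: 4096 -> 4096
fp, ff = fc_stats(4096, 4096)
print(cp, fp)  # the conv layer has far fewer parameters...
print(cf, ff)  # ...but far more computation
```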
**Pooling layer** (also called subsampling or downsampling; it reduces the dimensionality of the features)
Pooling comes in two varieties: max pooling takes the maximum value in the window, and average pooling takes the mean of the window.
Purpose: to compress the input feature map. On the one hand this makes the feature map smaller, simplifying the network's computation; on the other it compresses the features, extracting the main ones.
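Both variants can be sketched in a few lines (the 2×2 window, stride 2, and the example input are my own choices):

```python
import numpy as np

def pool2x2(x, mode="max"):
    """2x2 pooling with stride 2: halves each spatial dimension."""
    H, W = x.shape
    out = np.zeros((H // 2, W // 2))
    for i in range(0, H, 2):
        for j in range(0, W, 2):
            win = x[i:i+2, j:j+2]
            out[i // 2, j // 2] = win.max() if mode == "max" else win.mean()
    return out

x = np.array([[1, 3, 2, 0],
              [4, 2, 1, 1],
              [0, 0, 5, 6],
              [0, 0, 7, 8]], dtype=float)
print(pool2x2(x, "max"))   # [[4. 2.] [0. 8.]]
print(pool2x2(x, "mean"))  # [[2.5 1. ] [0.  6.5]]
```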

**Zero padding layer**
If layers keep being stacked, the feature map becomes smaller and smaller; at this point a zero padding layer (Zero Padding) is added to preserve the spatial size.
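Extending the earlier size formula with a padding term shows the effect (a sketch; the function name is mine):

```python
def padded_output_size(n, k, p, s=1):
    # [(input size - kernel size + 2*padding) / stride] + 1
    # With p = (k - 1) // 2 (odd k, stride 1) the spatial size is preserved.
    return (n - k + 2 * p) // s + 1

print(padded_output_size(32, 3, p=0))  # 30 -- shrinks without padding
print(padded_output_size(32, 3, p=1))  # 32 -- zero padding keeps the size
```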
**Fully connected layer**
It connects all the features and sends the output to a classifier (commonly a softmax classifier, which computes the class probabilities).
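A minimal sketch of this last step (the feature values, weight shapes, and class count are hypothetical): every input feature connects to every output, and softmax turns the outputs into probabilities.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())   # subtract the max for numerical stability
    return e / e.sum()

def fully_connected(features, w, b):
    return w @ features + b   # every input connects to every output

feats = np.array([0.5, -1.2, 3.0])   # flattened feature vector
w = np.random.randn(4, 3)            # 4 classes, 3 features
b = np.zeros(4)
probs = softmax(fully_connected(feats, w, b))
print(probs, probs.sum())            # 4 class probabilities summing to 1
```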

General structure of a convolutional network:
**Feature fusion**
Many works improve detection and segmentation performance by fusing features from multiple layers. According to the order of fusion and prediction, these methods are classified as early fusion and late fusion.
Early fusion: first fuse the features of multiple layers, then train a predictor on the fused features (prediction happens in one unified step, only after full fusion). These methods, also known as skip connections, use the concat and add operations. Representative examples of this idea are Inside-Outside Net (ION) and HyperNet.
Two classic feature fusion operations:
(1) concat: series fusion, which directly concatenates the two features. If the two input features x and y have dimensions p and q, the output feature z has dimension p + q;
(2) add: a parallel strategy that combines the two feature vectors into a complex vector: for input features x and y, z = x + iy, where i is the imaginary unit. In practice this amounts to element-wise addition, so the two features must have the same dimension and the output keeps that dimension.
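The dimensional difference between the two operations is easy to demonstrate (the feature shapes are hypothetical):

```python
import numpy as np

x = np.random.rand(4, 8, 8)   # feature map: (channels, H, W)
y = np.random.rand(6, 8, 8)

# concat: channel counts add up (p + q)
z_cat = np.concatenate([x, y], axis=0)
print(z_cat.shape)            # (10, 8, 8)

# add: element-wise sum; shapes must match and the channel count is unchanged
y2 = np.random.rand(4, 8, 8)
z_add = x + y2
print(z_add.shape)            # (4, 8, 8)
```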
Late fusion: improve detection performance by combining the detection results from different layers (detection starts before the final fusion, at partially fused layers, so there are multiple detection stages whose results are fused at the end). Work in this line follows two ideas:
(1) do not fuse the features; predict separately on multi-scale features and then combine the predictions, e.g. Single Shot MultiBox Detector (SSD) and Multi-scale CNN (MS-CNN);
(2) fuse the features in a pyramid and predict after fusion, e.g. Feature Pyramid Network (FPN).