ML: image deep learning and convolutional neural networks

2022-07-25 15:53:00 【sword_ csdn】

References

Huawei Cloud College

Convolutional neural networks

A convolutional neural network (Convolutional Neural Network, CNN) is a feedforward neural network. It consists of convolution layers, pooling layers, and fully connected layers.
In the 1960s, while studying the neurons responsible for local sensitivity and orientation selectivity in the cat's cerebral cortex, Hubel and Wiesel found that their distinctive network structure could effectively reduce the complexity of feedforward neural networks. This finding later led to the proposal of the convolutional neural network.

Convolution operation


Convolution kernel calculation demonstration

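The calculation can be sketched in a few lines of plain Python. This is a minimal illustration, assuming a single-channel input, stride 1, and no padding; the function name `conv2d` and the 5x5 input and 3x3 kernel values are hypothetical, chosen only to make the arithmetic easy to follow. Like most deep learning libraries, the sketch does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the kernel with the patch under it, element by element, then sum.
            total = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            row.append(total)
        output.append(row)
    return output


# Hypothetical 5x5 input and 3x3 kernel, for illustration only.
image = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]
feature_map = conv2d(image, kernel)
print(feature_map)  # [[4, 3, 4], [2, 4, 3], [2, 3, 4]]
```

Each output value is the sum of the products between the kernel and the 3x3 patch it currently covers, so a 5x5 input and a 3x3 kernel give a 3x3 result.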

Concepts in convolutional networks

Convolution kernel (convolution kernel): An object that scans the image and performs the convolution calculation according to fixed rules. It is used to extract local features.
Convolution kernel size (kernel size): The convolution kernel is generally a 3-D matrix that can be pictured as a cuboid with width, height, and depth. The depth can be understood as the number of channels.
Feature map (feature map): The result matrix produced by a convolution kernel is a feature map. Each convolution kernel yields one layer of the feature map.
Feature map size (feature map size): The feature map is a 3-D matrix with width, height, and depth. The depth is determined by the number of convolution kernels in the current layer.
Stride (stride): The distance the convolution kernel moves each time it slides over the input image. If the kernel moves one pixel at a time, the stride is 1.
Zero padding (zero padding): To capture the edge information of the image and to ensure that the output feature map has the required size, the input image can be surrounded by a border of zeros. The width of this border in pixels is the padding.
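How kernel size, stride, and padding together determine the feature map size can be shown with the standard output-size formula, floor((input + 2 × padding − kernel) / stride) + 1. The sketch below is a minimal illustration; the helper names `conv_output_size` and `feature_map_shape` and the example sizes are hypothetical and do not come from any particular library.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output width (or height) of a convolution along one spatial dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


def feature_map_shape(in_h, in_w, kernel_size, num_kernels, stride=1, padding=0):
    """(height, width, depth) of the feature map; depth equals the number of kernels."""
    out_h = conv_output_size(in_h, kernel_size, stride, padding)
    out_w = conv_output_size(in_w, kernel_size, stride, padding)
    return (out_h, out_w, num_kernels)


print(conv_output_size(5, 3))                       # 3: no padding shrinks the output
print(conv_output_size(5, 3, padding=1))            # 5: zero padding keeps the size
print(conv_output_size(7, 3, stride=2))             # 3: a larger stride shrinks it faster
print(feature_map_shape(32, 32, 5, num_kernels=6))  # (28, 28, 6)
```

With a 3x3 kernel and stride 1, a padding of 1 keeps the output the same size as the input, which is one common reason for using zero padding.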

The core ideas of convolutional neural networks

Local perception. People generally come to understand the outside world from the local to the global, and the spatial relationships in an image are likewise strongest between neighboring pixels. A convolutional neural network therefore perceives local regions first and then combines them to obtain global information.
Parameter sharing. One or more filters scan the input image. Each filter's parameters are its weights w, and the same filter scans the entire image with w unchanged; this is parameter sharing. For example, with 3 filters, every filter scans the whole image while its parameter values stay fixed, so all positions in the image "share" the same w.
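The effect of parameter sharing is easiest to see by counting parameters. The sketch below compares a fully connected layer with a convolution layer of 3 filters; the 32x32x3 input size, the 100 fully connected units, and the 3x3 filter size are assumptions made only for this illustration, and the function names are hypothetical.

```python
def fully_connected_params(in_h, in_w, in_channels, num_units):
    """Weights plus biases when every unit connects to every input value."""
    return in_h * in_w * in_channels * num_units + num_units


def conv_layer_params(kernel_size, in_channels, num_filters):
    """Weights plus biases when each filter's weights are shared across all positions."""
    return kernel_size * kernel_size * in_channels * num_filters + num_filters


fc = fully_connected_params(32, 32, 3, num_units=100)
conv = conv_layer_params(3, in_channels=3, num_filters=3)
print(fc)    # 307300
print(conv)  # 84
```

The convolution layer's parameter count does not depend on the image size at all, because the same w is reused at every position.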

Convolutional neural network structure

Input layer: Receives the input data.
Convolution layer (convolution layer): Each convolution layer in a convolutional neural network is composed of several convolution units, and the parameters of each unit are optimized by the backpropagation algorithm. The purpose of convolution is to extract different features of the input. The first convolution layers can only extract low-level features such as edges, lines, and corners; networks with more layers can iteratively extract more complex features from these low-level features.
Activation function (activation function): Applies a nonlinear transformation to the output of the convolution. The most commonly used activation function is ReLU. It is not counted as a separate layer.
Pooling layer (pooling layer): Reduces the spatial size of the feature map, which reduces the number of parameters that later layers have to train.
Fully connected layer (fully connected layer): Combines all local features into global features. It is generally used to calculate the score of each category and acts as the classifier; a softmax activation function is typically used to turn the final scores into probabilities.
Output layer: Outputs the final result.
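The layers above can be chained into a forward pass. The following is a toy sketch in plain Python, assuming a single-channel input, one hand-picked (untrained) kernel, 2x2 max pooling, and two output classes; all names and values are hypothetical, and a real network would learn its weights by backpropagation.

```python
import math


def conv2d(image, kernel):
    """Convolution layer: stride 1, no padding, single channel."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(len(image[0]) - kw + 1)
        ]
        for i in range(len(image) - kh + 1)
    ]


def relu(feature_map):
    """Activation function: replace negative values with 0."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Pooling layer: keep the maximum of each non-overlapping size x size window."""
    return [
        [
            max(feature_map[i + di][j + dj]
                for di in range(size) for dj in range(size))
            for j in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for i in range(0, len(feature_map) - size + 1, size)
    ]


def fully_connected(features, weights, biases):
    """Fully connected layer: one score per class."""
    return [sum(w * x for w, x in zip(ws, features)) + b
            for ws, b in zip(weights, biases)]


def softmax(scores):
    """Turn scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [v / total for v in exps]


# Hypothetical 5x5 input containing a vertical edge, with hand-picked weights.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]                     # responds to dark-to-bright edges
pooled = max_pool(relu(conv2d(image, kernel)))  # 5x5 -> 4x4 -> 2x2
features = [v for row in pooled for v in row]   # flatten into a vector
weights = [[1, 0, 1, 0], [0, 1, 0, 1]]          # 2 classes, 4 features each
probs = softmax(fully_connected(features, weights, [0.0, 0.0]))
print(pooled, probs)
```

The kernel fires only where the edge is, pooling keeps that response while shrinking the feature map from 4x4 to 2x2, and the softmax output assigns most of the probability to the first class.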

ILSVRC

The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a technical competition organized by Stanford University. Held annually since 2010, the ILSVRC includes three tasks: image classification, single-object localization, and object detection.

AlexNet


VGGNet

Figure: the 6 configurations of VGG

Google's GoogLeNet

Figure: the Inception structure in GoogLeNet

This parallel structure

Microsoft's residual network ResNet

Figure: the residual structure in ResNet

The idea of the residual structure is to add a shortcut connection from the input of the convolution layers to their output, which can effectively alleviate the "vanishing gradient" problem.
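A toy sketch can make the effect on gradients concrete. It assumes the common formulation output = ReLU(F(x) + x), where F stands for the convolution layers. Here F is replaced by a hypothetical function with very small weights (`tiny_transform`) to imitate layers through which almost no gradient flows, and the gradient is estimated numerically. None of the names come from the ResNet implementation.

```python
def relu(values):
    """Replace negative values with 0."""
    return [max(0.0, v) for v in values]


def residual_block(x, transform):
    """Return relu(F(x) + x), where F is `transform` and `+ x` is the shortcut."""
    fx = transform(x)
    return relu([f + xi for f, xi in zip(fx, x)])


def numeric_gradient(fn, x, index, eps=1e-6):
    """Central-difference estimate of d(sum of fn(x)) / d x[index]."""
    up = list(x)
    up[index] += eps
    down = list(x)
    down[index] -= eps
    return (sum(fn(up)) - sum(fn(down))) / (2 * eps)


def tiny_transform(x):
    """Hypothetical F with tiny weights, standing in for the convolution layers."""
    return [0.001 * v for v in x]


x = [1.0, 2.0, 3.0]
plain_grad = numeric_gradient(lambda v: relu(tiny_transform(v)), x, 0)
residual_grad = numeric_gradient(lambda v: residual_block(v, tiny_transform), x, 0)
print(plain_grad, residual_grad)  # approximately 0.001 and 1.001
```

Without the shortcut, the gradient with respect to the input is only as large as the gradient of F. With the shortcut, the identity path adds 1 to it, so the signal reaching earlier layers does not shrink toward zero even when F contributes very little.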