Analysis of neural network
2022-07-02 21:33:00 【caiggle】
Preface: The study of neural networks began as early as the 1940s. Today it has grown into a large, interdisciplinary field.
One. Neuron model
Because neural networks are interdisciplinary, they have been defined in different ways. We use the most widely used definition, given by Kohonen in 1988: neural networks are massively parallel interconnected networks of simple adaptive units, whose organization can simulate the way the biological nervous system interacts with real-world objects.
Let's take a look at the classic, long-standing M-P (McCulloch-Pitts) neuron model.
To aid understanding, here is a simple explanation of the model: the neuron receives input signals x1, x2, ..., xn from n other neurons, each transmitted over a connection with weight w1, w2, ..., wn. The total weighted input is compared with the neuron's threshold θ (theta), and the output is produced by passing the result through an activation function.
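Common activation functions:
1. Step function
% Unit step: 0 for x < 0, 1 for x >= 0
x = linspace(-1, 1, 201);
y = double(x >= 0);
plot(x, y);
ylim([-0.2 1.2]);   % leave room so the two flat segments are visible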
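2. Sigmoid function
% Sigmoid squashes any real input into the interval (0, 1)
x = -10:0.1:10;   % a narrow range keeps the S-shape visible
y = 1 ./ (1 + exp(-x));
plot(x, y);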
3. tanh function
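In the same spirit as the two plots above, here is a minimal sketch for tanh. The input range and the variable names (`sigmoid`, `y_from_sigmoid`) are my own choices for illustration. The second line checks the standard identity tanh(x) = 2·sigmoid(2x) − 1, which shows that tanh is a rescaled sigmoid with outputs in (−1, 1).

```matlab
% tanh squashes any real input into the interval (-1, 1)
x = -5:0.1:5;
y = tanh(x);

% tanh is a shifted and rescaled sigmoid: tanh(x) = 2*sigmoid(2x) - 1
sigmoid = @(z) 1 ./ (1 + exp(-z));
y_from_sigmoid = 2 * sigmoid(2 * x) - 1;

plot(x, y);
```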
…… There are many other activation functions, which are not listed here. It should be pointed out that the unit step function is the ideal activation function, but because it is discontinuous and not smooth, its mathematical properties are poor, so in most cases we use other functions in place of the unit step function.
Two. Perceptron and multilayer feedforward neural network
To better understand the perceptron and multilayer networks, we can first establish the concepts of logical AND, OR, NOT, XOR (exclusive or), NOR and NAND.
AND: outputs 1 only when all inputs are 1; outputs 0 as soon as any input is 0.
OR: outputs 1 as soon as any input is 1; outputs 0 only when no input is 1.
NOT: the output is the negation of the input, that is, 0 becomes 1 and 1 becomes 0.
XOR: for a two-input gate, outputs 0 when the inputs are the same and 1 when they differ.
NOR: for a two-input gate, outputs 1 only when neither input is 1; otherwise outputs 0.
NAND: outputs 0 only when all inputs are 1; otherwise outputs 1.
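The six definitions above can be checked with a small truth table. This is only a sketch using MATLAB's built-in logical operators; the variable names (`g_and`, `g_or`, and so on) are hypothetical names chosen for this example.

```matlab
% All four combinations of two binary inputs, one per column
A = [0 0 1 1];
B = [0 1 0 1];

g_and  = double(A & B);        % 1 only when both inputs are 1
g_or   = double(A | B);        % 1 when at least one input is 1
g_not  = double(~A);           % negation of the first input
g_xor  = double(xor(A, B));    % 1 when the inputs differ
g_nor  = double(~(A | B));     % 1 only when neither input is 1
g_nand = double(~(A & B));     % 0 only when both inputs are 1

% Columns: A, B, AND, OR, XOR, NOR, NAND
disp([A; B; g_and; g_or; g_xor; g_nor; g_nand]');
```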
1. Perceptron
(1) Definition
The perceptron consists of two layers of neurons: the input layer receives the external signals and passes them to the output layer, which is an M-P neuron.
(2) How it works
We know that the output of an M-P neuron is y = f(w1·x1 + w2·x2 + ... + wn·xn − θ).
Assume f is the unit step function. By choosing the weights and the threshold appropriately, the logical AND, OR and NOT operations can be realized. This naturally raises the question: how are the weights and the threshold determined?
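Before turning to how the weights are found, here is a quick check that suitable values do exist. The weights and thresholds below are one hand-picked choice among many (an assumption for illustration, not the only solution), and `stepf` and `neuron` are hypothetical helper names.

```matlab
stepf  = @(z) double(z >= 0);                    % unit step activation
neuron = @(X, w, theta) stepf(X * w - theta);    % y = f(sum_i w_i*x_i - theta)

% Each row is one input pair (x1, x2)
X = [0 0; 0 1; 1 0; 1 1];

y_and = neuron(X, [1; 1], 2);         % fires only when x1 + x2 reaches 2
y_or  = neuron(X, [1; 1], 0.5);       % fires when at least one input is 1
y_not = neuron(X, [-0.6; 0], -0.5);   % negates x1 and ignores x2

disp([X, y_and, y_or, y_not]);        % columns: x1, x2, AND, OR, NOT x1
```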
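The answer is "learning". In fact, the threshold θ can be treated as the connection weight w(n+1) of a "dummy node" whose input is fixed at −1, so learning the weights and the threshold reduces to learning weights alone. The perceptron follows a "correct the mistake as soon as you know it" learning rule. For a training example (x, y) on which the current output is ŷ, each weight is adjusted as:
w_i ← w_i + Δw_i,  Δw_i = η (y − ŷ) x_i
where η (0 < η < 1) is called the learning rate. If the prediction is correct the weights are left unchanged; otherwise the perceptron adjusts them according to the size of the error.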
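A minimal sketch of this learning rule, assuming the AND gate as the training task, zero initial weights and η = 0.5 (all three are my choices for illustration). The threshold is handled with the dummy input −1, so `w(end)` plays the role of θ.

```matlab
stepf = @(z) double(z >= 0);          % unit step activation

X = [0 0; 0 1; 1 0; 1 1];             % training inputs
y = [0; 0; 0; 1];                     % AND targets
Xa = [X, -ones(size(X, 1), 1)];       % append the dummy input -1

w = zeros(3, 1);                      % [w1; w2; theta], assumed zero start
eta = 0.5;                            % learning rate, 0 < eta < 1

for epoch = 1:100
    n_errors = 0;
    for k = 1:size(Xa, 1)
        yhat = stepf(Xa(k, :) * w);
        if yhat ~= y(k)
            % w_i <- w_i + eta * (y - yhat) * x_i
            w = w + eta * (y(k) - yhat) * Xa(k, :)';
            n_errors = n_errors + 1;
        end
    end
    if n_errors == 0
        break;                        % a full pass without mistakes: done
    end
end

pred = stepf(Xa * w);
disp([X, pred]);
```

Because AND is linearly separable, the loop stops after a finite number of passes. On XOR the same loop would run through all 100 epochs without ever reaching zero errors.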
2. Multilayer feedforward neural network
In fact, the perceptron has only one layer of functional neurons: only the output layer applies an activation function. Its learning ability is therefore limited, and it cannot solve problems such as XOR.
First, a word on linear separability and nonlinear separability:
Without giving a strict definition here, two classes of patterns are linearly separable if they can be separated by a single linear hyperplane; otherwise they are nonlinearly separable.
For the XOR problem, no single linear hyperplane separates the two classes.
Two linear hyperplanes are needed to divide the two classes. So our original perceptron with two layers of neurons has to gain more layers and develop into a multilayer neural network. The layers added between the input layer and the output layer are called hidden layers, which gives single-hidden-layer feedforward networks, double-hidden-layer feedforward networks, and so on. If the network topology contains no ring or loop, it is called a multilayer feedforward neural network.
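As a sketch of why one hidden layer is enough for XOR: the network below uses two hidden step neurons (one per separating hyperplane) and one output neuron. The weights are hand-picked for illustration, not learned, and thresholds are again folded in through a dummy input of −1. The names `Va`, `wa`, `H` are hypothetical.

```matlab
stepf = @(z) double(z >= 0);          % unit step activation

X  = [0 0; 0 1; 1 0; 1 1];
Xa = [X, -ones(4, 1)];                % inputs plus dummy input -1

% Hidden layer: one column per hidden neuron, last row holds the thresholds
%   h1 = f( x1 - x2 - 0.5)   fires only for (1, 0)
%   h2 = f(-x1 + x2 - 0.5)   fires only for (0, 1)
Va = [ 1  -1;
      -1   1;
       0.5 0.5];
H  = stepf(Xa * Va);

% Output layer: y = f(h1 + h2 - 0.5), an OR of the two hidden neurons
Ha = [H, -ones(4, 1)];
wa = [1; 1; 0.5];
y_xor = stepf(Ha * wa);

disp([X, H, y_xor]);                  % columns: x1, x2, h1, h2, XOR
```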
Three. BP algorithm
Training multilayer neural networks requires a more powerful learning algorithm; the perceptron's "correct the mistake as soon as you know it" learning rule is too simple for that. Let's take a look at the most successful neural network learning algorithm so far: the error backpropagation (BP) algorithm.
The derivation can be approached in different ways and from different starting points. I started from the hidden-to-output weights and the derivation took me about an hour. It is quite involved, and it is easy to get the symbols and subscripts wrong.
Recommended reading: [https://blog.csdn.net/u010858605/article/details/69857957]
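To make the symbols concrete, here is a sketch of a single BP update for one training example. It assumes a single hidden layer, sigmoid activations everywhere and the squared error E = ½(ŷ − y)²; the network sizes, numbers and names (`V`, `gam`, `W`, `theta`, `g`, `e`) are illustrative assumptions, not taken from the linked derivation.

```matlab
f = @(z) 1 ./ (1 + exp(-z));           % sigmoid activation

x = [0.5; -1.0];                       % one training input (2 features)
y = 1;                                 % its target output
V = [0.1 -0.2; 0.4 0.3; -0.5 0.2];     % input-to-hidden weights (3 hidden units)
gam = [0.1; -0.1; 0.05];               % hidden thresholds
W = [0.3 -0.4 0.2];                    % hidden-to-output weights
theta = 0.1;                           % output threshold
eta = 0.1;                             % learning rate

% Forward pass
b = f(V * x - gam);                    % hidden outputs
yhat = f(W * b - theta);               % network output
E = 0.5 * (yhat - y)^2;

% Backward pass: error terms (negative gradients w.r.t. each unit's net input)
g = yhat * (1 - yhat) * (y - yhat);    % output unit
e = b .* (1 - b) .* (W' * g);          % hidden units, error propagated back through W

% Gradient-descent updates
W_new = W + eta * g * b';
theta_new = theta - eta * g;
V_new = V + eta * e * x';
gam_new = gam - eta * e;

% Error on the same example after the update
E_new = 0.5 * (f(W_new * f(V_new * x - gam_new) - theta_new) - y)^2;
fprintf('E before: %.6f, E after: %.6f\n', E, E_new);
```

The hidden-layer term `e` reuses the output-layer term `g`, which is where the name "backpropagation" comes from.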
About accumulated BP versus standard BP:
First of all, we should be clear that the goal of the BP algorithm is to minimize the cumulative error on the training set. The standard BP algorithm, however, updates the weights for a single training sample at a time, so updates are very frequent, more iterations are usually needed, and updates for different samples may cancel each other out. The accumulated BP algorithm minimizes the cumulative error directly and updates the weights only once per pass over the whole training set, so it updates far less often. In some cases, though, once the cumulative error has dropped to a certain level, further reduction becomes very slow, and the standard BP algorithm may then reach a better solution.
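The difference in update schedule can be sketched on a deliberately tiny model. To keep the example short I assume a single sigmoid unit with no hidden layer (so this is BP in its simplest, degenerate form), the OR gate as data, zero initial weights and η = 0.5. All of these are illustrative assumptions.

```matlab
f = @(z) 1 ./ (1 + exp(-z));               % sigmoid activation

X  = [0 0; 0 1; 1 0; 1 1];
y  = [0; 1; 1; 1];                         % OR targets
Xa = [X, -ones(4, 1)];                     % dummy input -1 for the threshold
m  = size(Xa, 1);
eta = 0.5;

% Cumulative error over the training set: E = (1/m) * sum_k E_k
cum_err = @(w) mean(0.5 * (f(Xa * w) - y).^2);

w0 = zeros(3, 1);
E0 = cum_err(w0);

% Standard BP: update after every single sample (m updates per pass)
w_std = w0;
n_std = 0;
for k = 1:m
    yhat = f(Xa(k, :) * w_std);
    g = yhat * (1 - yhat) * (y(k) - yhat);
    w_std = w_std + eta * g * Xa(k, :)';
    n_std = n_std + 1;
end

% Accumulated BP: read the whole training set, then update once
Yhat = f(Xa * w0);
G = Yhat .* (1 - Yhat) .* (y - Yhat);
w_acc = w0 + eta * (Xa' * G) / m;
n_acc = 1;

fprintf('updates per pass: standard %d, accumulated %d\n', n_std, n_acc);
fprintf('cumulative error: start %.4f, standard %.4f, accumulated %.4f\n', ...
        E0, cum_err(w_std), cum_err(w_acc));
```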
Four. Solutions to some common problems
1. Overfitting
Because neural networks are so powerful at representing functions, they often run into overfitting: the model fits the training set very closely, but the error on the test set keeps rising. So what can be done?
(1) Early stopping
The data set is split into a training set and a validation set. If the training error decreases but the validation error increases, training is stopped, and the connection weights and thresholds that gave the minimum validation error are returned.
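The stopping rule can be sketched as a simple loop. The two error curves below are made-up numbers purely for illustration (they are not measurements), and in a real run the point marked in the comment is where the weights would be saved.

```matlab
% Hypothetical per-epoch errors, invented for this illustration only
train_err = [0.95 0.70 0.50 0.40 0.33 0.28 0.25];
val_err   = [0.90 0.70 0.55 0.50 0.52 0.56 0.61];

best_val   = inf;
best_epoch = 0;
stop_epoch = 0;

for t = 1:numel(val_err)
    if val_err(t) < best_val
        best_val   = val_err(t);
        best_epoch = t;               % here one would snapshot weights and thresholds
    end
    % Stop when training error still falls but validation error rises
    if t > 1 && train_err(t) < train_err(t-1) && val_err(t) > val_err(t-1)
        stop_epoch = t;
        break;
    end
end

fprintf('stopped at epoch %d, returning weights from epoch %d\n', ...
        stop_epoch, best_epoch);
```

In practice the validation error is noisy, so a common variation is to wait for several consecutive increases before stopping; the rule above is the literal version of the description.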
(2) Regularization
There are different regularization methods, but the basic idea is the same: add a term describing the complexity of the network to the error objective function, so that the objective becomes a weighted sum of the empirical error and the network complexity.
Put simply, regularization pushes the parameters toward small values (and, with some penalties, toward a sparse parameter matrix), which weakens or ignores the influence of certain features and therefore alleviates overfitting.
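As a small worked example of such an objective: the sketch below assumes the sum of squared weights as the complexity term (one common choice, not the only one) and a trade-off coefficient `lambda` between 0 and 1. The error values and weights are made-up numbers for illustration.

```matlab
Ek = [0.2 0.1 0.3];        % hypothetical per-sample errors
w  = [1 -2 0.5];           % hypothetical connection weights and thresholds
lambda = 0.9;              % assumed trade-off coefficient, 0 < lambda < 1

E_emp      = mean(Ek);     % empirical error
complexity = sum(w.^2);    % network complexity measured by squared weights

% Regularized objective: weighted sum of the two parts
E_reg = lambda * E_emp + (1 - lambda) * complexity;

% Halving every weight cuts the complexity term to a quarter
E_reg_small = lambda * E_emp + (1 - lambda) * sum((w / 2).^2);

fprintf('E_reg = %.4f, with halved weights = %.4f\n', E_reg, E_reg_small);
```

With the empirical error held fixed, smaller weights give a smaller objective, which is why training under this objective prefers smoother, less complex networks.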
2. Escaping local minima
We want to find a set of parameters that makes the error objective function reach its global minimum; this is a parameter optimization process. We know that the global minimum is always a local minimum, but a local minimum is not necessarily the global minimum. Sometimes the search gets stuck in a local minimum, and this problem needs to be addressed.
(1) Simulated annealing
(2) Stochastic gradient descent
(3) Genetic algorithms (GA)
……
In addition, it should be pointed out that these techniques are heuristic and lack theoretical guarantees.
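To see how a search gets stuck in the first place, here is a sketch of plain gradient descent on a one-dimensional error function with two minima. The function E(w) = w⁴ − 3w² + w, the step size and the two starting points are my own choices for illustration; they stand in for a real network's error surface.

```matlab
E  = @(w) w.^4 - 3*w.^2 + w;       % toy error function with two minima
dE = @(w) 4*w.^3 - 6*w + 1;        % its derivative

eta = 0.01;                        % step size
starts = [2, -2];                  % two different initial points
w_end = zeros(size(starts));

for s = 1:numel(starts)
    w = starts(s);
    for t = 1:2000
        w = w - eta * dE(w);       % plain gradient descent step
    end
    w_end(s) = w;
end

for s = 1:numel(starts)
    fprintf('start %5.1f -> w = %.4f, E = %.4f\n', starts(s), w_end(s), E(w_end(s)));
end
```

Both runs end at a point where the gradient is zero, but the run started on the right settles in the shallower minimum. The heuristics listed above all try, in different ways, to give the search a chance of leaving such a basin.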
Five. Other common neural networks
1. RBF network
2. ART network
3. SOM network
4. Elman network (recurrent neural network)
5. Cascade-correlation network
Neural networks, machine learning, deep learning …… the field is developing really fast. This post will be updated continuously.