16. The Effects of Bias, Variance, Regularization, and Learning Curves on a Model
2022-07-29 00:28:00 【WuJiaYFN】
Main topics covered:
- Diagnosing bias and variance
- Regularization and bias/variance
- Learning curves
- Summary of ways to improve an algorithm
- A deeper look at bias and variance
1. Diagnosing Bias and Variance
1.1 Two ways a model can perform poorly
- Bias too large (high bias) leads to underfitting.
- Variance too large (high variance) leads to overfitting.

1.2 How to diagnose high bias and high variance
The usual approach is to plot the cost-function error of the training set and of the cross-validation set against the degree of the polynomial and inspect the curves.
With the polynomial degree d on the horizontal axis, compute J(θ) separately on the training set and on the cross-validation set; you get curves like the following:

1.3 How to judge, and conclusions
From the two curves in the figure we can tell whether the model suffers from high bias (underfitting) or high variance (overfitting).
Reading the plot:
- For the training set: when the polynomial degree d is small, the model fits poorly and the error is large; as d grows, the fit improves and the error decreases.
- For the cross-validation set: when d is small, the model fits poorly and the error is large; as d grows, the error first decreases and then increases. The turning point is where the model starts to overfit the training data.
Conclusions (a code sketch follows this list):
- On the left of the figure, when d is small, both the training-set error and the cross-validation-set error are large, which indicates underfitting (that is, when the training error and the cross-validation error are both high and close to each other, the model is underfitting).
- On the right of the figure, when d is large, the training error is very small while the cross-validation error is much larger than the training error, which indicates overfitting (that is, when the cross-validation error is much greater than the training error, the model is overfitting).
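As an illustration (not part of the original article), the following Python sketch computes J(θ) on a training set and a cross-validation set for polynomial models of increasing degree d, using scikit-learn; the synthetic data and the choice of squared error are assumptions made purely for this example.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Synthetic 1-D regression data, assumed only for this illustration.
rng = np.random.RandomState(0)
X = np.sort(rng.uniform(-3, 3, size=(80, 1)), axis=0)
y = np.sin(X).ravel() + 0.3 * rng.randn(80)

X_train, X_cv, y_train, y_cv = train_test_split(X, y, test_size=0.4, random_state=0)

for d in range(1, 11):
    model = make_pipeline(PolynomialFeatures(degree=d), LinearRegression())
    model.fit(X_train, y_train)
    # J_train and J_cv: squared-error cost on the training and cross-validation sets.
    j_train = mean_squared_error(y_train, model.predict(X_train))
    j_cv = mean_squared_error(y_cv, model.predict(X_cv))
    print(f"d={d:2d}  J_train={j_train:.3f}  J_cv={j_cv:.3f}")
```

Plotting J_train and J_cv against d reproduces the two curves described above: J_train keeps falling as d grows, while J_cv falls and then rises again once the model starts overfitting.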

2. Regularization and Bias/Variance
2.1 How the choice of regularization affects the model
- When training a model, we usually apply some form of regularization to prevent overfitting.
- However, if the degree of regularization (the chosen value of λ) is too large or too small, the model will underfit or overfit.
- We usually try a series of candidate λ values between 0 and 10, each roughly double the previous one (e.g. 0, 0.01, 0.02, 0.04, ..., 10.24).
2.2 Example: a regularized linear regression model
When λ is too large, the parameters θ are penalized so heavily that they shrink toward 0; the hypothesis is left with little more than θ0 and becomes a flat line, which gives high bias and underfitting.
When λ is too small, the regularization term has almost no effect, which gives high variance and overfitting.

2.3 How to choose a suitable λ
- Train 12 models on the training set, one for each candidate degree of regularization (see the sketch after this list).
- Evaluate each of the 12 models on the cross-validation set and compute its cross-validation error.
- Select the model with the smallest cross-validation error.
- Evaluate the model selected in step 3 on the test set.
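A minimal sketch of this procedure, assuming ridge-regularized linear regression from scikit-learn, the doubling sequence of λ values mentioned above, and pre-existing training / cross-validation / test splits (X_train, y_train, X_cv, y_cv, X_test, y_test); none of these names come from the original article.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

# 12 candidate regularization strengths: 0, 0.01, 0.02, 0.04, ..., 10.24
lambdas = [0.0] + [0.01 * 2 ** i for i in range(11)]

models, train_errors, cv_errors = [], [], []
for lam in lambdas:
    model = Ridge(alpha=lam).fit(X_train, y_train)                    # step 1: one model per λ
    train_errors.append(mean_squared_error(y_train, model.predict(X_train)))
    cv_errors.append(mean_squared_error(y_cv, model.predict(X_cv)))   # step 2: CV error per model
    models.append(model)

best = int(np.argmin(cv_errors))                                      # step 3: smallest CV error
best_model = models[best]
test_error = mean_squared_error(y_test, best_model.predict(X_test))   # step 4: final check on the test set
print(f"best lambda = {lambdas[best]}, test error = {test_error:.3f}")
```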

2.4 How λ affects the cost function (model)
Plot the training-set error and the cross-validation-set error against the value of λ on the same figure; the result looks like the following:

Conclusions (a plotting sketch follows this list):
- On the left of the plot, when λ is very small, the training error is small while the cross-validation error is much larger than the training error, which indicates overfitting.
- On the right of the plot, as λ grows large, the training error keeps increasing while the cross-validation error first decreases and then increases; when both errors are large, the model is underfitting.
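To reproduce such a figure, a short matplotlib sketch (assuming the `lambdas`, `train_errors`, and `cv_errors` lists computed in the previous snippet) plots both error curves against λ:

```python
import matplotlib.pyplot as plt

# train_errors and cv_errors were computed for each value in `lambdas`
# in the previous sketch (J_train(λ) and J_cv(λ)).
plt.plot(lambdas, train_errors, marker="o", label="J_train")
plt.plot(lambdas, cv_errors, marker="o", label="J_cv")
plt.xlabel("regularization parameter λ")
plt.ylabel("error")
plt.legend()
plt.show()
```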
3. Learning Curves
3.1 What a learning curve is
- The learning curve is a useful tool: with it we can judge whether a learning algorithm suffers from a bias problem or a variance problem.
- The learning curve is a good sanity check for a learning algorithm.
- A learning curve plots the training-set error and the cross-validation-set error as functions of the number of training examples (a code sketch follows below).
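As a sketch of how such a curve can be produced in practice (my own illustration using scikit-learn's `learning_curve` helper; `X` and `y` are assumed to hold the full data set):

```python
import numpy as np
from sklearn.model_selection import learning_curve
from sklearn.linear_model import LinearRegression

# Repeatedly train on growing subsets of the data and cross-validate,
# returning the errors as a function of the number of training examples m.
train_sizes, train_scores, cv_scores = learning_curve(
    LinearRegression(), X, y,
    train_sizes=np.linspace(0.1, 1.0, 10),
    cv=5, scoring="neg_mean_squared_error",
)

train_error = -train_scores.mean(axis=1)   # J_train(m)
cv_error = -cv_scores.mean(axis=1)         # J_cv(m)
for m, jt, jc in zip(train_sizes, train_error, cv_error):
    print(f"m={m:3d}  J_train={jt:.3f}  J_cv={jc:.3f}")
```

Plotting `train_error` and `cv_error` against `train_sizes` gives the learning curves discussed in 3.2 and 3.3.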

3.2 Using the learning curve to diagnose high bias / underfitting
Suppose, as an example, we fit the data with a straight line. No matter how many training examples we add, the fitted line barely changes, and both errors level off at a high value.

In other words, in the high-bias / underfitting case, adding more data to the training set does not improve the model.
3.3 Using the learning curve to diagnose high variance / overfitting
Suppose we use a very high-degree polynomial model with very little regularization. When the cross-validation error is much larger than the training error, adding more data to the training set can improve the model.

In other words, in the high-variance / overfitting case, adding more data to the training set is likely to improve the algorithm.
4. Summary of Ways to Improve an Algorithm
4.1 Six ways to debug an algorithm
- Get more training examples: addresses high variance.
- Try a smaller set of features: addresses high variance.
- Try getting additional features: addresses high bias.
- Try adding polynomial features: addresses high bias.
- Try decreasing the regularization parameter λ: addresses high bias.
- Try increasing the regularization parameter λ: addresses high variance.
4.2 How the size of a neural network affects the model
- A smaller neural network has fewer parameters, so it is prone to high bias and underfitting, but it is computationally cheap.
- A larger neural network has more parameters, so it is prone to high variance and overfitting; it is computationally more expensive, but regularization can be used to make it fit the data better.
- In general, a larger neural network with regularization performs better than a smaller one.
4.3 How the number of hidden layers affects the model
- To choose the number of hidden layers, start with a single layer and gradually increase the count. For a better choice, split the data into a training set, a cross-validation set, and a test set, train neural networks with different numbers of hidden layers, and select the one with the smallest cross-validation cost (a sketch follows below).
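A minimal sketch of this selection loop, assuming scikit-learn's MLPRegressor and pre-existing training / cross-validation splits (X_train, y_train, X_cv, y_cv); the candidate layer sizes are illustrative only.

```python
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

# Candidate architectures: 1, 2, and 3 hidden layers of 25 units each (assumed sizes).
candidates = [(25,), (25, 25), (25, 25, 25)]

best_net, best_cv = None, float("inf")
for layers in candidates:
    net = MLPRegressor(hidden_layer_sizes=layers, max_iter=2000, random_state=0)
    net.fit(X_train, y_train)
    cv_err = mean_squared_error(y_cv, net.predict(X_cv))
    print(f"hidden layers {layers}: J_cv = {cv_err:.3f}")
    if cv_err < best_cv:
        best_net, best_cv = net, cv_err

# best_net is then evaluated once on the held-out test set.
```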
5. A Deeper Look at Bias and Variance
5.1 Definitions of bias and variance
- Bias: the gap between the expected value of the predictions and the true value. The larger the bias, the further the predictions deviate from the true data.
- Variance: the spread of the predictions, i.e. how far they scatter around their expected value. The larger the variance, the more dispersed the predictions.
5.2 Error due to bias and error due to variance
- Error due to bias: the difference between the model's expected prediction and the true value being predicted. Bias measures how far the model's predictions are from the truth.
- Error due to variance: the variability of the model's prediction for a given data point. If you repeated the entire model-building process many times, the variance measures how much the prediction for that point would change across the resulting models (a simulation sketch follows below).
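To make this concrete, a small numpy simulation (entirely my own illustration, not from the article) repeatedly refits a model on fresh training samples and estimates the squared bias and the variance of its prediction at a single query point x0:

```python
import numpy as np

rng = np.random.RandomState(42)
true_f = np.sin                 # the true function we are trying to learn
x0, n, trials = 1.0, 30, 500    # query point, training-set size, number of repeats

preds = []
for _ in range(trials):
    # Draw a fresh training set each time and fit a degree-1 polynomial (a line).
    X = rng.uniform(-3, 3, n)
    y = true_f(X) + 0.3 * rng.randn(n)
    coeffs = np.polyfit(X, y, deg=1)
    preds.append(np.polyval(coeffs, x0))

preds = np.array(preds)
bias_sq = (preds.mean() - true_f(x0)) ** 2   # squared bias: (E[ŷ(x0)] - f(x0))²
variance = preds.var()                       # variance: spread of ŷ(x0) across refits
print(f"bias² = {bias_sq:.4f}, variance = {variance:.4f}")
```

A more flexible model (a higher polynomial degree) would lower the bias term but raise the variance term, which is exactly the trade-off described above.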
5.3 A concrete analysis

- Top left: low bias, low variance. The predictions are accurate and the model is robust (stable); the predictions are tightly clustered.
- Top right: low bias, high variance. The predictions are accurate on average, but the model is unstable and the predictions are scattered.
- Bottom left: high bias, low variance. The predictions are inaccurate, but the model is stable and the predictions are tightly clustered.
- Bottom right: high bias, high variance. The predictions are inaccurate, the model is unstable, and the predictions are scattered.
If you found this article helpful, a like would be a great encouragement; feel free to bookmark it for later study.
Follow me and let's learn and make progress together!