16. The Effects of Bias, Variance, Regularization, and Learning Curves on a Model
2022-07-29 00:28:00 【WuJiaYFN】
Main topics covered:
- Diagnosing bias and variance
- Regularization and bias/variance
- Learning curves
- Summary of ways to improve an algorithm
- A deeper look at bias and variance
1. Diagnosing Bias and Variance
1.1 Two ways a model can perform poorly
- Bias too large (high bias) leads to underfitting.
- Variance too large (high variance) leads to overfitting.

1.2 How to diagnose high bias and high variance
The usual approach is to plot the cost-function error of the training set and of the cross-validation set against the degree of the polynomial and inspect the curves.
With the polynomial degree d on the horizontal axis, compute J(θ) separately on the training set and on the cross-validation set; you get curves like the following:

1.3 How to judge, and conclusions
From the two curves in the figure we can tell whether the model suffers from high bias (underfitting) or high variance (overfitting).
Reading the plot:
- For the training set: when the polynomial degree d is small, the model fits poorly and the error is large; as d grows, the fit improves and the error decreases.
- For the cross-validation set: when d is small, the model fits poorly and the error is large; as d grows, the error first decreases and then increases. The turning point is where the model starts to overfit the training data.
Conclusions (a code sketch follows this list):
- On the left of the figure, when d is small, both the training-set error and the cross-validation-set error are large, which indicates underfitting (that is, when the training error and the cross-validation error are both high and close to each other, the model is underfitting).
- On the right of the figure, when d is large, the training error is very small while the cross-validation error is much larger than the training error, which indicates overfitting (that is, when the cross-validation error is much greater than the training error, the model is overfitting).
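As an illustration (not part of the original article), the following Python sketch computes J(θ) on a training set and a cross-validation set for polynomial models of increasing degree d, using scikit-learn; the synthetic data and the choice of squared error are assumptions made purely for this example.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Synthetic 1-D regression data, assumed only for this illustration.
rng = np.random.RandomState(0)
X = np.sort(rng.uniform(-3, 3, size=(80, 1)), axis=0)
y = np.sin(X).ravel() + 0.3 * rng.randn(80)

X_train, X_cv, y_train, y_cv = train_test_split(X, y, test_size=0.4, random_state=0)

for d in range(1, 11):
    model = make_pipeline(PolynomialFeatures(degree=d), LinearRegression())
    model.fit(X_train, y_train)
    # J_train and J_cv: squared-error cost on the training and cross-validation sets.
    j_train = mean_squared_error(y_train, model.predict(X_train))
    j_cv = mean_squared_error(y_cv, model.predict(X_cv))
    print(f"d={d:2d}  J_train={j_train:.3f}  J_cv={j_cv:.3f}")
```

Plotting J_train and J_cv against d reproduces the two curves described above: J_train keeps falling as d grows, while J_cv falls and then rises again once the model starts overfitting.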

2. Regularization and Bias/Variance
2.1 How the choice of regularization affects the model
- When training a model, we usually apply some form of regularization to prevent overfitting.
- However, if the degree of regularization (the chosen value of λ) is too large or too small, the model will underfit or overfit.
- We usually try a series of candidate λ values between 0 and 10, each roughly double the previous one (e.g. 0, 0.01, 0.02, 0.04, ..., 10.24).
2.2 Example: a regularized linear regression model
When λ is too large, the parameters θ are penalized so heavily that they shrink toward 0; the hypothesis is left with little more than θ0 and becomes a flat line, which gives high bias and underfitting.
When λ is too small, the regularization term has almost no effect, which gives high variance and overfitting.

2.3 How to choose a suitable λ
- Train 12 models on the training set, one for each candidate degree of regularization (see the sketch after this list).
- Evaluate each of the 12 models on the cross-validation set and compute its cross-validation error.
- Select the model with the smallest cross-validation error.
- Evaluate the model selected in step 3 on the test set.
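A minimal sketch of this procedure, assuming ridge-regularized linear regression from scikit-learn, the doubling sequence of λ values mentioned above, and pre-existing training / cross-validation / test splits (X_train, y_train, X_cv, y_cv, X_test, y_test); none of these names come from the original article.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

# 12 candidate regularization strengths: 0, 0.01, 0.02, 0.04, ..., 10.24
lambdas = [0.0] + [0.01 * 2 ** i for i in range(11)]

models, train_errors, cv_errors = [], [], []
for lam in lambdas:
    model = Ridge(alpha=lam).fit(X_train, y_train)                    # step 1: one model per λ
    train_errors.append(mean_squared_error(y_train, model.predict(X_train)))
    cv_errors.append(mean_squared_error(y_cv, model.predict(X_cv)))   # step 2: CV error per model
    models.append(model)

best = int(np.argmin(cv_errors))                                      # step 3: smallest CV error
best_model = models[best]
test_error = mean_squared_error(y_test, best_model.predict(X_test))   # step 4: final check on the test set
print(f"best lambda = {lambdas[best]}, test error = {test_error:.3f}")
```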

2.4 How λ affects the cost function (model)
Plot the training-set error and the cross-validation-set error against the value of λ on the same figure; the result looks like the following:

Conclusions (a plotting sketch follows this list):
- On the left of the plot, when λ is very small, the training error is small while the cross-validation error is much larger than the training error, which indicates overfitting.
- On the right of the plot, as λ grows large, the training error keeps increasing while the cross-validation error first decreases and then increases; when both errors are large, the model is underfitting.
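To reproduce such a figure, a short matplotlib sketch (assuming the `lambdas`, `train_errors`, and `cv_errors` lists computed in the previous snippet) plots both error curves against λ:

```python
import matplotlib.pyplot as plt

# train_errors and cv_errors were computed for each value in `lambdas`
# in the previous sketch (J_train(λ) and J_cv(λ)).
plt.plot(lambdas, train_errors, marker="o", label="J_train")
plt.plot(lambdas, cv_errors, marker="o", label="J_cv")
plt.xlabel("regularization parameter λ")
plt.ylabel("error")
plt.legend()
plt.show()
```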
3. Learning Curves
3.1 What a learning curve is
- The learning curve is a useful tool: with it we can judge whether a learning algorithm suffers from a bias problem or a variance problem.
- The learning curve is a good sanity check for a learning algorithm.
- A learning curve plots the training-set error and the cross-validation-set error as functions of the number of training examples (a code sketch follows below).
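As a sketch of how such a curve can be produced in practice (my own illustration using scikit-learn's `learning_curve` helper; `X` and `y` are assumed to hold the full data set):

```python
import numpy as np
from sklearn.model_selection import learning_curve
from sklearn.linear_model import LinearRegression

# Repeatedly train on growing subsets of the data and cross-validate,
# returning the errors as a function of the number of training examples m.
train_sizes, train_scores, cv_scores = learning_curve(
    LinearRegression(), X, y,
    train_sizes=np.linspace(0.1, 1.0, 10),
    cv=5, scoring="neg_mean_squared_error",
)

train_error = -train_scores.mean(axis=1)   # J_train(m)
cv_error = -cv_scores.mean(axis=1)         # J_cv(m)
for m, jt, jc in zip(train_sizes, train_error, cv_error):
    print(f"m={m:3d}  J_train={jt:.3f}  J_cv={jc:.3f}")
```

Plotting `train_error` and `cv_error` against `train_sizes` gives the learning curves discussed in 3.2 and 3.3.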

3.2 Using the learning curve to diagnose high bias / underfitting
Suppose, as an example, we fit the data with a straight line. No matter how many training examples we add, the fitted line barely changes, and both errors level off at a high value.

In other words, in the high-bias / underfitting case, adding more data to the training set does not improve the model.
3.3 Using the learning curve to diagnose high variance / overfitting
Suppose we use a very high-degree polynomial model with very little regularization. When the cross-validation error is much larger than the training error, adding more data to the training set can improve the model.

In other words, in the high-variance / overfitting case, adding more data to the training set is likely to improve the algorithm.
4. Summary of Ways to Improve an Algorithm
4.1 Six ways to debug an algorithm
- Get more training examples: addresses high variance.
- Try a smaller set of features: addresses high variance.
- Try getting additional features: addresses high bias.
- Try adding polynomial features: addresses high bias.
- Try decreasing the regularization parameter λ: addresses high bias.
- Try increasing the regularization parameter λ: addresses high variance.
4.2 How the size of a neural network affects the model
- A smaller neural network has fewer parameters, so it is prone to high bias and underfitting, but it is computationally cheap.
- A larger neural network has more parameters, so it is prone to high variance and overfitting; it is computationally more expensive, but regularization can be used to make it fit the data better.
- In general, a larger neural network with regularization performs better than a smaller one.
4.3 How the number of hidden layers affects the model
- To choose the number of hidden layers, start with a single layer and gradually increase the count. For a better choice, split the data into a training set, a cross-validation set, and a test set, train neural networks with different numbers of hidden layers, and select the one with the smallest cross-validation cost (a sketch follows below).
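A minimal sketch of this selection loop, assuming scikit-learn's MLPRegressor and pre-existing training / cross-validation splits (X_train, y_train, X_cv, y_cv); the candidate layer sizes are illustrative only.

```python
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

# Candidate architectures: 1, 2, and 3 hidden layers of 25 units each (assumed sizes).
candidates = [(25,), (25, 25), (25, 25, 25)]

best_net, best_cv = None, float("inf")
for layers in candidates:
    net = MLPRegressor(hidden_layer_sizes=layers, max_iter=2000, random_state=0)
    net.fit(X_train, y_train)
    cv_err = mean_squared_error(y_cv, net.predict(X_cv))
    print(f"hidden layers {layers}: J_cv = {cv_err:.3f}")
    if cv_err < best_cv:
        best_net, best_cv = net, cv_err

# best_net is then evaluated once on the held-out test set.
```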
5. A Deeper Look at Bias and Variance
5.1 Definitions of bias and variance
- Bias: the gap between the expected value of the predictions and the true value. The larger the bias, the further the predictions deviate from the true data.
- Variance: the spread of the predictions, i.e. how far they scatter around their expected value. The larger the variance, the more dispersed the predictions.
5.2 Error due to bias and error due to variance
- Error due to bias: the difference between the model's expected prediction and the true value being predicted. Bias measures how far the model's predictions are from the truth.
- Error due to variance: the variability of the model's prediction for a given data point. If you repeated the entire model-building process many times, the variance measures how much the prediction for that point would change across the resulting models (a simulation sketch follows below).
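To make this concrete, a small numpy simulation (entirely my own illustration, not from the article) repeatedly refits a model on fresh training samples and estimates the squared bias and the variance of its prediction at a single query point x0:

```python
import numpy as np

rng = np.random.RandomState(42)
true_f = np.sin                 # the true function we are trying to learn
x0, n, trials = 1.0, 30, 500    # query point, training-set size, number of repeats

preds = []
for _ in range(trials):
    # Draw a fresh training set each time and fit a degree-1 polynomial (a line).
    X = rng.uniform(-3, 3, n)
    y = true_f(X) + 0.3 * rng.randn(n)
    coeffs = np.polyfit(X, y, deg=1)
    preds.append(np.polyval(coeffs, x0))

preds = np.array(preds)
bias_sq = (preds.mean() - true_f(x0)) ** 2   # squared bias: (E[ŷ(x0)] - f(x0))²
variance = preds.var()                       # variance: spread of ŷ(x0) across refits
print(f"bias² = {bias_sq:.4f}, variance = {variance:.4f}")
```

A more flexible model (a higher polynomial degree) would lower the bias term but raise the variance term, which is exactly the trade-off described above.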
5.3 A concrete analysis

- Top left: low bias, low variance. The predictions are accurate and the model is robust (stable); the predictions are tightly clustered.
- Top right: low bias, high variance. The predictions are accurate on average, but the model is unstable and the predictions are scattered.
- Bottom left: high bias, low variance. The predictions are inaccurate, but the model is stable and the predictions are tightly clustered.
- Bottom right: high bias, high variance. The predictions are inaccurate, the model is unstable, and the predictions are scattered.
If you found this article helpful, a like would be a great encouragement; feel free to bookmark it for later study.
Follow me and let's learn and make progress together!