
5. Overfitting, Dropout, Regularization

2022-07-08 01:02:00 booze-J

Overfitting

Overfitting leads to larger test error:
[Figure: training error vs. test error as model complexity increases]

As the model structure becomes more and more complex, the error on the training set keeps shrinking, while the error on the test set first decreases and then increases: overfitting makes the test error larger.
The better case is when the training-error and test-error curves stay close together.

Preventing overfitting

1. Increase the dataset

There is a popular saying in the field of data mining: "Sometimes having more data beats having a better model." Generally speaking, the more data participates in training, the better the trained model. If there is too little data and the neural network we build is too complex, overfitting becomes more likely.
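When collecting more real samples is impractical, data augmentation is a common way to effectively enlarge the training set. The original post shows no code for this, but here is a minimal Keras sketch, assuming image data such as MNIST:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Generate randomly shifted/rotated variants of the training images,
# effectively enlarging the dataset on every epoch.
datagen = ImageDataGenerator(rotation_range=10,      # rotate up to 10 degrees
                             width_shift_range=0.1,  # shift horizontally up to 10%
                             height_shift_range=0.1) # shift vertically up to 10%

# Assuming x_train has shape (num_samples, height, width, channels):
# model.fit(datagen.flow(x_train, y_train, batch_size=32), epochs=10)
```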

2. Early stopping

When training a model, we often set a relatively large number of epochs. Early stopping is a strategy that ends training ahead of time to prevent overfitting.

The usual practice is to record the best validation accuracy achieved so far; when 10 consecutive epochs fail to reach that best accuracy, we can say the accuracy is no longer improving and stop the iteration early (early stopping).
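As a concrete illustration, here is a minimal sketch using Keras's EarlyStopping callback; the toy data and the small model below are placeholders standing in for a real dataset, not code from the original post:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.callbacks import EarlyStopping

# Toy data standing in for a real dataset (10-class classification).
x_train = np.random.rand(1000, 20)
y_train = np.random.randint(0, 10, size=(1000,))

model = Sequential([Dense(64, activation='relu', input_shape=(20,)),
                    Dense(10, activation='softmax')])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Stop when validation accuracy has not improved for 10 consecutive
# epochs, and restore the weights from the best epoch seen so far.
early_stop = EarlyStopping(monitor='val_accuracy', patience=10,
                           restore_best_weights=True)

model.fit(x_train, y_train,
          validation_split=0.2,  # hold out 20% of the data for validation
          epochs=200,
          callbacks=[early_stop])
```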

3. Dropout

[Figure: a neural network with randomly dropped (dashed) neurons during training]
In every training pass, some neurons are randomly switched off. Switched off does not mean removed; the dashed neurons simply do not participate in that round of training. Note that dropout is generally applied only during training: when testing the model, all neurons are used and no dropout takes place.
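In Keras, dropout is added as a layer between fully connected layers. A minimal sketch (the 784-200-100-10 layer sizes are placeholders, not taken from the original post):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

# Dropout(0.5) randomly zeroes 50% of the previous layer's outputs on each
# training step; at test time the layer is inactive and all neurons are used.
model = Sequential([
    Dense(200, activation='tanh', input_shape=(784,)),
    Dropout(0.5),
    Dense(100, activation='tanh'),
    Dropout(0.5),
    Dense(10, activation='softmax'),
])
model.summary()
```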

4. Regularization

In the formulas below, $C_0$ denotes the original cost function, $n$ the number of samples, and $\lambda$ the coefficient of the regularization term, which weighs the regularization term against the $C_0$ term.
L1 regularization:

$$C = C_0 + \frac{\lambda}{n} \sum_w |w|$$
L1 regularization tends to drive many model parameters to exactly zero, making the parameters sparse.

L2 regularization:
$$C = C_0 + \frac{\lambda}{2n} \sum_w w^2$$
L2 regularization applies weight decay to the model, making the parameter values close to 0 (though rarely exactly 0).
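In Keras, both penalties can be attached to a layer's weights through the kernel_regularizer argument. A minimal sketch (the layer size and the coefficient 0.001 are placeholders of my choosing):

```python
from tensorflow.keras.layers import Dense
from tensorflow.keras.regularizers import l1, l2

# L1 penalty: adds 0.001 * sum(|w|) to the loss, encouraging sparse weights.
dense_l1 = Dense(100, activation='tanh', kernel_regularizer=l1(0.001))

# L2 penalty (weight decay): adds 0.001 * sum(w**2) to the loss,
# shrinking weights toward zero without zeroing them out.
dense_l2 = Dense(100, activation='tanh', kernel_regularizer=l2(0.001))
```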

[Figure: model fits under $\lambda$ = 0.001, 0.01, and 0.1]
When $\lambda = 0.001$, overfitting appears; when $\lambda = 0.01$, there is slight overfitting; when $\lambda = 0.1$, there is no overfitting.

Original article: https://yzsam.com/2022/189/202207072310361830.html

Copyright notice: This article was created by [booze-J]; please include a link to the original when reposting. Thanks.