How to prevent overfitting in cross validation
2022-07-07 01:21:00 【ZEERO~】
1. Definition of overfitting and underfitting
Overfitting means that the model performs well on the training set but poorly on the validation and test sets.
Underfitting means that the model performs poorly on the training, validation, and test sets alike.
2. Analysis of the causes of overfitting and underfitting
2.1 Number of samples
For a machine learning algorithm, assuming the model is suited to large datasets, more samples are generally better. When the number of samples is insufficient, underfitting can occur, and the model performs poorly on all three datasets.
2.2 Model complexity
Generally speaking, for a chosen model such as logistic regression or linear regression, the more features it uses, the higher its complexity. We can use a feature selection method, for example mRMR or the chi-square test, to rank the features by importance, then add the features one at a time in that order and compute the accuracy and loss on the training set and the test set at each step. We usually find that, as the number of features grows, training-set accuracy gradually approaches 100% while test-set accuracy gradually declines; likewise, training-set loss gradually falls toward 0 while test-set loss gradually rises. For example, when the training-set loss is 0 but the test-set loss is not, we know the model must have overfitted. In this way we can roughly judge whether the current model is overfitting.
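The procedure of adding features one at a time and watching both losses can be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the data are synthetic, the features are assumed to be already ranked so that the first 3 carry signal and the rest are noise, and ordinary least squares via NumPy stands in for whichever model is actually being tuned. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, n_features = 20, 200, 20

# Assumption: features are already sorted by importance, and only the
# first 3 carry signal; the remaining 17 are pure noise.
true_w = np.zeros(n_features)
true_w[:3] = [2.0, -1.0, 0.5]

X_train = rng.normal(size=(n_train, n_features))
X_test = rng.normal(size=(n_test, n_features))
y_train = X_train @ true_w + rng.normal(scale=0.5, size=n_train)
y_test = X_test @ true_w + rng.normal(scale=0.5, size=n_test)


def mse(X, y, w):
    """Mean squared error of the linear model w on (X, y)."""
    return float(np.mean((X @ w - y) ** 2))


train_loss, test_loss = [], []
for k in range(1, n_features + 1):
    # Fit ordinary least squares using only the first k ranked features.
    w, *_ = np.linalg.lstsq(X_train[:, :k], y_train, rcond=None)
    train_loss.append(mse(X_train[:, :k], y_train, w))
    test_loss.append(mse(X_test[:, :k], y_test, w))

for k in (1, 3, 10, 20):
    print(f"{k:2d} features: train MSE={train_loss[k - 1]:.4f}  "
          f"test MSE={test_loss[k - 1]:.4f}")
```

Because each model contains all the features of the previous one, the training loss can only stay the same or fall as features are added, and once there are as many features as training samples it reaches essentially 0. A nonzero test loss at that point is the gap described above.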
3. Why cross-validation can prevent overfitting
The first thing to note is that cross-validation does not itself reduce the complexity of the model or otherwise stop it from overfitting. Instead, cross-validation lets us evaluate, during training, whether the model is overfitting.
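As a rough sketch of this idea, the snippet below runs k-fold cross-validation by hand and reports training accuracy next to validation accuracy for each fold. It is not the author's code: the dataset is synthetic with 20% of labels deliberately flipped, and a 1-nearest-neighbour classifier is used only because it memorizes its training data, which makes the gap easy to see. The helper names (`make_data`, `predict_1nn`, `k_fold_scores`) are hypothetical.

```python
import random


def make_data(n, noise, rng):
    """Points in [0, 1); label is 1 if x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < noise:
            label = 1 - label
        data.append((x, label))
    return data


def predict_1nn(train, x):
    """1-nearest-neighbour: copy the label of the closest training point."""
    return min(train, key=lambda p: abs(p[0] - x))[1]


def accuracy(train, samples):
    """Fraction of `samples` classified correctly by a 1-NN model built on `train`."""
    return sum(predict_1nn(train, x) == y for x, y in samples) / len(samples)


def k_fold_scores(data, k=5):
    """Return one (train_accuracy, validation_accuracy) pair per fold."""
    folds = [data[i::k] for i in range(k)]  # data is already in random order
    scores = []
    for i in range(k):
        val = folds[i]
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        scores.append((accuracy(train, train), accuracy(train, val)))
    return scores


rng = random.Random(0)
data = make_data(200, noise=0.2, rng=rng)
scores = k_fold_scores(data, k=5)
for fold, (train_acc, val_acc) in enumerate(scores, start=1):
    print(f"fold {fold}: train={train_acc:.2f}  validation={val_acc:.2f}")
```

With k=5, each round trains on 160 of the 200 points and validates on the other 40. The model scores perfectly on the data it was trained on, so a clearly lower validation score across folds is the signal that it has memorized noise rather than learned the pattern.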
In 5-fold cross-validation, the data are randomly split into five folds; each round trains on 80% of the data and validates on the remaining 20%. In this case, if the model has overfitted,