How to prevent overfitting in cross validation
2022-07-07 01:21:00 【ZEERO~】
1. Definition of overfitting and underfitting
Overfitting means that the model performs well on the training set but poorly on the validation set and the test set.
Underfitting means that the model performs poorly everywhere: on the training set, the validation set, and the test set.
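The two definitions can be turned into a rough rule of thumb. The sketch below is only an illustration: the function name `diagnose_fit` and the thresholds `good_enough` and `max_gap` are assumptions made for this example, not values from the article.

```python
def diagnose_fit(train_acc, val_acc, good_enough=0.8, max_gap=0.1):
    """Label a model from its training and validation accuracy.

    `good_enough` and `max_gap` are illustrative thresholds (assumptions),
    not values taken from the article; tune them for the task at hand.
    """
    if train_acc < good_enough:
        # Poor even on the data the model was trained on.
        return "underfitting"
    if train_acc - val_acc > max_gap:
        # Good on the training set, clearly worse on held-out data.
        return "overfitting"
    return "reasonable fit"


print(diagnose_fit(0.99, 0.70))  # overfitting
print(diagnose_fit(0.55, 0.53))  # underfitting
print(diagnose_fit(0.90, 0.88))  # reasonable fit
```

The point of the sketch is that both labels come from comparing scores on different data sets. A single training score cannot distinguish a good model from an overfitted one.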
2. Causes of overfitting and underfitting
2.1 Number of samples
The number of samples matters for any machine learning algorithm. Provided the model is suited to large data sets, more samples are generally better. When there are too few samples, the model cannot learn the underlying pattern and underfitting can occur: the model performs poorly on all three data sets. (A flexible model trained on too few samples may instead memorize them and overfit.)
2.2 Model complexity
Generally speaking, for a given model such as logistic regression or linear regression, the more features we use, the higher the complexity of the model. We can use a feature selection algorithm, for example MRMR or the chi-square test, to rank the features by importance, then add the features one at a time in that order and compute the accuracy and the loss on the training set and the test set after each addition. We usually find that, as the number of features increases, the accuracy on the training set gradually approaches 100%, while the accuracy on the test set, after any initial improvement, gradually declines. Likewise, the training loss gradually decreases towards 0 while the test loss gradually increases. For example, when the training loss is 0 but the test loss is clearly above 0, the model has almost certainly overfitted. In this way we can roughly judge whether the current model has overfitted.
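The procedure above (add ranked features one at a time and watch both losses) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the author's experiment. The data are synthetic, only feature 0 carries signal, and the features are assumed to be ranked already, so no MRMR or chi-square step is shown. The model is ordinary least-squares linear regression with mean squared error as the loss. All function names are made up for this example.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_linear(rows, targets):
    """Least-squares weights from the normal equations (X^T X) w = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)


def mse(rows, targets, weights):
    """Mean squared error of the linear model on (rows, targets)."""
    total = 0.0
    for r, t in zip(rows, targets):
        pred = sum(w * v for w, v in zip(weights, r))
        total += (pred - t) ** 2
    return total / len(rows)


def make_data(n, n_features, rng):
    """Synthetic data: only feature 0 drives the target, the rest is noise."""
    rows = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n)]
    targets = [2.0 * r[0] + rng.gauss(0, 0.5) for r in rows]
    return rows, targets


rng = random.Random(0)
N_FEATURES = 10
train_x, train_y = make_data(10, N_FEATURES, rng)  # deliberately small
test_x, test_y = make_data(200, N_FEATURES, rng)

curve = []  # (number of features, training loss, test loss)
for k in range(1, N_FEATURES + 1):
    # Features are assumed to be ranked already, so take the first k.
    w = fit_linear([r[:k] for r in train_x], train_y)
    train_loss = mse([r[:k] for r in train_x], train_y, w)
    test_loss = mse([r[:k] for r in test_x], test_y, w)
    curve.append((k, train_loss, test_loss))
    print(f"{k:2d} features  train loss {train_loss:.4f}  test loss {test_loss:.4f}")
```

With as many features as training samples, the linear model can interpolate the training set, so the training loss drops to numerically 0. The test loss cannot fall below the noise level and typically ends up higher than it was with the single informative feature. That gap is the signal described in the paragraph above.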
3. Why cross validation can prevent overfitting
The first thing to note is that cross validation does not itself reduce the complexity of the model or stop the model from overfitting. Rather, cross validation lets us evaluate during training whether the model has overfitted.
As we know, 5-fold cross validation randomly takes 80% of the data for training and the remaining 20% for validation, rotating the held-out fold over five rounds. In this case, if the model has overfitted,
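The 80%/20% split described above can be sketched without any library. The example below is a minimal illustration under stated assumptions. The data are synthetic, with 20% of the labels flipped on purpose. The model is a 1-nearest-neighbour classifier, chosen here only because it memorizes its training set. All function names are made up for this example.

```python
import random


def k_fold_indices(n_samples, k, rng):
    """Shuffle the sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]


def predict_1nn(train_x, train_y, x):
    """1-nearest-neighbour: copy the label of the closest training point."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


def accuracy(train_x, train_y, xs, ys):
    """Fraction of (xs, ys) that the 1-NN model built on the training data gets right."""
    hits = sum(predict_1nn(train_x, train_y, x) == y for x, y in zip(xs, ys))
    return hits / len(xs)


def cross_validate(xs, ys, k=5, seed=0):
    """Return one (n_train, n_val, train_acc, val_acc) tuple per fold."""
    folds = k_fold_indices(len(xs), k, random.Random(seed))
    results = []
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in range(len(xs)) if i not in held]
        tx, ty = [xs[i] for i in train_idx], [ys[i] for i in train_idx]
        vx, vy = [xs[i] for i in held_out], [ys[i] for i in held_out]
        results.append((len(tx), len(vx),
                        accuracy(tx, ty, tx, ty), accuracy(tx, ty, vx, vy)))
    return results


# Toy data (an assumption for illustration): the true label is 1 when x > 0.5,
# but 20% of the labels are flipped, so memorizing the training set cannot generalize.
rng = random.Random(42)
xs = [rng.random() for _ in range(100)]
ys = [int(x > 0.5) ^ int(rng.random() < 0.2) for x in xs]

results = cross_validate(xs, ys, k=5)
for fold, (n_train, n_val, train_acc, val_acc) in enumerate(results, start=1):
    print(f"fold {fold}: train {n_train} samples, acc {train_acc:.2f} | "
          f"validation {n_val} samples, acc {val_acc:.2f}")

mean_train = sum(r[2] for r in results) / len(results)
mean_val = sum(r[3] for r in results) / len(results)
print(f"mean train acc {mean_train:.2f}, mean validation acc {mean_val:.2f}")
```

Because the 1-nearest-neighbour model reproduces every training label, its training accuracy is 1.0 in every fold. The held-out 20% shows a lower score, and that gap is how cross validation exposes overfitting. In practice a library routine would normally replace the hand-written loop. For example, scikit-learn's `cross_validate` reports both scores when called with `return_train_score=True`.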