Cross validation (CV) learning notes
2022-07-25 17:34:00 【Wsyoneself】
- Cross-validation can be used to evaluate the performance of a trained machine learning model and to tune its parameters.
- The common way to divide a data set is a single hold-out split: the sample data is divided directly into a training set and a validation set. Drawback: nothing is crossed over, so the validation set never contributes to training the model.
- Common cross-validation methods:
- k-fold CV:
- The sample data is divided into k groups (folds). Each fold serves once as the validation set while the remaining k-1 folds form the training set. This yields k trained models, and the mean of their k validation accuracies is used as the model's performance metric.
- Advantage: every sample is used for model training, so the evaluation result is more reliable than that of a single split.
- Leave-one-out CV: for a data set of n samples, one sample at a time is held out as the validation set and the remaining n-1 samples form the training set. This yields n trained models, and the mean of their n validation accuracies is the model's performance metric.
- Advantage: same as above.
- Drawback: n models must be trained, each on nearly the whole data set, so the computational cost is high.
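The two schemes above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the helper names `k_fold_indices` and `k_fold_score` are made up for this note, the folds are taken in order (so the data is assumed to be shuffled already), and `fit` / `score` are placeholder callables supplied by the caller.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k-fold CV over range(n_samples)."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    # Spread the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


def k_fold_score(fit, score, xs, ys, k):
    """Average validation score over k folds.

    fit(xs, ys) returns a model; score(model, xs, ys) returns a number.
    Both are hypothetical callables provided by the caller.
    """
    scores = []
    for train_idx, val_idx in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        scores.append(score(model, [xs[i] for i in val_idx], [ys[i] for i in val_idx]))
    return sum(scores) / len(scores)
```

Leave-one-out CV is the special case `k = n_samples`: `k_fold_indices(n, n)` yields n splits, each with a single validation sample.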
- k-fold CV for model selection:
- To further improve the model's predictions on unseen data, different parameter settings are tuned and compared. This process is called model selection: for a particular problem, the parameters are adjusted until the best hyperparameters are found.
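As a rough sketch of model selection by cross-validation, the snippet below picks a regularization strength for a one-dimensional ridge regression. The model, the candidate values, and the function names (`ridge_fit_1d`, `cv_error`, `select_lambda`) are assumptions chosen for illustration; the notes do not prescribe a particular model.

```python
def ridge_fit_1d(xs, ys, lam):
    """Closed-form 1-D ridge regression without intercept: w = sum(xy) / (sum(x^2) + lam)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def mse(w, xs, ys):
    """Mean squared error of the prediction w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def cv_error(xs, ys, lam, k):
    """Mean validation MSE over k folds (fold i holds out every k-th sample, offset i)."""
    pairs = list(zip(xs, ys))
    errors = []
    for fold in range(k):
        train = [p for i, p in enumerate(pairs) if i % k != fold]
        val = [p for i, p in enumerate(pairs) if i % k == fold]
        w = ridge_fit_1d([x for x, _ in train], [y for _, y in train], lam)
        errors.append(mse(w, [x for x, _ in val], [y for _, y in val]))
    return sum(errors) / k


def select_lambda(xs, ys, candidates, k=5):
    """Return the candidate hyperparameter with the lowest cross-validated error."""
    return min(candidates, key=lambda lam: cv_error(xs, ys, lam, k))
```

Each candidate is scored only on folds it was not trained on, so the comparison reflects performance on unseen data rather than training fit. The sketch assumes the data set has at least k samples.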
- Judging how well the model has trained from its bias and variance:
- Bias describes the gap between the predicted values and the true values.
- Variance describes how widely the predictions vary, i.e. their dispersion: the larger the variance, the more scattered the predictions.
- High bias means underfitting; high variance means overfitting. Bias reflects how much of the data the model ignores, while variance reflects how strongly the model depends on the particular data it was trained on.
- High variance: the model changes significantly when the training data set changes.
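One way to make these two quantities concrete is to refit a model on many freshly sampled training sets and look at its predictions at a fixed point. The setup below is entirely hypothetical: the target function x², the noise level, and the two toy models (a constant predictor and a 1-nearest-neighbor predictor) were picked only to contrast an underfitting model with an overfitting one.

```python
import random


def true_f(x):
    """Assumed ground-truth function for this illustration."""
    return x * x


def sample_training_set(rng, n=20, noise=0.3):
    """Draw n noisy samples of true_f on [-1, 1]."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


def predict_constant(xs, ys, x0):
    """Very rigid model: ignores x0 and predicts the mean label."""
    return sum(ys) / len(ys)


def predict_1nn(xs, ys, x0):
    """Very flexible model: copies the label of the nearest training point."""
    nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x0))
    return ys[nearest]


def bias_variance(predict, x0, n_repeats=2000, seed=0):
    """Estimate squared bias and variance of predict at x0 over resampled training sets."""
    rng = random.Random(seed)
    preds = [predict(*sample_training_set(rng), x0) for _ in range(n_repeats)]
    mean_pred = sum(preds) / n_repeats
    bias_sq = (mean_pred - true_f(x0)) ** 2
    variance = sum((p - mean_pred) ** 2 for p in preds) / n_repeats
    return bias_sq, variance
```

Calling `bias_variance(predict_constant, 0.9)` and `bias_variance(predict_1nn, 0.9)` should show the pattern described above: the constant model has the larger squared bias (it underfits), while the nearest-neighbor model has the larger variance (its prediction swings with each training set).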
- A validation set helps prevent overfitting.
- The model is evaluated in a trial run and improved before the real test; the data used for this trial run is called the validation set.
- To evaluate how well the model fits the data, use the cost function J = a·J_train + b·J_cv, where J_train is the training-set error and J_cv is the cross-validation-set error.
- Regularization term:
- It is generally a monotonically increasing function of model complexity: the more complex the model, the larger the regularization term. For example, it can be a norm of the model's parameter vector.
- From the perspective of Bayesian estimation, the regularization term corresponds to the prior probability of the model.
- L1 and L2 regularization can be understood as placing a prior distribution on the model parameters: L1 regularization corresponds to a Laplace prior, L2 regularization to a Gaussian prior.
- The Laplace distribution is sharply peaked at 0, whereas the Gaussian distribution is flat around 0 and thin in the tails. Correspondingly (with the sign flipped, because training minimizes a loss rather than maximizing a probability), L1 regularization favors sparse models, while L2 regularization penalizes large weights heavily.
- The regularization term corresponds to the prior in posterior estimation and the loss function corresponds to the likelihood; their product is proportional to the posterior, and maximizing it gives the Bayesian maximum a posteriori (MAP) estimate.
- Taking the negative logarithm of the posterior turns it into loss function + regularization term.
- Maximum likelihood: choose the parameters that maximize the product of the probabilities of all samples.
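The correspondence between the posterior and "loss + regularization" can be written out as a short derivation. The notation is an assumption of this sketch: θ denotes the parameters, D = {(x_i, y_i)} the training data, σ² the variance of a zero-mean Gaussian prior, and b the scale of a zero-mean Laplace prior, with independent priors on each parameter.

```latex
\begin{align*}
\hat{\theta}_{\mathrm{MAP}}
  &= \arg\max_{\theta}\; p(\theta \mid D)
   = \arg\max_{\theta}\; p(D \mid \theta)\, p(\theta) \\
  &= \arg\min_{\theta}\; \Big[ \underbrace{-\sum_{i} \log p(y_i \mid x_i, \theta)}_{\text{loss function}}
     \;\underbrace{-\,\log p(\theta)}_{\text{regularization term}} \Big] \\[4pt]
\text{Gaussian prior } \theta_j \sim \mathcal{N}(0, \sigma^2):\quad
  -\log p(\theta) &= \frac{1}{2\sigma^2}\,\lVert \theta \rVert_2^2 + \text{const}
  \quad (\text{L2 regularization}) \\
\text{Laplace prior } \theta_j \sim \mathrm{Laplace}(0, b):\quad
  -\log p(\theta) &= \frac{1}{b}\,\lVert \theta \rVert_1 + \text{const}
  \quad (\text{L1 regularization})
\end{align*}
```

Dropping the prior term (a flat prior) leaves only the first term, which is the maximum likelihood estimate.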
- Choosing the training procedure according to the data set:
- When there is enough data, split it into a training set (to train the models), a validation set (for model selection), and a test set (for model evaluation). Select the model with the smallest prediction error on the validation set.
- When data is scarce, use cross-validation, which reuses the data.
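For the case with enough data, a three-way split might look like the following sketch. The function name `train_val_test_split` and the 60/20/20 default proportions are assumptions of this example, not values given in the notes.

```python
import random


def train_val_test_split(samples, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle samples and cut them into (train, val, test) lists."""
    if not 0.0 < val_frac + test_frac < 1.0:
        raise ValueError("val_frac + test_frac must lie strictly between 0 and 1")
    shuffled = list(samples)
    # A fixed seed keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test
```

The three parts are disjoint, so the test set stays untouched while models are trained on `train` and compared on `val`.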