Advantages and disadvantages of evaluation methods
2022-07-06 10:25:00 【How about a song without trace】
1. Overfitting: when the learner fits the training samples too well, it may treat peculiarities of the training samples as general properties of all potential samples, which lowers its generalization ability (the ability of the learned model to perform well on unseen samples).
2. Underfitting: the learning ability is too weak, so even the general properties of the training samples have not been learned well.
Evaluation methods:
- Hold-out method: split the data set into two mutually exclusive parts, one for training and one for testing. If the training set contains the vast majority of the samples, the trained model is likely to be close to the model that would be trained on the full data set, but because the test set is small, the evaluation result may not be accurate enough. Typical splits range from 2:1 to 4:1, where the first part is used for training and the second for testing.
- Cross-validation: divide the data set into k equal-sized, mutually exclusive subsets using stratified sampling, use each subset once as the test set with the remaining k-1 subsets as the training set, and take the mean of the k results. Drawback: on large data sets it is computationally expensive and takes more time.
- Bootstrap method: repeatedly draw a sample at random from the data set, copy it into the sampled set, and put it back, until the sampled set is as large as the original data set. About 36.8% of the initial samples never appear in the sampled set and are used for testing. A test on the samples that do not appear in the training set is known as the out-of-bag estimate. Advantages: the bootstrap method is useful when the data set is small and hard to split effectively into training/test sets, and it can generate multiple different training sets from the initial data set. Drawback: it changes the distribution of the initial data set, which introduces estimation bias.
When the initial data set is large enough, the hold-out method and cross-validation are more commonly used.
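The three splitting schemes above can be sketched with the standard library alone. This is a minimal illustration, not a reference implementation: the function names are made up for this example, the samples are assumed to be distinct hashable items (here, plain indices), and stratified sampling is left out for brevity.

```python
import random

def hold_out_split(samples, train_fraction=0.8, seed=0):
    """Shuffle once, then cut into a training part and a test part."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def k_fold_splits(samples, k=5, seed=0):
    """Yield (train, test) pairs; each fold serves as the test set exactly once."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, folds[i]

def bootstrap_split(samples, seed=0):
    """Draw len(samples) items with replacement; items never drawn form the test set."""
    rng = random.Random(seed)
    samples = list(samples)
    train = [rng.choice(samples) for _ in samples]
    drawn = set(train)
    out_of_bag = [x for x in samples if x not in drawn]
    return train, out_of_bag

data = list(range(10000))                      # indices standing in for samples
train, test = hold_out_split(data, 0.8)        # a 4:1 split
folds = list(k_fold_splits(data, k=5))         # 5 (train, test) pairs
boot_train, oob = bootstrap_split(data)
oob_fraction = len(oob) / len(data)            # close to 0.368 for large data sets
```

The out-of-bag fraction approaches 0.368 because the chance that a given sample is missed in all m draws is (1 - 1/m)^m, which tends to 1/e as m grows.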
Parameter tuning and the final model:
General practice for parameter tuning: select a range and a step size for each parameter, so that only a finite set of candidate values is evaluated. This is a compromise between computational cost and performance.
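To make the cost side of that compromise concrete, here is a small sketch that enumerates the candidate values produced by a range and a step size. The parameter names `learning_rate` and `depth` and their ranges are hypothetical, chosen only for illustration.

```python
from itertools import product

def candidate_values(low, high, step):
    """Evenly spaced candidates from low to high (inclusive) with the given step."""
    count = int(round((high - low) / step)) + 1
    # round() hides floating-point noise such as 0.15000000000000002
    return [round(low + i * step, 10) for i in range(count)]

# Hypothetical parameters and ranges, for illustration only.
grid = {
    "learning_rate": candidate_values(0.0, 0.2, 0.05),  # 5 candidates
    "depth": candidate_values(2, 10, 2),                # 5 candidates
}

# Every combination is one model that has to be trained and evaluated.
combinations = list(product(*grid.values()))
```

With these settings there are 5 x 5 = 25 models to train. Halving both step sizes gives 9 x 9 = 81, which is why a finer step quickly becomes expensive as the number of parameters grows.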
Performance measures: they quantify the generalization ability of a model. How good a model is judged to be depends not only on the algorithm and the data but also on the task requirements.
The most commonly used performance measure for regression tasks: mean squared error.
Recall (TP/(TP+FN)) and precision (TP/(TP+FP)): TP = true positives, FP = false positives, TN = true negatives, FN = false negatives.
F1 is the harmonic mean of precision and recall: F1 = 2*TP/(total number of samples + TP - TN).
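A short sketch of these formulas, which also checks that the harmonic-mean form of F1 agrees with the count-based form. The confusion-matrix counts are made up for the example, and the function assumes that none of the denominators is zero.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from confusion-matrix counts (non-zero denominators assumed)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1

# Made-up counts for illustration.
tp, fp, tn, fn = 40, 10, 35, 15
precision, recall, f1 = precision_recall_f1(tp, fp, fn)

# Count-based form: total + TP - TN equals 2*TP + FP + FN.
total = tp + fp + tn + fn
f1_from_counts = 2 * tp / (total + tp - tn)
```

Both forms give 80/105 here, since the harmonic mean simplifies to 2*TP/(2*TP + FP + FN).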
ROC: receiver operating characteristic. The vertical axis is the true positive rate, TPR = TP/(TP+FN); the horizontal axis is the false positive rate, FPR = FP/(TN+FP).
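As a rough sketch of how an ROC curve is traced: rank the samples by predicted score, then lower the threshold one sample at a time and record the false positive rate (horizontal axis) and true positive rate (vertical axis) after each step. The scores and labels below are invented, and tied scores are not treated specially in this simplified version.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points, lowering the threshold one sample at a time.

    labels are 1 for positive and 0 for negative; both classes must be present.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    tp = fp = 0
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for _, label in ranked:
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points

# Invented scores and labels for illustration.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels)
```

The curve always starts at (0, 0) and ends at (1, 1); a better ranking pushes the points toward the top-left corner.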
Normalization: map values with different ranges of variation onto the same fixed range, most commonly [0, 1].
Bias: the difference between the expected prediction and the true label; it describes the fitting ability of the learning algorithm itself.
Generalization error can be decomposed into the sum of squared bias, variance (which measures the change in learning performance caused by changes in a training set of the same size, i.e. the impact of data perturbation), and noise (the lower bound on the expected generalization error that any learning algorithm can achieve on the current task).
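Written out, the decomposition takes the following form. This assumes squared loss and zero-mean label noise; the notation is chosen for this sketch: f(x; D) is the prediction of the model trained on set D, y is the true label of x, and y_D is its label as recorded in the data set.

```latex
\begin{aligned}
\bar{f}(x) &= \mathbb{E}_D\big[f(x; D)\big] \\
\mathrm{bias}^2(x) &= \big(\bar{f}(x) - y\big)^2 \\
\mathrm{var}(x) &= \mathbb{E}_D\big[\big(f(x; D) - \bar{f}(x)\big)^2\big] \\
\varepsilon^2 &= \mathbb{E}_D\big[(y_D - y)^2\big] \\
\mathbb{E}_D\big[\big(f(x; D) - y_D\big)^2\big] &= \mathrm{bias}^2(x) + \mathrm{var}(x) + \varepsilon^2
\end{aligned}
```

The bias enters the sum as a squared term, so a model can have a small expected error only if bias and variance are both small.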