A complete guide to evaluation metrics for machine learning classification tasks (including ROC and AUC)
2022-07-27 18:36:00 【zkkkkkkkkkkkkk】
Catalog
1.1、 What is a confusion matrix?
1.2、 What does a confusion matrix look like?
1.3、 The common binary confusion matrix
2.1、 Primary metrics of the confusion matrix
2.2、 Secondary metrics of the confusion matrix
2.2.3、 Recall (sensitivity)
2.3、 Tertiary metrics of the confusion matrix
4.1、 Plotting the ROC curve in Python
1.1、 What is a confusion matrix?
A confusion matrix, also called an error matrix (Confusion Matrix), is used to compute evaluation metrics for classification problems, such as accuracy, precision, and recall. All of these metrics can be calculated from the confusion matrix, as described below.
1.2、 What does a confusion matrix look like?
A confusion matrix summarizes the results of a classifier. For k-class classification it is a k × k table that records how the classifier's predictions are distributed across the true classes.
1.3、 The common binary confusion matrix
|             | Actual 1 | Actual 0 |
|-------------|----------|----------|
| Predicted 1 | TP       | FP       |
| Predicted 0 | FN       | TN       |
where:
The actual label is 1 and the prediction is 1: a true positive (True Positive), abbreviated TP.
The actual label is 1 and the prediction is 0: a false negative (False Negative), abbreviated FN.
The actual label is 0 and the prediction is 1: a false positive (False Positive), abbreviated FP.
The actual label is 0 and the prediction is 0: a true negative (True Negative), abbreviated TN.
Note: an FN is a statistical Type II error, which we can think of as letting a guilty person go free,
while an FP is a statistical Type I error, which we can think of as convicting an innocent person.
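To make these four cells concrete, here is a minimal sketch (the y_true and y_pred arrays are made-up examples, not from the article) that builds a binary confusion matrix with sklearn. Note that sklearn.metrics.confusion_matrix puts actual classes on the rows and predicted classes on the columns, ordered 0 then 1, so its layout differs from the table above:

from sklearn.metrics import confusion_matrix

y_true = [1, 1, 1, 0, 0, 0, 1, 0]  # actual labels (illustrative)
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]  # predicted labels (illustrative)

# The result reads:
# [[TN, FP],
#  [FN, TP]]
print(confusion_matrix(y_true, y_pred))  # [[3, 1], [1, 3]]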
2.1、 Primary metrics of the confusion matrix
Looking at the output of a binary classifier, we naturally want the classifier to be as accurate as possible. In terms of the confusion matrix, that means the more TP and TN the better, and the fewer FP and FN the better. With this in mind, we usually look first at the counts in the TP and TN cells.
However, every cell of the confusion matrix, whether TP, TN, FP, or FN, only counts numbers of samples, so the matrix by itself does not fully characterize the quality of a classifier. Moreover, in different scenarios we care about different things. This is why the secondary classification metrics exist.
2.2、 Secondary metrics of the confusion matrix
|             | Actual 1 | Actual 0 |
|-------------|----------|----------|
| Predicted 1 | TP       | FP       |
| Predicted 0 | FN       | TN       |
2.2.1、 Accuracy:
The proportion of correctly predicted samples among all samples:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
2.2.2、 Precision:
Among all samples predicted as 1, the proportion whose actual label is also 1:

$$\text{Precision} = \frac{TP}{TP + FP}$$
2.2.3、 Recall (sensitivity):
Among all samples whose actual label is 1, the proportion correctly predicted as 1:

$$\text{Recall} = \frac{TP}{TP + FN}$$
2.2.4、 Specificity:
Among all samples whose actual label is 0, the proportion correctly predicted as 0:

$$\text{Specificity} = \frac{TN}{TN + FP}$$
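As a sketch of how the four secondary metrics fall out of the confusion-matrix counts (reusing the illustrative y_true / y_pred arrays from the earlier sketch):

from sklearn.metrics import confusion_matrix

y_true = [1, 1, 1, 0, 0, 0, 1, 0]  # same illustrative labels as above
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

# Unpack the binary confusion matrix into its four counts
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

accuracy    = (tp + tn) / (tp + tn + fp + fn)  # (3+3)/8 = 0.75
precision   = tp / (tp + fp)                   # 3/4 = 0.75
recall      = tp / (tp + fn)                   # 3/4 = 0.75, also called sensitivity
specificity = tn / (tn + fp)                   # 3/4 = 0.75
print(accuracy, precision, recall, specificity)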
2.3、 Tertiary metrics of the confusion matrix
2.3.1、 F1-score
The secondary metrics extend to a tertiary metric, the F1-score, which combines precision (Precision) and recall (Recall). The formula is as follows:

$$F_1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$

Note: the F1-score is a value between 0 and 1; the closer it is to 1, the better the classification result.
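A quick check of the formula against sklearn's built-in f1_score, again with the illustrative arrays from the earlier sketches:

from sklearn.metrics import f1_score, precision_score, recall_score

y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

p = precision_score(y_true, y_pred)
r = recall_score(y_true, y_pred)

# The harmonic-mean formula and sklearn's f1_score agree
print(2 * p * r / (p + r))       # 0.75
print(f1_score(y_true, y_pred))  # 0.75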
3.1、 ROC curve and AUC
The figure below shows a ROC curve that I plotted for a logistic regression model I trained.
[Figure: ROC curve of a trained logistic regression classifier]
It is not hard to see that the ROC curve is drawn from the TPR and FPR values obtained at each classification threshold, with FPR on the horizontal axis and TPR on the vertical axis. Here TPR = TP / (TP + FN) is simply the recall, and FPR = FP / (FP + TN) = 1 − specificity. The area under the curve is the AUC, whose value usually lies between 0.5 and 1: the larger the AUC, and the closer the ROC curve hugs the upper-left corner, the better the classifier.
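As a sketch of where these numbers come from (the y_score probabilities are made up for the example), sklearn's roc_curve returns one (FPR, TPR) pair per threshold, and roc_auc_score computes the area directly:

from sklearn.metrics import roc_curve, roc_auc_score

y_true  = [0, 0, 1, 1, 0, 1, 1, 0]                   # actual labels (illustrative)
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.9, 0.5]  # predicted probability of class 1

fpr, tpr, thresholds = roc_curve(y_true, y_score)  # one (FPR, TPR) point per threshold
print(fpr, tpr)
print(roc_auc_score(y_true, y_score))  # area under the ROC curve: 0.875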
4.1、 Plotting the ROC curve in Python
Python's sklearn already wraps an interface for plotting the ROC curve; we can call it directly with the appropriate parameters.
from matplotlib import pyplot as plt
from sklearn.metrics import plot_roc_curve

plot_roc_curve(lr, test_x, test_y)  # lr: trained model; test_x: test feature set; test_y: test label set
plt.title("ROC curve")
plt.show()
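One caveat: plot_roc_curve was deprecated in scikit-learn 1.0 and removed in 1.2. On newer versions, the equivalent call (with the same lr, test_x, test_y as above) is RocCurveDisplay.from_estimator:

from matplotlib import pyplot as plt
from sklearn.metrics import RocCurveDisplay

# Drop-in replacement for plot_roc_curve on scikit-learn >= 1.0
RocCurveDisplay.from_estimator(lr, test_x, test_y)
plt.title("ROC curve")
plt.show()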
5.1、 Other metrics
There are many other classification metrics in machine learning, for example: the PR curve, the KS value, the AR value, the KS curve, and so on.
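As one example from this list, here is a minimal sketch of plotting a PR (precision-recall) curve with sklearn's precision_recall_curve, reusing the illustrative label/score arrays from the ROC sketch:

from matplotlib import pyplot as plt
from sklearn.metrics import precision_recall_curve

y_true  = [0, 0, 1, 1, 0, 1, 1, 0]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.9, 0.5]

# One (recall, precision) point per threshold
precision, recall, thresholds = precision_recall_curve(y_true, y_score)
plt.plot(recall, precision)
plt.xlabel("Recall")
plt.ylabel("Precision")
plt.title("PR curve")
plt.show()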