Evaluation index of machine learning classification model and implementation of sklearn code
2022-07-29 09:18:00 【Icy Hunter】
Preface
Evaluation metrics matter a great deal for a model, and there are many of them. At first I thought there was only one very simple way to evaluate a classification model: correct predictions / total predictions. Only later did I realize this metric is sometimes useless, for example when the positive and negative samples are imbalanced. So it is worth sorting out the evaluation metrics for classification models; I'll organize what I have here first and fill in more later.
Accuracy, Recall, Precision, F1-Score
Take a binary classifier as an example. Suppose we need to evaluate a breast cancer diagnosis classifier, where label 1 is positive (breast cancer) and label 0 is negative (no breast cancer). Then TP, FP, FN, TN are laid out as follows:

              Predicted 1    Predicted 0
  True 1 (P)      TP             FN
  True 0 (N)      FP             TN

Here "True" is the ground-truth value of the data and "Predicted" is the model's output; P and N are the actual numbers of positive and negative samples.
Suppose we have the following data:
y_true = [1, 1, 0, 1, 0, 0]
y_pred = [1, 1, 1, 1, 0, 1]
y_true holds the true labels and y_pred the model's predictions. Plugging them into the table gives:
TP: the number of samples that are actually positive and predicted positive, here 3
FN: the number of samples that are actually positive but predicted negative, here 0
FP and TN are defined analogously: here FP = 2 and TN = 1.
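To make the four definitions concrete, the counts can also be tallied by hand in plain Python (my own illustrative sketch, not from the original post):

```python
y_true = [1, 1, 0, 1, 0, 0]
y_pred = [1, 1, 1, 1, 0, 1]

# Count each confusion-matrix cell directly from its definition
TP = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # actually positive, predicted positive
FN = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # actually positive, predicted negative
FP = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # actually negative, predicted positive
TN = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))  # actually negative, predicted negative
print(TP, FN, FP, TN)  # 3 0 2 1
```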
We can verify the result with sklearn:
from sklearn.metrics import confusion_matrix
y_true = [1, 1, 0, 1, 0, 0]
y_pred = [1, 1, 1, 1, 0, 1]
TN, FP, FN, TP = confusion_matrix(y_true, y_pred).ravel()
print(TN, FP, FN, TP)
Output:
1 2 0 3
The result is correct.
Then we can calculate Accuracy, Recall, and Precision according to the formulas.
Accuracy

Accuracy = (TP + TN) / (TP + TN + FP + FN)

In a nutshell: correctly predicted samples / total samples. Here acc = 4/6 ≈ 0.67.
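As a sanity check, sklearn's accuracy_score computes the same value on the example data above (a minimal sketch):

```python
from sklearn.metrics import accuracy_score

y_true = [1, 1, 0, 1, 0, 0]
y_pred = [1, 1, 1, 1, 0, 1]

# (TP + TN) / total = 4 / 6
acc = accuracy_score(y_true, y_pred)
print(acc)
```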
Recall

Recall = TP / (TP + FN)

Taking the positive class as an example: recall measures how well the model finds the positive samples, i.e. positives identified by the model / actual positives. Here recall = 3/3 = 100%.
This means the model's recall is very good. For breast cancer we would rather raise a false alarm than miss a case, so a high recall is required: the model must recognize positive samples reliably.
Precision

Precision = TP / (TP + FP)

Taking the positive class as an example: precision can be understood simply as the probability that a sample the model calls positive is actually positive. Here Precision = 3/5 = 0.6.
F1-Score
The harmonic mean of recall and precision:

F1 = 2 * Precision * Recall / (Precision + Recall)

F1 is a combined criterion for precision and recall, because in practice some models need high recall to be usable, some need high precision, and others need a compromise between the two.
Taking the positive class as an example, f1 = 2 * 0.6 * 1 / (0.6 + 1) = 0.75.
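sklearn also exposes each of these metrics as an individual function; a small sketch on the same example data:

```python
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [1, 1, 0, 1, 0, 0]
y_pred = [1, 1, 1, 1, 0, 1]

p = precision_score(y_true, y_pred)   # TP / (TP + FP) = 3/5 = 0.6
r = recall_score(y_true, y_pred)      # TP / (TP + FN) = 3/3 = 1.0
f1 = f1_score(y_true, y_pred)         # 2 * 0.6 * 1 / (0.6 + 1) = 0.75
print(p, r, f1)
```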
With that understood, we can use sklearn's classification_report to verify the results:
from sklearn.metrics import classification_report
y_true = [1, 1, 0, 1, 0, 0]
y_pred = [1, 1, 1, 1, 0, 1]
print(classification_report(y_true, y_pred))
Output:
For the positive class (label 1), the reported precision (0.60), recall (1.00), and f1-score (0.75), as well as the overall accuracy (0.67), all match our hand calculations.
TPR, FPR, ROC, AUC, AP
Suppose the data are as follows :
y_true = np.array([0, 0, 1, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8])
y_true holds the labels and y_scores the model's predicted scores. We can then pick a threshold and compare it against y_scores: scores above the threshold are predicted positive, scores below it negative.
TPR and FPR
Suppose the threshold is 0.5.
Then y_pred = [0, 0, 0, 1].
TPR is the true positive rate: predicted positive and actually positive / all positives = 1/2 = 0.5.
FPR is the false positive rate: predicted positive but actually negative / all negatives = 0/2 = 0.
This gives one (FPR, TPR) coordinate that we can plot.
Of course, if we change the threshold, e.g. to 0.39, we get another (FPR, TPR) coordinate.
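The calculation for a single threshold can be sketched as follows (the helper tpr_fpr is my own illustration, not from the original post):

```python
import numpy as np

y_true = np.array([0, 0, 1, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8])

def tpr_fpr(threshold):
    """Return (FPR, TPR) when scores >= threshold are predicted positive."""
    y_pred = (y_scores >= threshold).astype(int)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    tpr = tp / np.sum(y_true == 1)  # true positive rate
    fpr = fp / np.sum(y_true == 0)  # false positive rate
    return fpr, tpr

print(tpr_fpr(0.5))   # (0.0, 0.5)
print(tpr_fpr(0.39))  # (0.5, 0.5)
```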
sklearn provides a function, roc_curve, that computes the thresholds and the corresponding TPR and FPR:
import numpy as np
from sklearn.metrics import roc_auc_score,roc_curve
y_true = np.array([0, 0, 1, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8])
# Positive is 1
FPR, TPR, thresholds = roc_curve(y_true, y_scores, pos_label=1)
print(FPR)
print(TPR)
print(thresholds)
# y_true:      [0, 0, 1, 1]
# threshold    y_pred
# 1.8          [0, 0, 0, 0]
# 0.8          [0, 0, 0, 1]
# 0.4          [0, 1, 0, 1]
# 0.35         [0, 1, 1, 1]
# 0.1          [1, 1, 1, 1]
# These 5 thresholds cover every possible prediction pattern.
# (Recent sklearn versions report np.inf as the first threshold instead of max(score) + 1.)
ROC
Once the (FPR, TPR) pairs for all thresholds have been computed, plot each point and connect them; the resulting line is the ROC curve.
import numpy as np
from sklearn.metrics import roc_auc_score,roc_curve
y_true = np.array([0, 0, 1, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8])
# Positive is 1
FPR, TPR, thresholds = roc_curve(y_true, y_scores, pos_label=1)
print(FPR)
print(TPR)
print(thresholds)
import matplotlib.pyplot as plt
plt.scatter(FPR, TPR)
plt.plot(FPR, TPR)
plt.show()
Running result:
This is the ROC curve for this set of data.
The point (0, 1) would correspond to a perfect classifier, so the threshold whose point lies closest to the top-left corner is generally the best division.
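One common heuristic for picking such a threshold (not covered in the original post) is Youden's J statistic, which maximizes TPR - FPR; a sketch on the same data:

```python
import numpy as np
from sklearn.metrics import roc_curve

y_true = np.array([0, 0, 1, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8])
FPR, TPR, thresholds = roc_curve(y_true, y_scores, pos_label=1)

# Youden's J statistic: pick the threshold maximizing TPR - FPR
best = np.argmax(TPR - TPR * 0 - FPR)
print(thresholds[best], FPR[best], TPR[best])
```

On this tiny dataset two thresholds tie at J = 0.5; argmax returns the first one (0.8), which sits at (FPR, TPR) = (0, 0.5).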
AUC
The shape of the ROC curve is hard to quantify, hence AUC: the area under the ROC curve, i.e. the area enclosed between the curve and the x axis.
Here AUC = 1 * 0.5 + 0.5 * 0.5 = 0.75.
We can check it with our trusty sklearn:
import numpy as np
from sklearn.metrics import roc_auc_score,roc_curve
y_true = np.array([0, 0, 1, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8])
# Positive is 1
FPR, TPR, thresholds = roc_curve(y_true, y_scores, pos_label=1)
AUC = roc_auc_score(y_true, y_scores)
print(AUC)
Output:
0.75
which matches expectations.
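AUC also has a probabilistic interpretation worth knowing: it equals the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one (ties counted as 0.5). A brute-force check on our data (my own sketch, not from the original):

```python
import numpy as np

y_true = np.array([0, 0, 1, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8])

pos = y_scores[y_true == 1]  # scores of positive samples: [0.35, 0.8]
neg = y_scores[y_true == 0]  # scores of negative samples: [0.1, 0.4]

# Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5
auc = np.mean([(p > n) + 0.5 * (p == n) for p in pos for n in neg])
print(auc)  # 0.75 -- 3 of the 4 pairs are ranked correctly
```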
AP
AP is analogous to AUC, except that it is computed over the (recall, precision) points instead:
import numpy as np
from sklearn.metrics import precision_recall_curve
y_true = np.array([0, 0, 1, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8])
# Positive is 1
precision, recall, thresholds=precision_recall_curve(y_true,y_scores,pos_label=1)
print(precision)
print(recall)
print(thresholds)
import matplotlib.pyplot as plt
plt.scatter(recall, precision)
plt.plot(recall, precision)
plt.ylim(0, 1.2)
plt.xlim(0, 1.2)
plt.show()
The result:
But we cannot simply connect the points and take the area under them, because AP is defined as:

AP = sum over n of (R_n - R_(n-1)) * P_n

That is, precision is treated as a step function of recall, so the curve should be drawn as steps, and
AP = 0.5 * 1 + 0.5 * 0.6666666 ≈ 0.8333333
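The step-function sum can be computed directly from the precision_recall_curve output (a sketch; note that recall is returned in decreasing order, so the differences must be negated):

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

y_true = np.array([0, 0, 1, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8])
precision, recall, thresholds = precision_recall_curve(y_true, y_scores, pos_label=1)

# AP = sum_n (R_n - R_(n-1)) * P_n; recall decreases along the array,
# so negate the consecutive differences
ap = np.sum(-np.diff(recall) * precision[:-1])
print(ap)  # 0.8333...
```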
We can also verify :
from sklearn.metrics import average_precision_score
import numpy as np
y_true = np.array([0, 0, 1, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8])
AP = average_precision_score(y_true, y_scores)
print(AP)
Running result:
0.8333333333333333
As expected!
Reference:
Machine learning - model evaluation (TPR, FPR, K1, ROC, AUC, KS, GAIN, LIFT, GINI, KSI)