Confusion matrix in CNN | pytorch series (XXIII)
2022-07-28 02:53:00 【51CTO】
Author | AI_study

Original title: CNN Confusion Matrix With PyTorch - Neural Network Programming
In this lesson, we're going to build a function that gets a prediction tensor for every sample in the training set. Then we'll see how to use this prediction tensor, along with each sample's label, to create a confusion matrix. The confusion matrix will let us see which categories the network confuses with one another.
- Prepare the data
- Build the model
- Train the model
- Analyze the model's results
- Build, plot, and interpret a confusion matrix
For all the code setup details, please refer to the previous episodes of this series.
Confusion matrix requirements
To create a confusion matrix for the entire dataset, we need a prediction tensor with the same length as the training set.
This prediction tensor will contain 10 prediction scores for each sample in our training set (one per clothing category). Once we have this tensor, we can use the label tensor to generate a confusion matrix.
A confusion matrix will show us where the model gets confused. More specifically, it shows which categories the model predicted correctly and which it predicted incorrectly. For the incorrect predictions, we'll be able to see which category the model predicted instead, and this tells us which categories are confusing the model.
Get predictions for the entire training set
To get the predictions for every training set sample, we need to pass all of the samples through the network. One way to do this is to create a DataLoader whose batch size is the entire training set; a single batch then passes all the data to the network at once and yields the prediction tensor we need for every training set sample.
However, depending on our computing resources and the size of the training set (if we were training on a different dataset), we need a way to predict on smaller batches and collect the results. To collect the results, we'll use the torch.cat() function to concatenate the output tensors together into a single prediction tensor. Let's build a function for this.
Set up a function to get predictions for all samples
We'll create a function called get_all_preds() and pass it a model and a data loader. The model will be used to get the predictions, and the data loader will be used to provide the batches from the training set.
All the function needs to do is iterate over the data loader, pass each batch to the model, and concatenate the results of every batch into a prediction tensor that is returned to the caller.
The implementation of this function creates an empty tensor, all_preds, to hold the output predictions. It then iterates over the batches coming from the data loader and concatenates the output predictions with the all_preds tensor. Finally, all_preds, holding all the predictions, is returned to the caller.
Note that at the top we've annotated the function with the @torch.no_grad() PyTorch decorator. This is because we want the function to execute with gradient tracking turned off.
Gradient tracking uses memory, and during inference (getting predictions without training) there is no need to keep track of the computational graph. The decorator is a way to locally turn off gradient tracking while a specific function executes.
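Putting this together, the function can be sketched like this (the linear model and random data below are tiny stand-ins so the snippet runs on its own; in the series, the model is the Fashion-MNIST CNN and the loader serves the training set):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

@torch.no_grad()  # run the whole function with gradient tracking off
def get_all_preds(model, loader):
    all_preds = torch.tensor([])
    for batch in loader:
        images, labels = batch
        preds = model(images)
        # concatenate each batch's predictions onto the running tensor
        all_preds = torch.cat((all_preds, preds), dim=0)
    return all_preds

# Tiny stand-in model and dataset, just to show the shapes involved
model = nn.Linear(784, 10)
data = TensorDataset(torch.randn(100, 784), torch.randint(0, 10, (100,)))
loader = DataLoader(data, batch_size=25)

all_preds = get_all_preds(model, loader)
print(all_preds.shape)          # torch.Size([100, 10])
print(all_preds.requires_grad)  # False: no computation graph was built
```

Notice that the returned tensor has one row per sample and one column per class, and that requires_grad is False thanks to the decorator.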
Locally disabling PyTorch gradient tracking
We're now ready to make the call that gets the training set predictions. All we need to do is create a data loader with a reasonable batch size and pass the model and the data loader to the get_all_preds() function.
In a previous episode, we saw how to turn PyTorch's gradient tracking feature off when it isn't needed and how to turn it back on when training starts.
We specifically need gradient computation whenever we're going to call the backward() function to calculate gradients. Otherwise, it's a good idea to turn it off, because having it off reduces the memory consumption of the computations, for example when we're using the network for prediction (inference).
Both options work. Let's keep all of this in place and get our predictions.
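As a sketch, the context-manager option looks like this (the network and train_set here are stand-ins so the snippet runs alone; in the series the real CNN and Fashion-MNIST train_set are already in scope):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def get_all_preds(model, loader):
    all_preds = torch.tensor([])
    for batch in loader:
        images, _ = batch
        all_preds = torch.cat((all_preds, model(images)), dim=0)
    return all_preds

network = nn.Linear(784, 10)                     # stand-in for the trained CNN
train_set = TensorDataset(torch.randn(60, 784),  # stand-in for the training set
                          torch.randint(0, 10, (60,)))

# Wrap the call in a no_grad context instead of decorating the function
prediction_loader = DataLoader(train_set, batch_size=30)
with torch.no_grad():
    train_preds = get_all_preds(network, prediction_loader)

print(train_preds.requires_grad)  # False: the graph was never tracked
```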
Using the prediction tensor
Now, with the prediction tensor in hand, we can pass it, along with the training set labels, to the get_num_correct() function we created in a previous episode to get the total number of correct predictions.
We can then see the total number of correct predictions and print the accuracy by dividing it by the number of samples in the training set.
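With a toy prediction tensor standing in for the full train_preds, the check can be sketched as follows (get_num_correct matches the helper built earlier in the series):

```python
import torch

def get_num_correct(preds, labels):
    # count samples whose highest-scoring class equals the label
    return preds.argmax(dim=1).eq(labels).sum().item()

# toy prediction scores for 4 samples over 3 classes
train_preds = torch.tensor([[0.1, 0.8, 0.1],
                            [0.7, 0.2, 0.1],
                            [0.2, 0.2, 0.6],
                            [0.9, 0.05, 0.05]])
targets = torch.tensor([1, 0, 2, 1])

preds_correct = get_num_correct(train_preds, targets)
print('total correct:', preds_correct)            # total correct: 3
print('accuracy:', preds_correct / len(targets))  # accuracy: 0.75
```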
Building the confusion matrix
To build the confusion matrix, our task is to count the predicted values against the true values (the targets).
This creates a matrix that acts as a heat map, telling us where the predicted values fall relative to the true values.
To do this, we need the targets tensor and the predicted labels from the train_preds tensor.
Now, if we compare the two tensors element-wise, we can see whether each predicted label matches its target. Additionally, if we count up the (target, prediction) pairs, the values inside the two tensors act as coordinates into our matrix. Let's stack these two tensors along the second dimension so that we have 60,000 ordered pairs.
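With toy labels standing in for the full 60,000, stacking along the second dimension pairs each target with its prediction:

```python
import torch

targets = torch.tensor([9, 0, 0, 3])  # true labels (toy values)
preds = torch.tensor([9, 0, 6, 3])    # argmax of the prediction tensor

# dim=1 stacks the tensors column-wise, one (true, predicted) pair per row
stacked = torch.stack((targets, preds), dim=1)
print(stacked)
# tensor([[9, 9],
#         [0, 0],
#         [0, 6],
#         [3, 3]])
```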
Now we can iterate over these pairs and count the number of times each position in the matrix occurs. Let's create the matrix. Since we have ten prediction categories, we'll have a ten-by-ten matrix. For details on the stack() function, check here:
https://deeplizard.com/learn/video/kF2AlpykJGY
Now we iterate over the (target, prediction) pairs and add one to the corresponding cell in the matrix each time that particular position occurs.
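A sketch of the tally loop, using a small toy stacked tensor (the real one has 60,000 rows):

```python
import torch

# (true label, predicted label) pairs; toy values drawn from 10 classes
stacked = torch.tensor([[9, 9], [0, 0], [0, 6], [3, 3], [0, 0]])

cmt = torch.zeros(10, 10, dtype=torch.int64)  # ten classes -> ten by ten
for p in stacked:
    tl, pl = p.tolist()
    cmt[tl, pl] = cmt[tl, pl] + 1  # add one each time this (true, predicted) cell occurs

print(cmt[0, 0].item())  # 2 correct predictions for class 0
print(cmt[0, 6].item())  # 1 class-0 sample predicted as class 6
```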
This gives us the confusion matrix tensor.
Note that the examples below will have different values, because the two examples were created at different times.
Plotting the confusion matrix
The actual confusion matrix will be a numpy.ndarray, which we create with the confusion_matrix() function from the sklearn.metrics library. Let's import it along with the other imports we need.
For the last import, note that plotcm refers to a file, plotcm.py, that lives in a folder called resources inside the current directory. The plotcm.py file contains a function called plot_confusion_matrix() that we'll be calling. You'll need to put this file in place on your system; we'll cover how in a moment. First, let's generate the confusion matrix.
We can generate the confusion matrix like this:
PyTorch tensors are array-like Python objects, so we can pass them directly to the confusion_matrix() function. We pass the training set's label tensor (targets) along with the argmax of the train_preds tensor over its class dimension, and this gives us the confusion matrix data structure.
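Sketched with toy tensors in place of the full training-set targets and train_preds:

```python
import torch
from sklearn.metrics import confusion_matrix

targets = torch.tensor([0, 1, 2, 2, 1])        # toy true labels
train_preds = torch.tensor([[0.9, 0.1, 0.0],   # toy prediction scores
                            [0.2, 0.7, 0.1],
                            [0.1, 0.1, 0.8],
                            [0.6, 0.2, 0.2],
                            [0.3, 0.6, 0.1]])

# tensors are array-like, so they can be passed straight to sklearn
cm = confusion_matrix(targets, train_preds.argmax(dim=1))
print(type(cm))  # <class 'numpy.ndarray'>
print(cm)
```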
To actually plot the confusion matrix, we need some custom code that lives in a local file named plotcm. The function is called plot_confusion_matrix(). The plotcm.py file needs to contain the following and must sit inside the resources folder of the current directory.
Note that you can also just copy this code into your notebook and avoid the import altogether.
plotcm.py:
Source: scikit-learn.org
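The plotcm.py contents were omitted from this page; a version of the classic scikit-learn example plotter it is based on looks like this:

```python
# plotcm.py — adapted from the scikit-learn confusion-matrix example
import itertools
import numpy as np
import matplotlib.pyplot as plt

def plot_confusion_matrix(cm, classes, normalize=False,
                          title='Confusion matrix', cmap=plt.cm.Blues):
    """Plot the confusion matrix; set normalize=True to show rates."""
    if normalize:
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
        print('Normalized confusion matrix')
    else:
        print('Confusion matrix, without normalization')
    print(cm)

    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=45)
    plt.yticks(tick_marks, classes)

    # write each count into its cell, white text on dark cells
    fmt = '.2f' if normalize else 'd'
    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, format(cm[i, j], fmt),
                 horizontalalignment='center',
                 color='white' if cm[i, j] > thresh else 'black')

    plt.tight_layout()
    plt.ylabel('True label')
    plt.xlabel('Predicted label')
```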
For the imports, we do this:
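A sketch of the imports (the resources.plotcm path follows the folder layout described above and is specific to this setup; the fallback is added here only so the snippet stays importable when the file is absent):

```python
import torch
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix

# plotcm.py is expected in the resources folder of the current directory
try:
    from resources.plotcm import plot_confusion_matrix
except ImportError:
    plot_confusion_matrix = None  # put plotcm.py in place before plotting
```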
We're ready to plot the confusion matrix, but first we need to create a list of prediction class names to pass to the plot_confusion_matrix() function. Our prediction classes and their corresponding indexes are: 0 T-shirt/top, 1 Trouser, 2 Pullover, 3 Dress, 4 Coat, 5 Sandal, 6 Shirt, 7 Sneaker, 8 Bag, 9 Ankle boot.
This allows us to make the call that plots the matrix:
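A runnable sketch of the call (the class names assume the Fashion-MNIST labels used throughout this series; a minimal stand-in plotter is defined inline so the snippet works alone, whereas with plotcm.py in place you would call its plot_confusion_matrix instead):

```python
import torch
import matplotlib
matplotlib.use('Agg')  # headless-safe backend for this sketch
import matplotlib.pyplot as plt

# Fashion-MNIST class names, index-aligned with the label values 0-9
names = ('T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
         'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot')

# minimal stand-in for plot_confusion_matrix from plotcm.py
def plot_confusion_matrix(cm, classes, title='Confusion matrix'):
    plt.imshow(cm, interpolation='nearest', cmap=plt.cm.Blues)
    plt.title(title)
    plt.xticks(range(len(classes)), classes, rotation=45)
    plt.yticks(range(len(classes)), classes)
    plt.ylabel('True label')
    plt.xlabel('Predicted label')

cmt = torch.randint(0, 50, (10, 10))  # stand-in for the matrix built earlier

plt.figure(figsize=(10, 10))
plot_confusion_matrix(cmt, names)
```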

Interpreting the confusion matrix
The confusion matrix has three axes:
- Predicted label (class)
- True label
- Heat map value (color)
The predicted labels and true labels show us which prediction classes we're dealing with. The diagonal of the matrix represents the cells where the predicted value and the true value are the same, so this is where we want the heat map to be darker.
Any value not on the diagonal is an incorrect prediction, because the prediction doesn't match the true label. To read the plot, we can use these steps:
- Choose a prediction label on the horizontal axis.
- Check the diagonal cell for this label to see the total number of correct predictions.
- Check the other, non-diagonal cells to see where the network is confused.
For example, the network confuses a T-shirt/top with a shirt, but it doesn't confuse a T-shirt/top with things like:
- Ankle boot
- Sneaker
- Sandal
If we think about it, this makes sense. As the model learns, we'll see the numbers off the diagonal get smaller and smaller.
At this point in the series, we've put a lot of work into building and training CNNs in PyTorch. Congratulations!
I studied this article carefully while translating it. My ability is limited and the translation can't be perfect, but it really took a lot of effort. If it helped you, please lend a hand and share it with your friends. Support me ^_^
The original English article:
https://deeplizard.com/learn/video/0LhiS6yu2qQ



