YOLO series object detection post-processing: non-maximum suppression

2022-06-09 21:34:00 Cat chaser

        Non-maximum suppression (NMS) is used in most object detection models. It removes redundant detection boxes from the raw detection results and keeps the most suitable box for each object.

        Take YOLOv5 as an example. With a 640*640 input, the model predicts on three scales of 20*20, 40*40 and 80*80 grid cells, with 3 boxes per cell, so the total number of prediction boxes is 20*20*3 + 40*40*3 + 80*80*3 = 25200. Each prediction box contains the center coordinates xy of the box, its width and height wh, the box confidence, and one confidence score per class. Extracting the correct detections from these tens of thousands of prediction boxes requires non-maximum suppression.
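The box count above can be checked with a few lines of Python. This is a minimal sketch, assuming the usual YOLOv5 strides of 8, 16 and 32 and 3 anchor boxes per grid cell; the function name `count_predictions` is made up for illustration.

```python
# Assumption: strides (8, 16, 32) and 3 anchors per cell, which reproduce
# the 80*80, 40*40 and 20*20 grids mentioned in the text for a 640 input.
def count_predictions(img_size=640, strides=(8, 16, 32), anchors_per_cell=3):
    """Return the total number of prediction boxes over all output scales."""
    total = 0
    for stride in strides:
        grid = img_size // stride              # grid side length at this scale
        total += grid * grid * anchors_per_cell
    return total


print(count_predictions())  # 25200
```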

        Non-maximum suppression generally consists of two stages: confidence suppression and IoU suppression. Confidence suppression comes first: every detection box whose confidence is below a set threshold is removed, and only the higher-confidence boxes are kept. This step is very important, because it cheaply discards most of the candidates before any IoU is computed. IoU (intersection over union) suppression is more involved. If the detector handles multiple classes, IoU suppression has to be run separately for each class. Taking a high-confidence box as the reference, compute its IoU with each other box of the same class. If the IoU is greater than the set threshold, the other box is considered to cover the same object as the reference box and is deleted.
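The confidence stage can be sketched in vectorized NumPy as follows. This is only an illustration under assumptions: the row layout `[x, y, w, h, conf, class scores...]` follows the description of the prediction boxes earlier in this article, and the helper name `filter_by_confidence` is hypothetical.

```python
import numpy as np


def filter_by_confidence(pred, conf_thres):
    """Keep rows whose box confidence (column 4) exceeds conf_thres.

    pred: array of shape (N, 5 + num_classes), rows [x, y, w, h, conf, cls_0, ...].
    Returns an array of shape (M, 6) with rows [x, y, w, h, conf, class_id].
    """
    pred = np.asarray(pred, dtype=float)
    kept = pred[pred[:, 4] > conf_thres]           # confidence suppression
    class_ids = np.argmax(kept[:, 5:], axis=1)     # most likely class per box
    return np.column_stack([kept[:, :5], class_ids])


# Three made-up predictions with two classes; the last one has low confidence.
pred = np.array([
    [100, 100, 50, 50, 0.9, 0.1, 0.8],
    [102, 101, 50, 50, 0.6, 0.2, 0.7],
    [300, 300, 40, 40, 0.1, 0.9, 0.1],
])
out = filter_by_confidence(pred, 0.25)
print(out.shape)  # (2, 6)
```

Grouping the surviving rows by the last column then gives the per-class sets on which IoU suppression is run.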

        IoU suppression in detail: IoU is the intersection-over-union of two detection boxes of the same class, that is, the ratio of the area of their intersection to the area of their union. If the two boxes do not intersect at all, the intersection is 0 and so the IoU is 0; the two boxes can be considered to predict different objects, and both are kept. If the two boxes coincide exactly, the IoU is 1; both boxes predict the same object, so one of them has to be removed.
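Written as a formula, with A and B denoting the two boxes and |·| denoting area (notation introduced here for illustration only), the definition and its two extreme cases read:

```latex
\mathrm{IoU}(A, B) = \frac{|A \cap B|}{|A \cup B|}
                   = \frac{|A \cap B|}{|A| + |B| - |A \cap B|},
\qquad 0 \le \mathrm{IoU}(A, B) \le 1

% Disjoint boxes:   |A \cap B| = 0        \Rightarrow \mathrm{IoU} = 0
% Coincident boxes: |A \cap B| = |A| = |B| \Rightarrow \mathrm{IoU} = 1
```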

        Computing the intersection: two detection boxes can be completely disjoint, partially overlapping, or completely coincident. The intersection is the area where the two boxes overlap, and it can be computed by finding the overlap length along the horizontal x axis and along the vertical y axis separately. Suppose the first box spans (x1, x2) horizontally and (y1, y2) vertically, and the second box spans (x3, x4) and (y3, y4). If x1 > x4 or x2 < x3, or if y1 > y4 or y2 < y3, the two boxes do not intersect at all and the intersection is 0. In all other cases the boxes intersect. Sort (x1, x2, x3, x4) and subtract the two middle values to get the horizontal overlap length, do the same with the y values, and multiply the two lengths to get the intersection area.

# Compute the intersection area of two boxes given as [cx, cy, w, h]
import numpy as np


def getInter(box1, box2):
    # Convert center/size format to corner coordinates (x1, y1, x2, y2)
    box1_x1, box1_y1, box1_x2, box1_y2 = box1[0] - box1[2] / 2, box1[1] - box1[3] / 2, \
                                         box1[0] + box1[2] / 2, box1[1] + box1[3] / 2
    box2_x1, box2_y1, box2_x2, box2_y2 = box2[0] - box2[2] / 2, box2[1] - box2[3] / 2, \
                                         box2[0] + box2[2] / 2, box2[1] + box2[3] / 2
    if box1_x1 > box2_x2 or box1_x2 < box2_x1:
        return 0
    if box1_y1 > box2_y2 or box1_y2 < box2_y1:
        return 0
    x_list = [box1_x1, box1_x2, box2_x1, box2_x2]
    x_list = np.sort(x_list)
    x_inter = x_list[2] - x_list[1]
    y_list = [box1_y1, box1_y2, box2_y1, box2_y2]
    y_list = np.sort(y_list)
    y_inter = y_list[2] - y_list[1]
    inter = x_inter * y_inter
    return inter
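The sort-based computation above can also be written with `min`/`max`, which avoids the explicit disjoint checks: a negative overlap length simply means there is no overlap and is clamped to 0. This is an alternative sketch, not the author's code; it assumes the same `[cx, cy, w, h]` box layout, and the names `intersection_area` and `iou` are made up here.

```python
def intersection_area(box1, box2):
    """Overlap area of two boxes given as [cx, cy, w, h] (center format)."""
    # Convert center/size format to corner coordinates.
    b1_x1, b1_x2 = box1[0] - box1[2] / 2, box1[0] + box1[2] / 2
    b1_y1, b1_y2 = box1[1] - box1[3] / 2, box1[1] + box1[3] / 2
    b2_x1, b2_x2 = box2[0] - box2[2] / 2, box2[0] + box2[2] / 2
    b2_y1, b2_y2 = box2[1] - box2[3] / 2, box2[1] + box2[3] / 2
    # Overlap length per axis; clamp negative values (no overlap) to 0.
    x_inter = max(0.0, min(b1_x2, b2_x2) - max(b1_x1, b2_x1))
    y_inter = max(0.0, min(b1_y2, b2_y2) - max(b1_y1, b2_y1))
    return x_inter * y_inter


def iou(box1, box2):
    """Intersection over union of two [cx, cy, w, h] boxes."""
    inter = intersection_area(box1, box2)
    union = box1[2] * box1[3] + box2[2] * box2[3] - inter
    return inter / union if union > 0 else 0.0


print(iou([0, 0, 2, 2], [0, 0, 2, 2]))    # 1.0 (coincident boxes)
print(iou([0, 0, 2, 2], [10, 10, 2, 2]))  # 0.0 (disjoint boxes)
```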

        Computing the union: once the intersection area is known, the union is easy to compute. Add the areas of the two detection boxes and subtract the intersection area; the result is the union area.

# Compute the union area and the IoU from a precomputed intersection area
def getIou(box1, box2, inter_area):
    box1_area = box1[2] * box1[3]
    box2_area = box2[2] * box2[3]
    union = box1_area + box2_area - inter_area
    iou = inter_area / union
    return iou

Full implementation:

import numpy as np


def nms(pred, conf_thres, iou_thres):
    # Confidence suppression: drop boxes whose confidence is not above the threshold
    conf = pred[..., 4] > conf_thres
    box = pred[conf]
    # Pick the most likely class for each remaining box
    cls_conf = box[..., 5:]
    cls = []
    for i in range(len(cls_conf)):
        cls.append(int(np.argmax(cls_conf[i])))
    # Collect the class labels that occur
    total_cls = list(set(cls))  # remove duplicates, e.g. [0, 17]
    output_box = []   # final output boxes
    # Run IoU suppression separately for each class
    for i in range(len(total_cls)):
        clss = total_cls[i]   # current class label
        # Gather all candidate boxes that belong to the current class
        cls_box = []
        for j in range(len(cls)):
            if cls[j] == clss:
                box[j][5] = clss
                cls_box.append(box[j][:6])
        cls_box = np.array(cls_box)
        box_conf = cls_box[..., 4]   # confidence of each candidate box
        box_conf_sort = np.argsort(box_conf)[::-1]   # indices sorted by descending confidence
        cls_box = cls_box[box_conf_sort]   # reorder so cls_box[0] is always the most confident remaining box
        max_conf_box = cls_box[0]
        output_box.append(max_conf_box)   # the highest-confidence box is the first output for this class
        cls_box = np.delete(cls_box, 0, 0)  # remove that box from the candidates
        while len(cls_box) > 0:
            max_conf_box = output_box[len(output_box) - 1]     # the most recently kept box is the reference box
            del_index = []
            for j in range(len(cls_box)):
                current_box = cls_box[j]      # candidate box being compared
                interArea = getInter(max_conf_box, current_box)
                iou = getIou(max_conf_box, current_box, interArea)  # intersection over union with the reference box
                if iou > iou_thres:
                    del_index.append(j)   # overlaps the reference too much: mark for removal
            cls_box = np.delete(cls_box, del_index, 0)   # remove the boxes suppressed in this round
            if len(cls_box) > 0:
                output_box.append(cls_box[0])   # keep the next remaining box as a new output
                cls_box = np.delete(cls_box, 0, 0)
    return output_box
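For comparison, the greedy loop for a single class can be sketched more compactly by computing the IoU of the reference box against all remaining boxes at once. This is an illustrative variant under assumptions, not the author's implementation: rows are `[cx, cy, w, h, conf]`, every box has positive area, and the names `iou_one_to_many` and `nms_single_class` are hypothetical.

```python
import numpy as np


def iou_one_to_many(box, boxes):
    """IoU between one [cx, cy, w, h] box and an (N, 4+) array of such boxes."""
    x1 = np.maximum(box[0] - box[2] / 2, boxes[:, 0] - boxes[:, 2] / 2)
    y1 = np.maximum(box[1] - box[3] / 2, boxes[:, 1] - boxes[:, 3] / 2)
    x2 = np.minimum(box[0] + box[2] / 2, boxes[:, 0] + boxes[:, 2] / 2)
    y2 = np.minimum(box[1] + box[3] / 2, boxes[:, 1] + boxes[:, 3] / 2)
    # Negative side lengths mean no overlap, so clamp them to 0.
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    union = box[2] * box[3] + boxes[:, 2] * boxes[:, 3] - inter
    return inter / union   # assumes positive-area boxes, so union > 0


def nms_single_class(boxes, iou_thres):
    """Greedy NMS over rows [cx, cy, w, h, conf]; kept rows, most confident first."""
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 5)
    boxes = boxes[np.argsort(boxes[:, 4])[::-1]]    # descending confidence
    keep = []
    while len(boxes) > 0:
        best, rest = boxes[0], boxes[1:]
        keep.append(best)                           # best remaining box is kept
        if len(rest) == 0:
            break
        # Drop every remaining box that overlaps the kept box too much.
        boxes = rest[iou_one_to_many(best, rest) <= iou_thres]
    return np.array(keep).reshape(-1, 5)


# Two heavily overlapping boxes and one box far away (made-up values).
candidates = [
    [100, 100, 50, 50, 0.9],
    [102, 101, 50, 50, 0.6],
    [300, 300, 40, 40, 0.8],
]
kept = nms_single_class(candidates, iou_thres=0.45)
print(kept[:, 4])  # confidences of the kept boxes
```

In this example the 0.6 box overlaps the 0.9 box with an IoU of about 0.89, so it is suppressed, while the distant 0.8 box survives.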

Copyright notice
This article was written by [Cat chaser]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/160/202206092100410757.html