
Fundamentals of object detection (NMS)

2022-07-01 14:05:00 【victor_ gx】


What is non-maximum suppression?

Non-maximum suppression (NMS) is a technique used mainly in object detection to select the best bounding box from a set of overlapping boxes. In the example illustration, which shows green, blue, and yellow boxes overlapping on the same object, the goal of NMS is to delete the yellow and blue boxes so that only the green box remains as the final prediction.

Steps for computing NMS

To understand what a bounding box is and what IOU (intersection over union) means, see my earlier article on IOU. The terms introduced there are used throughout this article.

We first walk through how NMS works on this particular example, then describe the more general algorithm, which extends to other scenarios.

1. Definition of terms

Each bounding box uses the following format:
$box=[x_1,\ y_1,\ x_2,\ y_2,\ class,\ confidence]$
Suppose that for this particular image we have 3 bounding boxes:
$bbox\_list=[blue\_box,yellow\_box,green\_box]$
Each box is defined as follows:
$blue\_box=[x_3,\ y_3,\ x_4,\ y_4,\ \text{"Cat"},\ 0.85]$
$yellow\_box=[x_5,\ y_5,\ x_6,\ y_6,\ \text{"Cat"},\ 0.75]$
$green\_box=[x_1,\ y_1,\ x_2,\ y_2,\ \text{"Cat"},\ 0.9]$
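As a concrete illustration, the three boxes could be written as plain Python lists. The pixel coordinates below are made up for this sketch (the article only gives symbolic coordinates); the class labels and confidences are the ones from the example.

```python
# Each box: [x1, y1, x2, y2, class, confidence]
# Coordinates are hypothetical; labels and confidences follow the example.
green_box = [100, 100, 300, 300, "Cat", 0.90]
blue_box = [110, 120, 310, 320, "Cat", 0.85]
yellow_box = [80, 90, 280, 290, "Cat", 0.75]

bbox_list = [blue_box, yellow_box, green_box]

# Unpack each box to show which position holds which field.
for x1, y1, x2, y2, label, confidence in bbox_list:
    print(label, confidence, (x1, y1, x2, y2))
```

With this layout, index 4 holds the class and index 5 holds the confidence, which is how the code later in the article accesses them.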

2. Filter the candidate boxes by confidence

As the first step of NMS, we sort the boxes in descending order of confidence. After sorting we get:
$bbox\_list=[green\_box,blue\_box,yellow\_box]$
Next we define a confidence threshold; any box whose confidence is below this threshold is deleted. For this example, assume the confidence threshold is 0.8. With this threshold we delete the yellow box, because its confidence (0.75) is below 0.8. This leaves:
$bbox\_list=[green\_box,blue\_box]$
[Figure: the image after confidence filtering, with only the green and blue boxes remaining.]
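The sorting and thresholding in this step can be sketched in a few lines of Python. The coordinates are invented for illustration; only the confidences (0.9, 0.85, 0.75) and the 0.8 threshold come from the example.

```python
# Boxes as [x1, y1, x2, y2, class, confidence]; coordinates are hypothetical.
bbox_list = [
    [110, 120, 310, 320, "Cat", 0.85],  # blue
    [80, 90, 280, 290, "Cat", 0.75],    # yellow
    [100, 100, 300, 300, "Cat", 0.90],  # green
]
conf_threshold = 0.8

# Sort by confidence (index 5), highest first.
bbox_list = sorted(bbox_list, key=lambda box: box[5], reverse=True)

# Keep only boxes whose confidence exceeds the threshold.
bbox_list = [box for box in bbox_list if box[5] > conf_threshold]

print([box[5] for box in bbox_list])  # confidences of green, then blue
```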

3. Filter by IOU

Because the boxes are sorted by confidence in descending order, the first box in the list has the highest confidence. We remove this first box from the list and add it to a new list. In our case, we remove the green box and put it in a new list, say bbox_list_new.

At this stage we define a second threshold, this time for IOU. This threshold is used to delete boxes that overlap heavily. The reasoning is as follows: if two boxes overlap a lot and belong to the same class, both very likely cover the same object (the figure above confirms this). Since each object should end up with exactly one box, we delete the box with the lower confidence.

For the example above, suppose the IOU threshold is 0.5.

We now compute the IOU between the green box and every box remaining in bbox_list that has the same class. In our case, this means computing the IOU of the green box with the blue box only.

If the IOU of the green and blue boxes is greater than our threshold of 0.5, we delete the blue box, because it has the lower confidence and overlaps the green box substantially.
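The IOU computation itself is covered in the earlier article and not repeated here. As a rough sketch, assuming boxes are given as (x1, y1, x2, y2) corner coordinates with x1 < x2 and y1 < y2, it might look like the following. The function name `iou` and the coordinates are assumptions made for this illustration, and the earlier article's version may differ in details such as pixel conventions.

```python
def iou(box_a, box_b):
    """IOU of two boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


green = (100, 100, 300, 300)  # hypothetical coordinates
blue = (110, 120, 310, 320)   # hypothetical coordinates
iou_threshold = 0.5

overlap = iou(green, blue)
print(round(overlap, 3))
print("suppress blue" if overlap > iou_threshold else "keep blue")
```

With these made-up coordinates the overlap is about 0.75, which is above the 0.5 threshold, so the blue box would be suppressed.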

Repeating this process for every box in the image leaves only distinct, high-confidence boxes.

[Figure: the final result for the example above.]

NMS algorithm

Summing up the process above, NMS is computed as follows:

Define a confidence threshold and an IOU threshold.
Sort the bounding boxes (bbox_list) in descending order of confidence.
Remove from bbox_list every predicted box whose confidence is below the confidence threshold.
Loop over the remaining boxes: first, select the box with the highest confidence as the candidate box.
Then compute the IOU between the candidate box and every remaining predicted box of the same class.
If the IOU of any such pair exceeds the IOU threshold, remove the lower-confidence box of the pair from bbox_list.
Repeat until every predicted box in the list has been processed.

Code implementation

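def nms(boxes, conf_threshold=0.7, iou_threshold=0.4):
    # Each box is [x1, y1, x2, y2, class, confidence].
    # IOU(box_a, box_b) is assumed to be the helper from the earlier IOU article.
    bbox_list_thresholded = []  # boxes that pass the confidence threshold
    bbox_list_new = []  # boxes kept after NMS
    # Sort by confidence (index 5), highest first.
    boxes_sorted = sorted(boxes, reverse=True, key=lambda x: x[5])
    for box in boxes_sorted:
        if box[5] > conf_threshold:
            bbox_list_thresholded.append(box)
        else:
            pass

    while len(bbox_list_thresholded) > 0:
        # The first remaining box has the highest confidence: keep it.
        current_box = bbox_list_thresholded.pop(0)
        bbox_list_new.append(current_box)
        # Iterate over a copy so that remove() does not make the loop skip boxes.
        for box in bbox_list_thresholded[:]:
            if current_box[4] == box[4]:
                iou = IOU(current_box[:4], box[:4])
                if iou > iou_threshold:
                    bbox_list_thresholded.remove(box)
    return bbox_list_new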

The function is explained piece by piece below:

def nms(boxes, conf_threshold=0.7, iou_threshold=0.4):

This function takes the list of candidate boxes for an image, a confidence threshold, and an IOU threshold as input (the defaults are 0.7 and 0.4, respectively).

bbox_list_thresholded = []
bbox_list_new = []

We then create two lists, named bbox_list_thresholded and bbox_list_new.

bbox_list_thresholded: holds the boxes that remain after low-confidence boxes are filtered out
bbox_list_new: holds the final list of boxes after NMS has been applied

boxes_sorted = sorted(boxes, reverse=True, key = lambda x : x[5])

Using the box format from the definition of terms above, we sort the list of boxes in descending order of confidence and store the sorted list in the variable boxes_sorted.

Here we use Python's built-in sorted function, which orders the items according to the value returned by its key argument.

In our case, we pass reverse=True to sort in descending order, and a second keyword argument, key, to specify what to sort by. The lambda function maps each bounding box to its element at index 5 (the sixth element, which is the confidence).

With these two arguments set, sorted orders the candidate boxes by confidence, highest first.
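To see the effect of the two arguments in isolation, here is a small sketch on three made-up boxes. It also shows that sorted returns a new list and leaves its input unchanged.

```python
# Boxes as [x1, y1, x2, y2, class, confidence]; all values are made up.
boxes = [
    [0, 0, 10, 10, "Cat", 0.75],
    [1, 1, 11, 11, "Cat", 0.90],
    [2, 2, 12, 12, "Dog", 0.85],
]

# Without reverse=True the order is ascending; with it, descending.
ascending = sorted(boxes, key=lambda x: x[5])
descending = sorted(boxes, key=lambda x: x[5], reverse=True)

print([b[5] for b in ascending])
print([b[5] for b in descending])
# sorted() returns a new list; the input keeps its original order.
print([b[5] for b in boxes])
```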

for box in boxes_sorted:
    if box[5] > conf_threshold:
        bbox_list_thresholded.append(box)
    else:
        pass

We iterate over the sorted boxes and keep only those whose confidence is above the threshold (conf_threshold=0.7); boxes at or below the threshold are dropped.

while len(bbox_list_thresholded) > 0:
    current_box = bbox_list_thresholded.pop(0)
    bbox_list_new.append(current_box)

After the confidence-based filtering above, we process the boxes in the thresholded list (bbox_list_thresholded) one at a time until the list is empty.

We first remove (pop) the first box from this list (the current box), since it has the highest confidence, and append it to our final list (bbox_list_new).

# Iterate over a copy so that remove() does not make the loop skip boxes.
for box in bbox_list_thresholded[:]:
    if current_box[4] == box[4]:
        iou = IOU(current_box[:4], box[:4])
        if iou > iou_threshold:
            bbox_list_thresholded.remove(box)

Next, we iterate over all the boxes remaining in bbox_list_thresholded and check whether each one has the same class as the current box (box[4] holds the class).

If the two boxes belong to the same class, we compute the IOU between them. If the IOU exceeds iou_threshold, we remove the lower-confidence box from bbox_list_thresholded.
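A note on this removal pattern: in Python, removing items from a list while a for loop iterates over that same list makes the loop skip the element that follows each removal. In NMS this can mean that two overlapping boxes sitting next to each other in the list are not both checked. Looping over a copy of the list avoids the problem. A minimal sketch, with plain integers standing in for boxes:

```python
values = [1, 2, 2, 3]

# Removing from the list being iterated: the element that slides into the
# freed slot is never visited, so one of the 2s survives.
skipped = list(values)
for v in skipped:
    if v == 2:
        skipped.remove(v)

# Iterating over a copy (safe[:]) visits every element.
safe = list(values)
for v in safe[:]:
    if v == 2:
        safe.remove(v)

print(skipped)  # one 2 is left behind
print(safe)     # both 2s are removed
```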