Important concepts of object detection - IoU, receptive field, dilated convolution, mAP

2022-08-02 06:32:00 【The romance of cherry blossoms】

1. Calculation of IoU

The performance of an object detection algorithm can be judged by comparing the position and overlap of the ground-truth bounding box and the predicted bounding box. IoU (Intersection over Union) is a standard measure of how accurately objects are localized in a given dataset, and it serves exactly this purpose. The calculation is simple: the area of the intersection of the two boxes divided by the area of their union, as shown in the figure.
Where does the ground-truth box come from? Any training or test dataset for detection contains at least two kinds of data: the images themselves, and the coordinates of the objects to be detected. Our goal is to build an object detection algorithm and evaluate it on the test set. As a rule of thumb, an IoU above 0.5 is generally regarded as a good detection, while an IoU below 0.5 indicates a poor one. An IoU around 0.7 means the predicted box is very close to the ground-truth box, and an IoU around 0.9 means the algorithm localizes objects very well on the test set.
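
The intersection-over-union calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes axis-aligned boxes given as (x1, y1, x2, y2) corner coordinates, and the function name `iou` is chosen here for the example.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumes x1 <= x2 and y1 <= y2 for both boxes.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap; clamp to 0 when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union = sum of both areas minus the part counted twice.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


ground_truth = (0, 0, 10, 10)
prediction = (1, 1, 10, 10)
print(iou(ground_truth, prediction))  # intersection 81, union 100
```

Two identical boxes give an IoU of 1, and two boxes that do not touch give 0.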

2. Dilated (atrous) convolution

Common image segmentation algorithms use pooling layers and convolutional layers to enlarge the receptive field, which also shrinks the feature map, and then use upsampling to restore the original image size. Because shrinking and re-enlarging the feature map loses precision, we want an operation that enlarges the receptive field while keeping the feature map size unchanged, so that it can replace the downsampling and upsampling steps. This is the motivation for atrous convolution.
Dilated (atrous) convolution, sometimes translated literally from Chinese as "hole convolution", injects holes into the standard convolution kernel to enlarge the receptive field. Compared with ordinary convolution, it adds one hyperparameter, the dilation rate, which defines the spacing between the input values that the kernel samples (a standard convolution has a dilation rate of 1).
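
To make the role of the dilation rate concrete, here is a small pure-Python sketch of a 1D dilated convolution (implemented as cross-correlation, as deep learning libraries do). The function names and the toy signal are assumptions made for this example. It uses the relation that a kernel of size k with dilation rate d covers k + (k - 1)(d - 1) input positions.

```python
def effective_kernel_size(kernel_size, dilation):
    """Number of input positions spanned by a dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1D cross-correlation whose taps are `dilation` positions apart."""
    span = effective_kernel_size(len(kernel), dilation)
    output = []
    for start in range(len(signal) - span + 1):
        # Each kernel weight reads the input at start, start + d, start + 2d, ...
        output.append(
            sum(w * signal[start + i * dilation] for i, w in enumerate(kernel))
        )
    return output


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 1, 1]

print(effective_kernel_size(3, 1))                 # 3: standard convolution
print(effective_kernel_size(3, 2))                 # 5: same 3 weights, wider span
print(dilated_conv1d(signal, kernel, dilation=1))  # [6, 9, 12, 15, 18]
print(dilated_conv1d(signal, kernel, dilation=2))  # [9, 12, 15]
```

The number of weights stays at three in both cases; only the spacing between the sampled inputs changes.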

      Dilated convolution serves two purposes.
    1) Expand the receptive field: Deep networks usually downsample (for example, with pooling layers) to enlarge the receptive field and reduce computation, but this also reduces spatial resolution. Dilated convolution enlarges the receptive field while preserving as much resolution as possible, which is especially important in detection and segmentation tasks. A larger receptive field helps detect large objects, and a higher resolution helps localize them precisely.
    2) Capture multi-scale context information: Different dilation rates give different receptive fields, so multi-scale information can be obtained. Multi-scale information is very important in vision tasks.
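
The trade-off in point 1 can be checked with the standard receptive-field recurrence for a stack of layers. The sketch below is illustrative only: the helper `receptive_field` and the layer configurations are assumptions chosen for this example, with each layer described as a (kernel_size, stride, dilation) tuple.

```python
def receptive_field(layers):
    """Receptive field and total stride of a stack of conv/pool layers.

    `layers` is a list of (kernel_size, stride, dilation) tuples,
    ordered from input to output.
    """
    rf, total_stride = 1, 1
    for kernel_size, stride, dilation in layers:
        # Each layer widens the field by its dilated span, scaled by the
        # downsampling accumulated in the layers before it.
        rf += (kernel_size - 1) * dilation * total_stride
        total_stride *= stride
    return rf, total_stride


# Three plain 3x3 convolutions.
print(receptive_field([(3, 1, 1), (3, 1, 1), (3, 1, 1)]))  # (7, 1)

# Same three convolutions with 2x2 pooling in between: a larger field,
# but the feature map is downsampled by a factor of 4.
print(receptive_field([(3, 1, 1), (2, 2, 1), (3, 1, 1), (2, 2, 1), (3, 1, 1)]))  # (18, 4)

# Dilation rates 1, 2, 4 and no pooling: a large field at full resolution.
print(receptive_field([(3, 1, 1), (3, 1, 2), (3, 1, 4)]))  # (15, 1)
```

A total stride of 1 means the feature map keeps the input resolution, which is the property the dilated stack preserves.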

3. mAP

Suppose a classifier has 90% precision at 10% recall, but 96% precision at 20% recall. There is no real tradeoff here: it makes more sense to use the classifier at 20% recall rather than 10%, because you get both higher recall and higher precision. So instead of looking at the precision at exactly 10% recall, we should look at the maximum precision the classifier can offer with at least 10% recall, which is 96%, not 90%. One way to get a fair idea of the model's performance is therefore to compute the maximum precision achievable with at least 0% recall, then 10%, 20%, and so on up to 100%, and then take the mean of these maximum precisions. This is called the Average Precision (AP) metric. When there are more than two classes, we can compute the AP for each class and then take the mean of the APs (mAP).
In an object detection system there is an additional layer of complexity: what if the system detects the correct class but in the wrong location (that is, the bounding box misses the object entirely)? We should certainly not count this as a positive prediction. One approach is to define an IoU threshold: for example, we can consider a prediction correct only if the IoU is greater than 0.5 and the predicted class is correct. The corresponding mAP is usually written mAP@0.5 (or mAP@50%, or sometimes AP50). This is done in some competitions (such as the PASCAL VOC Challenge). In others (such as the COCO competition), the mAP is computed for several IoU thresholds (0.50, 0.55, 0.60, ..., 0.95), and the final metric is the mean of all these mAPs (written AP@[.50:.95] or AP@[.50:0.05:.95]). It is a mean of means.
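
The procedure above (maximum precision at recall levels 0%, 10%, ..., 100%, then the mean) can be sketched as follows. The function names and the precision/recall points are made up for illustration; only the first two points come from the example in the text.

```python
def max_precision_at_recall(pr_points, min_recall):
    """Highest precision among operating points with recall >= min_recall."""
    candidates = [p for r, p in pr_points if r >= min_recall - 1e-12]
    return max(candidates, default=0.0)


def average_precision(pr_points, num_levels=11):
    """Mean of the maximum precisions at evenly spaced recall levels.

    With num_levels=11 the levels are 0.0, 0.1, ..., 1.0.
    """
    levels = [i / (num_levels - 1) for i in range(num_levels)]
    return sum(max_precision_at_recall(pr_points, lv) for lv in levels) / num_levels


def mean_average_precision(pr_points_per_class):
    """Mean of the per-class APs; the input maps class name -> PR points."""
    aps = [average_precision(points) for points in pr_points_per_class.values()]
    return sum(aps) / len(aps)


# (recall, precision) pairs; the last two points are hypothetical.
points = [(0.1, 0.90), (0.2, 0.96), (0.5, 0.80), (1.0, 0.50)]

print(max_precision_at_recall(points, 0.1))  # 0.96, not 0.90
print(average_precision(points))
```

Real evaluation code builds the precision/recall points by sweeping the detector's confidence threshold; here they are supplied directly to keep the sketch short.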
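
The IoU-threshold rule and the COCO-style averaging can be sketched as below. This is a simplified illustration under stated assumptions: boxes are (x1, y1, x2, y2) tuples, all function names are chosen for this example, and the per-threshold mAP values are assumed to be computed elsewhere. Full evaluation protocols also ensure each ground-truth box is matched by at most one detection, which this sketch omits.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_correct_detection(pred_box, pred_label, gt_box, gt_label, iou_threshold=0.5):
    """A prediction counts only if the class matches AND the IoU is high enough."""
    return pred_label == gt_label and iou(pred_box, gt_box) > iou_threshold


# IoU thresholds 0.50, 0.55, ..., 0.95 (ten values).
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def average_over_thresholds(map_by_threshold):
    """Mean of the mAP values computed at each IoU threshold."""
    return sum(map_by_threshold[t] for t in IOU_THRESHOLDS) / len(IOU_THRESHOLDS)


gt_box, pred_box = (0, 0, 10, 10), (1, 1, 10, 10)  # IoU = 0.81
print(is_correct_detection(pred_box, "cat", gt_box, "cat", 0.5))  # True
print(is_correct_detection(pred_box, "cat", gt_box, "cat", 0.9))  # False
print(is_correct_detection(pred_box, "dog", gt_box, "cat", 0.5))  # False
```

The same detection is correct at a threshold of 0.5 but not at 0.9, which is why averaging over thresholds rewards tighter localization.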

Copyright notice
This article was written by [The romance of cherry blossoms]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/214/202208020511243728.html
