Important concepts in object detection: IoU, receptive field, dilated convolution, mAP
2022-08-02 06:32:00 【The romance of cherry blossoms】
1. Calculation of IoU
The performance of an object detection algorithm can be judged by how closely the predicted bounding box matches the ground-truth bounding box, in both position and overlapping area. IoU (Intersection over Union) is a standard measure of this on a given dataset. Its calculation is simple: the area of the intersection of the two boxes divided by the area of their union, as shown in the figure.
Where does the ground-truth box come from? Any training or test dataset for detection contains at least two kinds of data: the images themselves, and the coordinates of the objects to be detected in each image. Our goal is to build a detection algorithm and evaluate it on the test set. As a rule of thumb, an IoU above 0.5 is generally considered an acceptable detection, while an IoU below 0.5 indicates a poor one. An IoU around 0.7 means the predicted box lies very close to the ground-truth box, and an IoU around 0.9 means the algorithm localizes objects very well on the test set.
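The intersection-over-union rule described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `(x1, y1, x2, y2)` corner format and the function name `iou` are assumptions made for the example.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Width and height are clamped to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


ground_truth = (0, 0, 10, 10)   # hypothetical ground-truth box
prediction = (5, 0, 15, 10)     # hypothetical predicted box
print(iou(ground_truth, prediction))  # intersection 50, union 150
```

In this example the two boxes share half of their area, yet the IoU is only about 0.33, which shows how quickly the score drops as the prediction drifts away from the ground truth.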

2. Dilated (atrous) convolution
Common image segmentation algorithms use pooling layers and convolutional layers to enlarge the receptive field, which also shrinks the feature map; upsampling is then used to restore the original image size. Because shrinking and re-enlarging the feature map loses precision, we would like to enlarge the receptive field while keeping the feature map size unchanged, replacing the downsampling and upsampling operations. Dilated (atrous) convolution was introduced for this purpose.
Dilated convolution (also called atrous convolution, or literally "hole convolution" from the Chinese term) inserts holes into the standard convolution kernel to enlarge the receptive field. Compared with ordinary convolution, it adds one hyperparameter, the dilation rate, which defines the spacing between the kernel's values as it processes the data (a standard convolution has a dilation rate of 1).
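To make the role of the dilation rate concrete, here is a small pure-Python sketch of a 1D dilated convolution. The function names and the toy signal are assumptions for illustration; real frameworks implement the same idea in 2D with padding and strides.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a kernel once (dilation - 1) gaps sit between its taps."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1D cross-correlation whose kernel taps are `dilation` samples apart."""
    span = effective_kernel_size(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        # Tap j of the kernel reads the sample at offset j * dilation.
        out.append(sum(w * signal[start + j * dilation] for j, w in enumerate(kernel)))
    return out


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 1, 1]
print(dilated_conv1d(signal, kernel, dilation=1))  # each output sums 3 adjacent samples
print(dilated_conv1d(signal, kernel, dilation=2))  # each output sums samples 2 apart
```

With a dilation rate of 2, the same three weights cover a span of five samples, so the receptive field grows without adding any parameters.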

Dilated convolution serves two purposes.
1) Expand the receptive field: Deep networks usually downsample (for example with pooling layers) to enlarge the receptive field and reduce computation, but this also reduces spatial resolution. Dilated convolution enlarges the receptive field while largely preserving resolution, which is especially important in detection and segmentation tasks: a larger receptive field helps detect large objects, and a higher resolution helps localize objects precisely.
2) Capture multi-scale context information: Different dilation rates give different receptive fields, so multi-scale information can be obtained. Multi-scale information is very important in vision tasks.
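The effect of the dilation rate on the receptive field can be checked with a short calculation. The sketch below assumes a stack of stride-1 convolutions, where each layer adds `(kernel_size - 1) * dilation` to the receptive field; the function name and the example layer stacks are hypothetical.

```python
def receptive_field(layers):
    """Receptive field of stacked stride-1 convolutions given as (kernel_size, dilation) pairs."""
    rf = 1  # a single output position sees itself before any layer is applied
    for kernel_size, dilation in layers:
        rf += (kernel_size - 1) * dilation
    return rf


standard = [(3, 1), (3, 1), (3, 1)]   # three ordinary 3-tap layers
dilated = [(3, 1), (3, 2), (3, 4)]    # same layers with dilation rates 1, 2, 4
print(receptive_field(standard))  # 7
print(receptive_field(dilated))   # 15
```

Both stacks have the same number of weights and keep the feature map resolution, but the dilated stack sees roughly twice as far, and varying the rates per branch is one way to gather context at several scales.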
3. mAP
Suppose a classifier has 90% precision at 10% recall, but 96% precision at 20% recall. There is really no trade-off here: it makes more sense to use the classifier at 20% recall rather than 10%, since you get both higher recall and higher precision. So instead of looking at the precision at 10% recall, we should look at the maximum precision the classifier can offer with at least 10% recall, which is 96%, not 90%. One way to get a fair idea of the model's performance is therefore to compute the maximum precision achievable with at least 0% recall, then 10%, 20%, and so on up to 100%, and then take the average of these maximum precisions. This is called the Average Precision (AP) metric. When there are more than two classes, we can compute the AP for each class and then compute the mean AP (mAP).
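The averaging procedure just described can be sketched as follows. This is a minimal illustration under stated assumptions: the precision/recall curve is given as a list of `(recall, precision)` pairs, recall levels with no matching point contribute a precision of 0, and all function names and sample values are hypothetical.

```python
def max_precision_at_recall(pr_points, min_recall):
    """Highest precision among the points whose recall is at least `min_recall`."""
    candidates = [p for r, p in pr_points if r >= min_recall]
    return max(candidates) if candidates else 0.0


def average_precision(pr_points, num_levels=11):
    """Average of the maximum precisions at recall levels 0%, 10%, ..., 100%."""
    levels = [i / (num_levels - 1) for i in range(num_levels)]
    return sum(max_precision_at_recall(pr_points, level) for level in levels) / num_levels


def mean_average_precision(pr_points_by_class):
    """Mean of the per-class AP values."""
    aps = [average_precision(points) for points in pr_points_by_class.values()]
    return sum(aps) / len(aps)


# Hypothetical curve that includes the two points from the text.
curve = [(0.1, 0.90), (0.2, 0.96), (0.5, 0.80), (1.0, 0.50)]
print(max_precision_at_recall(curve, 0.1))  # 0.96, not 0.90
print(average_precision(curve))
print(mean_average_precision({"cat": curve, "dog": [(1.0, 1.0)]}))
```

The first printed value reproduces the point made above: at a recall of at least 10%, the best available precision is 96%, because the 20% recall operating point dominates the 10% one.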
In an object detection system there is an additional level of complexity: what if the system detects the correct class but at the wrong location (i.e., the bounding box does not contain the object at all)? We should certainly not count this as a positive prediction. One approach is to define an IoU threshold: for example, we may consider a prediction correct only if the IoU is greater than 0.5 and the predicted class is correct. The corresponding mAP is generally written mAP@0.5 (or mAP@50%, or sometimes just AP50). This is done in some competitions (such as the PASCAL VOC Challenge). In others (such as the COCO competition), the mAP is computed for different IoU thresholds (0.50, 0.55, 0.60, ..., 0.95), and the final metric is the mean of all these mAPs (written AP@[.50:.95] or AP@[.50:0.05:.95]). In other words, it is a mean of means.
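The thresholding rule and the COCO-style averaging can be sketched as below. The helper names are hypothetical, the strict `>` comparison follows the wording of the text (some evaluation tools may use `>=` instead), and the mAP values in the example are made up purely to show the arithmetic.

```python
def coco_iou_thresholds():
    """IoU thresholds 0.50, 0.55, ..., 0.95 (rounded to avoid float drift)."""
    return [round(0.50 + 0.05 * i, 2) for i in range(10)]


def is_true_positive(pred_class, true_class, iou_value, iou_threshold=0.5):
    """A prediction counts only if the class matches and the IoU exceeds the threshold."""
    return pred_class == true_class and iou_value > iou_threshold


def average_over_thresholds(map_by_threshold):
    """Final metric: the mean of the mAP values computed at each IoU threshold."""
    return sum(map_by_threshold.values()) / len(map_by_threshold)


thresholds = coco_iou_thresholds()

# A detection with the right class and IoU 0.72 passes only the looser thresholds.
passed = [t for t in thresholds if is_true_positive("cat", "cat", 0.72, t)]
print(passed)  # thresholds from 0.50 up to 0.70

# Made-up mAP values that decrease as the threshold gets stricter.
fake_map = {t: 1.0 - t for t in thresholds}
print(average_over_thresholds(fake_map))
```

Because a detection has to clear every threshold separately, a model with loosely fitting boxes can score well on mAP@0.5 and still score poorly on the averaged metric.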