Understanding the basic concepts of object detection
2022-06-10 21:38:00 【The mountain of ignorance, the valley of despair, the slope of 】
Definition of object detection
First, what is classification, and what is regression?
Classification and regression are both supervised learning tasks: each predicts an output for the input data.
The output of classification is discrete: it is the category the object belongs to, such as cat or dog.
The output of regression is continuous: it is a numeric value for the object, within some range.
A highly upvoted answer [1] puts it this way: continuous versus discrete is only the surface form; the essential difference is whether the output labels have a distance metric.
Classification has no distance metric: misclassifying class 1 as class 2 is no different from misclassifying class 1 as class 3.
Regression has a distance metric: if the true price of a cola is 5 yuan, a prediction of 4 yuan has an error of 1 yuan, and a prediction of 2 yuan has an error of 3 yuan.
In addition, the goal of classification is to find a decision boundary (a decision surface) that separates the data in the dataset into classes, for example deciding whether the animal in a picture is a cat or a dog.
The goal of regression is to find the best fit: an optimal fitted line that lies as close as possible to every point in the dataset, as in forecasting stock prices or house prices.
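The distance-metric distinction can be illustrated with a minimal sketch. The function names `zero_one_error` and `absolute_error` are hypothetical, chosen here only for illustration; they are one simple way to score each kind of prediction, not the only one.

```python
def zero_one_error(true_label, predicted_label):
    """Classification error: every wrong class counts the same (no distance)."""
    return 0 if predicted_label == true_label else 1


def absolute_error(true_value, predicted_value):
    """Regression error: how far the prediction is from the true value."""
    return abs(true_value - predicted_value)


# Classifying class 1 as 2 or as 3 is equally wrong.
print(zero_one_error(1, 2), zero_one_error(1, 3))

# True cola price is 5 yuan: predicting 4 is closer than predicting 2.
print(absolute_error(5, 4), absolute_error(5, 2))
```

The first line prints the same error for both misclassifications, while the second prints errors of 1 and 3, matching the cola example.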
Understanding image classification, object detection, and image segmentation
Image classification: the input image usually contains a single object, and the goal is to determine what object each image shows. It is an image-level task, relatively simple, and has developed the fastest.
Object detection: the input image usually contains multiple objects, and the goal is to determine the location and category of each object. It is one of the core tasks in computer vision.
Image segmentation: the input is similar to that of object detection, and the goal is to determine which category each pixel belongs to, which makes it pixel-level classification. Image segmentation and object detection are closely related, and their models can borrow from each other.
Bounding box location
There are three common formats for representing the location of a bounding box:
xyxy, i.e. (x1, y1, x2, y2), where (x1, y1) is the top-left corner of the bounding box and (x2, y2) is the bottom-right corner;
xywh, i.e. (x, y, w, h), where (x, y) is the top-left corner of the bounding box, w is the width of the rectangle, and h is its height;
cxcywh, i.e. (cx, cy, w, h), where (cx, cy) is the center point of the bounding box, w is the width of the rectangle, and h is its height.
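The three formats carry the same information, so converting between them is simple arithmetic. The sketch below assumes the usual image coordinate system (origin at the top-left, y increasing downward) and represents boxes as plain tuples; the function names are hypothetical.

```python
def xywh_to_xyxy(box):
    """(x, y, w, h) with top-left corner -> (x1, y1, x2, y2)."""
    x, y, w, h = box
    return (x, y, x + w, y + h)


def cxcywh_to_xyxy(box):
    """(cx, cy, w, h) with center point -> (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def xyxy_to_xywh(box):
    """(x1, y1, x2, y2) -> (x, y, w, h) with top-left corner."""
    x1, y1, x2, y2 = box
    return (x1, y1, x2 - x1, y2 - y1)


def xyxy_to_cxcywh(box):
    """(x1, y1, x2, y2) -> (cx, cy, w, h) with center point."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1)


box_xyxy = (10, 20, 50, 80)
print(xyxy_to_xywh(box_xyxy))    # top-left corner plus width and height
print(xyxy_to_cxcywh(box_xyxy))  # center point plus width and height
```

Converting a box to another format and back should return the original coordinates, which is a convenient sanity check when mixing datasets that use different conventions.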
In a detection task, the labels of the training dataset give the true bounding box of each target object as (x1, y1, x2, y2); such a bounding box is called a ground truth box. The trained model predicts where target objects may be, and a bounding box predicted by the model is called a prediction box. To complete a detection task, we usually want the model to take an input image and output a set of predicted bounding boxes together with the category of the object in each box, or the probability that it belongs to a given category, for example in the format [L, P, x1, y1, x2, y2], where L is the category label and P is the probability that the object belongs to that category. A single input image may produce multiple prediction boxes. The loss function is defined by computing a loss value between the prediction boxes and the ground truth boxes.
Understanding NMS (non-maximum suppression)
Reference: https://zhuanlan.zhihu.com/p/80318430
1. Start with the first category, dog: for every bounding box (bb) whose dog score is below thresh1 (0.3), set the score to 0.

2. Then sort all the bbs by their current dog score.

3. After sorting, take the bb with the current highest score, 0.7, which is bb98 (marked with a red arrow in the original figure). To describe the process more clearly, we pull out the dog score row on its own and then compute the IoU between bb98 and each of the remaining bbs.

In fact, we set not only a score threshold but also an IoU threshold: any bb whose IoU with the selected bb is above the IoU threshold is deleted.
4. Once the IoU between bb98 and each remaining bb has been computed, we can delete some of the bbs (set their scores to zero). From the bbs that have not been deleted, we select the current maximum score, 0.4, which corresponds to bb1, and then compute the IoU between bb1 and the rest of the bbs.

After that, we get a new list of scores and again take the box corresponding to the maximum value.
5. Once the dog category is done, we handle the next category, for example bike, with the same process. After doing this for every class, most of the bbs have been deleted, and we draw the boxes for the bbs that remain.
Note that NMS is usually applied in the test phase.
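The steps above can be sketched for a single category as follows. This is a minimal illustration, not the referenced article's implementation: boxes are assumed to be in xyxy format, the function names `iou` and `nms` are hypothetical, and the default `iou_thresh=0.5` is an assumed value (the text only gives 0.3 for the score threshold).

```python
def iou(a, b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) format."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, score_thresh=0.3, iou_thresh=0.5):
    """Return indices of the boxes kept for one category.

    iou_thresh=0.5 is an assumed default, not a value from the text.
    """
    # Step 1: drop boxes whose score is below the score threshold.
    candidates = [i for i, s in enumerate(scores) if s >= score_thresh]
    # Step 2: sort the remaining boxes by score, highest first.
    candidates.sort(key=lambda i: scores[i], reverse=True)
    keep = []
    while candidates:
        # Steps 3-4: take the highest-scoring box, then delete every
        # remaining box whose IoU with it is above the IoU threshold.
        best = candidates.pop(0)
        keep.append(best)
        candidates = [i for i in candidates
                      if iou(boxes[best], boxes[i]) <= iou_thresh]
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 9, 9)]
scores = [0.7, 0.6, 0.4, 0.2]
print(nms(boxes, scores))
```

In this example the second box overlaps the highest-scoring box heavily and is suppressed, the last box falls below the score threshold, and the third box does not overlap anything, so the boxes at indices 0 and 2 are kept. For step 5, the same function would be called once per category on that category's scores.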