Understanding the key concepts of object detection

2022-06-10 21:38:00 【The mountain of ignorance, the valley of despair, the slope of 】

Definition of object detection

First, what is classification, and what is regression?
Classification and regression are both supervised learning tasks: each predicts an output for the input data.
The output of classification is discrete: it is the category the object belongs to, such as cat or dog.
The output of regression is continuous: it is a numeric value associated with the object, within some range.
A highly upvoted answer [1] puts it this way: continuous versus discrete is only the surface form; the essential difference is whether the output labels have a distance metric.
Classification has no distance metric: misclassifying 1 as 2 is no different from misclassifying 1 as 3.
Regression does have one: if the true price of a cola is 5 yuan, a prediction of 4 yuan has an error of 1 yuan, and a prediction of 2 yuan has an error of 3 yuan.
In addition, the goal of classification is to find a decision boundary (a decision surface) that separates the data in the data set into classes, for example deciding whether the animal in a picture is a cat or a dog.
The goal of regression is to find the best fit: an optimal fitted line that lies as close as possible to every point in the data set, as in forecasting stock prices or house prices.
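The distance argument above can be sketched in a few lines of Python. This is only an illustration: the function names zero_one_loss and absolute_error are chosen here for the example and do not come from the original post, and absolute error is just one possible distance measure for regression.

```python
def zero_one_loss(true_label, predicted_label):
    """Classification error: every wrong label costs the same, whatever it is."""
    return 0 if true_label == predicted_label else 1


def absolute_error(true_value, predicted_value):
    """Regression error: the cost grows with the distance from the true value."""
    return abs(true_value - predicted_value)


# Misclassifying class 1 as 2 or as 3 is equally wrong.
print(zero_one_loss(1, 2), zero_one_loss(1, 3))

# The cola costs 5 yuan: predicting 4 is closer than predicting 2.
print(absolute_error(5, 4), absolute_error(5, 2))
```

The first line prints the same loss for both mistakes, while the second prints a larger error for the prediction that is further from the true price.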

Understanding image classification, object detection, and image segmentation

Image classification: the input image usually contains a single object, and the goal is to determine what that object is. It is an image-level task, relatively simple, and the area that has developed fastest.
Object detection: the input image usually contains multiple objects, and the goal is to determine the location and category of each object. It is a core task in computer vision.
Image segmentation: the input is similar to that of object detection, and the goal is to determine which category each pixel belongs to, which makes it pixel-level classification. Segmentation and detection are closely related, and models for one task can borrow ideas from the other.

Bounding box location

Three formats are commonly used to represent the location of a bounding box:
xyxy, i.e. (x1, y1, x2, y2), where (x1, y1) is the top-left corner of the bounding box and (x2, y2) is the bottom-right corner;
xywh, i.e. (x, y, w, h), where (x, y) is the top-left corner of the bounding box, w is the width of the rectangle, and h is its height;
cxcywh, i.e. (cx, cy, w, h), where (cx, cy) is the center point of the bounding box, w is the width of the rectangle, and h is its height.
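The three formats carry the same information, so converting between them is simple arithmetic. The helper functions below are a minimal sketch written for this explanation (their names are hypothetical). They assume continuous coordinates, so the width is x2 - x1 with no "+1" pixel convention.

```python
def xyxy_to_xywh(box):
    """(x1, y1, x2, y2) -> (x, y, w, h), keeping the top-left corner."""
    x1, y1, x2, y2 = box
    return (x1, y1, x2 - x1, y2 - y1)


def xywh_to_cxcywh(box):
    """(x, y, w, h) -> (cx, cy, w, h), moving the anchor to the center."""
    x, y, w, h = box
    return (x + w / 2, y + h / 2, w, h)


def cxcywh_to_xyxy(box):
    """(cx, cy, w, h) -> (x1, y1, x2, y2), recovering both corners."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


box_xyxy = (10, 20, 50, 80)
box_xywh = xyxy_to_xywh(box_xyxy)        # (10, 20, 40, 60)
box_cxcywh = xywh_to_cxcywh(box_xywh)    # (30.0, 50.0, 40, 60)
round_trip = cxcywh_to_xyxy(box_cxcywh)  # (10.0, 20.0, 50.0, 80.0)
print(box_xywh, box_cxcywh, round_trip)
```

Converting through all three formats returns the original corners, which is a quick sanity check when a data set and a model disagree about which format they use.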

In a detection task, the labels of the training data set give the true bounding box (x1, y1, x2, y2) of each target object; such a box is called a ground truth box. The trained model predicts where target objects may be, and a bounding box predicted by the model is called a prediction box. To complete a detection task, we usually want the model to take an input image and output a set of predicted bounding boxes, along with the category of the object in each box or the probability that it belongs to a given category, for example in the format [L, P, x1, y1, x2, y2], where L is the category label and P is the probability that the object belongs to that category. A single input image may produce multiple prediction boxes. The loss function is defined by computing a loss value between the prediction boxes and the ground truth boxes.
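As a rough illustration of the [L, P, x1, y1, x2, y2] format, the sketch below stores each prediction as a named tuple. The Detection class, the sample rows, and the l1_box_loss function are assumptions made for this example. In particular, the sum of absolute coordinate differences is only one simple way to measure how far a prediction box is from a ground truth box, not the loss used by any specific detector.

```python
from typing import NamedTuple


class Detection(NamedTuple):
    label: str   # L: category label
    prob: float  # P: probability that the object belongs to that category
    x1: float    # top-left corner
    y1: float
    x2: float    # bottom-right corner
    y2: float


def l1_box_loss(pred, gt_box):
    """Sum of absolute coordinate differences (an illustrative choice only)."""
    pred_box = (pred.x1, pred.y1, pred.x2, pred.y2)
    return sum(abs(p - g) for p, g in zip(pred_box, gt_box))


# Hypothetical model output for one image: several prediction boxes.
rows = [["dog", 0.7, 12, 18, 50, 85], ["bike", 0.4, 0, 0, 30, 30]]
detections = [Detection(*row) for row in rows]

gt_dog = (10, 20, 50, 80)  # ground truth box for the dog, in xyxy format
print(l1_box_loss(detections[0], gt_dog))
```

Keeping the label, probability, and coordinates in one record makes it easy to sort or filter predictions by probability later on.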

Understanding NMS (non-maximum suppression)

Reference: https://zhuanlan.zhihu.com/p/80318430

1. Start with the first category, dog: for every bounding box (bb) whose dog score is below thresh1 (0.3), set that score to 0.

2. Then sort all bbs by their current dog score.

3. After sorting, the highest current score is 0.7, which belongs to bb98 (marked with a red arrow in the original figure). To describe the process more clearly, take the row of dog scores on its own, then compute the IoU between bb98 and each of the remaining bbs.
In fact, we set not only a score threshold but also an IoU threshold: any bb whose IoU with bb98 is above the IoU threshold is deleted.

4. Once the IoU values between bb98 and the remaining bbs have been computed, we can delete some of the bbs (set their scores to zero). From the bbs that have not been deleted, we then pick the current maximum score, 0.4, which belongs to bb1, and compute the IoU between bb1 and the remaining bbs.

Repeating this until every bb has been either kept or zeroed yields the final score list for the class; in each round, the box that is kept is the one with the current maximum score.

5. Once the dog category is done, move on to the next category, for example bike, and repeat the same process for every class. This deletes most of the bbs; for those that remain, draw the corresponding boxes.

Note that NMS is usually applied at the test (inference) stage.
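The per-class procedure described above can be sketched in plain Python as follows. This is a minimal illustration, not the implementation from the referenced article: the function names are made up for the example, boxes are assumed to be in (x1, y1, x2, y2) format, the score threshold 0.3 comes from the text, and the IoU threshold 0.5 is an assumed value because the text does not give one.

```python
def iou(a, b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms_single_class(boxes, scores, score_thresh=0.3, iou_thresh=0.5):
    """Return a copy of scores in which suppressed boxes have score 0."""
    # Step 1: zero out scores below the score threshold.
    scores = [s if s >= score_thresh else 0.0 for s in scores]
    # Step 2: visit boxes from the highest score to the lowest.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    for pos, i in enumerate(order):
        if scores[i] == 0.0:
            continue  # already suppressed, so it cannot suppress others
        # Steps 3-4: zero every lower-scored box that overlaps box i too much.
        for j in order[pos + 1:]:
            if scores[j] > 0.0 and iou(boxes[i], boxes[j]) > iou_thresh:
                scores[j] = 0.0
    return scores


# Hypothetical dog scores for four boxes.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 9, 9)]
dog_scores = [0.7, 0.6, 0.4, 0.2]
kept = nms_single_class(boxes, dog_scores)
print(kept)
```

Here the second box overlaps the first heavily and is suppressed, the third box is far away and survives, and the fourth is removed by the score threshold. To handle several categories, the same function would be called once per class with that class's scores, as in step 5.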

Copyright notice
This article was written by [The mountain of ignorance, the valley of despair, the slope of ]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/161/202206102025582004.html
