【CV-Learning】Object Detection & Instance Segmentation

2022-08-04 06:06:00 Xiao Liang has to work hard

Object Detection

Single-Object Detection

Training approach: training generally proceeds in three stages. First train the classification branch (usually an off-the-shelf model is used), then train the localization branch, and finally train classification and localization jointly.
P.S. In object detection, networks are generally not trained from scratch; a model pretrained on ImageNet is used instead.
Multi-task loss: the goal of training is to reduce the total loss, so the softmax loss and the L2 loss are driven down together. Each of the two can also be given its own weight; changing the weights adjusts how much the softmax loss and the L2 loss each contribute to the total loss.
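
As a rough illustration of the weighted multi-task loss described above, here is a minimal pure-Python sketch. The function names, the `(x, y, w, h)` box layout, and the weights `w_cls` and `w_loc` are assumptions made for this example; they do not come from the original notes or from any particular library.

```python
import math

def softmax_cross_entropy(logits, label):
    """Softmax loss for one example: -log(softmax(logits)[label])."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def l2_box_loss(pred_box, true_box):
    """Sum of squared differences between predicted and true (x, y, w, h)."""
    return sum((p - t) ** 2 for p, t in zip(pred_box, true_box))

def multitask_loss(logits, label, pred_box, true_box, w_cls=1.0, w_loc=1.0):
    """Weighted sum of the classification loss and the localization loss."""
    return (w_cls * softmax_cross_entropy(logits, label)
            + w_loc * l2_box_loss(pred_box, true_box))

# Hypothetical values for one training example.
logits = [2.0, 0.5, -1.0]
pred_box, true_box = [0.4, 0.4, 0.2, 0.3], [0.5, 0.5, 0.2, 0.2]
loss_equal = multitask_loss(logits, 0, pred_box, true_box)
loss_loc_heavy = multitask_loss(logits, 0, pred_box, true_box, w_loc=10.0)
```

Raising `w_loc` makes the localization error a larger share of the total, so the optimizer spends more effort on the box than on the class label.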
Pose estimation: annotate keypoints on the human body, then train the network by comparing its predicted keypoints against the annotated ground truth.

Multi-Object Detection

Observation: the output layout of a neural network is fixed in advance. In multi-object detection the number of objects is not known beforehand, so the output dimension is undefined and no ground-truth box targets of a fixed shape can be set up. The single-object training recipe therefore cannot express multiple detections, and training cannot proceed.

Sliding Window

Idea: feed every possible region of the image to a convolutional neural network for classification, and keep only the windows that are classified as an object.
Note: this is feasible only when the classifier is fast enough, for example exhaustive search with AdaBoost in face detection.
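
The exhaustive enumeration described above can be sketched as follows. This is only an illustration: `toy_classifier` is a hypothetical stand-in for a real (fast) classifier, and the window sizes and stride are arbitrary values chosen for the example.

```python
def sliding_windows(img_w, img_h, win_sizes, stride):
    """Yield (x, y, w, h) for every window that fits inside the image."""
    for w, h in win_sizes:
        for y in range(0, img_h - h + 1, stride):
            for x in range(0, img_w - w + 1, stride):
                yield (x, y, w, h)

def detect(img_w, img_h, win_sizes, stride, classify):
    """Keep only the windows that the classifier labels as an object."""
    hits = []
    for box in sliding_windows(img_w, img_h, win_sizes, stride):
        label = classify(box)
        if label is not None:  # None means "background"
            hits.append((box, label))
    return hits

# Hypothetical stand-in classifier: "finds" a face at one fixed location.
def toy_classifier(box):
    return "face" if box == (8, 4, 8, 8) else None

windows = list(sliding_windows(32, 16, [(8, 8), (16, 16)], stride=4))
hits = detect(32, 16, [(8, 8), (16, 16)], 4, toy_classifier)
```

Even this tiny 32 x 16 image with two window sizes produces 26 windows, each of which costs one classifier call; on real images with many sizes and a small stride the count grows very quickly, which is why the classifier has to be cheap.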

R-CNN

Motivation: classifying every region of an image exhaustively with a neural network requires far too much computation. R-CNN proposes a different idea: first generate a set of candidate regions from the image and then classify only those, rather than enumerating all regions. Example region proposal method: selective search.
Approach
1. Use a region proposal method to generate regions of interest. (Saved to disk.)
2. Resize each region.
3. Feed each image region into a convolutional network (ResNet can be used directly) for feature extraction. (Saved to disk.)
4. Classify the regions with support vector machines and, at the same time, run bounding-box regression (to learn corrections).
Bounding-box regression (Bbox reg): the regions produced by the region proposal method may be inaccurate and fit the object poorly. Bounding-box regression corrects the deviation between a proposed region and the true region.
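
To make the bounding-box correction concrete, the sketch below uses the center/size offset parameterization commonly associated with the R-CNN family. The notes above do not spell out the formula, so treat this parameterization, the `(cx, cy, w, h)` box layout, and the function names `encode` and `decode` as assumptions made for illustration.

```python
import math

def encode(proposal, gt):
    """Regression targets (tx, ty, tw, th) that map a proposal onto the
    ground-truth box. Boxes are (cx, cy, w, h)."""
    px, py, pw, ph = proposal
    gx, gy, gw, gh = gt
    return ((gx - px) / pw, (gy - py) / ph,
            math.log(gw / pw), math.log(gh / ph))

def decode(proposal, deltas):
    """Apply predicted offsets to a proposal to get the corrected box."""
    px, py, pw, ph = proposal
    tx, ty, tw, th = deltas
    return (px + tx * pw, py + ty * ph,
            pw * math.exp(tw), ph * math.exp(th))

# Hypothetical proposal and ground-truth box.
proposal = (50.0, 50.0, 20.0, 40.0)
ground_truth = (55.0, 48.0, 30.0, 40.0)
targets = encode(proposal, ground_truth)  # what the regressor learns to predict
corrected = decode(proposal, targets)     # recovers the ground-truth box
```

Expressing the shift relative to the proposal's width and height, and the size change as a log ratio, keeps the targets in a similar numeric range for small and large boxes.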
Problem: it is too computationally inefficient to be used in practice.

Fast R-CNN

Approach
1. Extract features from the full image with a convolutional network.
2. Generate regions of interest with a region proposal method.
3. Crop and resize the regions of interest (on the feature map).
4. Classify with a fully connected neural network.
Improvements
1. Features are extracted first and the region proposals are applied afterwards: if the proposals come first and features are then extracted for each region, the computation is much heavier.
2. A fully connected neural network is used.
3. Features are cropped and resized (RoI Pool).

Region Cropping (RoI Pool)

Approach
1. Project the candidate region onto the feature map.
2. Snap the region's vertices to grid intersections (the snapped region is slightly misaligned).
3. Divide it into n*n sub-regions of roughly equal size (n is determined by the desired output feature-map size).
4. Max-pool each sub-region.
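
The four steps above can be sketched in pure Python on a single-channel feature map. This is a simplified illustration, not a reference implementation: the function name `roi_pool`, the `(x1, y1, x2, y2)` pixel box layout, the `stride` parameter (image pixels per feature-map cell), and the exact rounding rule are all assumptions made for this example.

```python
import math

def roi_pool(feature, roi, stride, n):
    """feature: 2D list (H x W); roi: (x1, y1, x2, y2) in image pixels.
    Returns an n x n grid of max-pooled values."""
    height, width = len(feature), len(feature[0])

    def snap(v, upper):
        # Steps 1-2: project onto the feature map, then round to a whole cell.
        return min(max(int(round(v / stride)), 0), upper)

    x1, y1 = snap(roi[0], width - 1), snap(roi[1], height - 1)
    x2, y2 = snap(roi[2], width - 1), snap(roi[3], height - 1)
    roi_w, roi_h = max(x2 - x1 + 1, 1), max(y2 - y1 + 1, 1)

    out = []
    for i in range(n):
        # Step 3: roughly equal bins; floor/ceil keep every bin non-empty.
        ys = y1 + math.floor(i * roi_h / n)
        ye = y1 + math.ceil((i + 1) * roi_h / n)
        row = []
        for j in range(n):
            xs = x1 + math.floor(j * roi_w / n)
            xe = x1 + math.ceil((j + 1) * roi_w / n)
            # Step 4: max over all cells in the bin.
            row.append(max(feature[y][x]
                           for y in range(ys, ye) for x in range(xs, xe)))
        out.append(row)
    return out

# Toy 6 x 6 feature map whose value at (x, y) is 6*y + x.
feature = [[y * 6 + x for x in range(6)] for y in range(6)]
pooled = roi_pool(feature, (20, 20, 84, 60), stride=16, n=2)
```

With a stride of 16 the box corners land at 1.25, 1.25, 5.25 and 3.75 on the feature map and are rounded to whole cells, which is the slight misalignment mentioned in step 2.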

Region Cropping (RoI Align)

Approach
1. Project the candidate region onto the feature map.
2. Unlike RoI Pool, do not snap the region's vertices to grid intersections (no rounding is performed).
3. In each sub-region, take four regularly spaced sample points, and compute the value at each point by bilinear interpolation from the four surrounding grid points (grid points at different distances get different weights).
4. Max-pool each sub-region.
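
A minimal sketch of these steps on a single-channel feature map follows. It assumes feature values sit at integer grid coordinates, places the four sample points at the quarter positions of each bin, and aggregates with `max` as in the notes (an average is also a common choice). The names `bilinear` and `roi_align` and the `stride` parameter are illustrative assumptions.

```python
import math

def bilinear(feature, x, y):
    """Interpolate a 2D list at continuous coordinates (x, y)."""
    h, w = len(feature), len(feature[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    dx, dy = x - x0, y - y0  # nearer grid points receive larger weights
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy

def roi_align(feature, roi, stride, n, pool=max):
    """roi: (x1, y1, x2, y2) in image pixels; coordinates are never rounded."""
    x1, y1, x2, y2 = (v / stride for v in roi)
    bin_w, bin_h = (x2 - x1) / n, (y2 - y1) / n
    out = []
    for i in range(n):
        row = []
        for j in range(n):
            # Four regularly spaced sample points inside bin (i, j).
            samples = [bilinear(feature,
                                x1 + (j + fx) * bin_w,
                                y1 + (i + fy) * bin_h)
                       for fy in (0.25, 0.75) for fx in (0.25, 0.75)]
            row.append(pool(samples))
        out.append(row)
    return out

# Toy 6 x 6 feature map whose value at (x, y) is 6*y + x.
feature = [[y * 6 + x for x in range(6)] for y in range(6)]
aligned = roi_align(feature, (20, 20, 84, 60), stride=16, n=2)
```

Because the box keeps its fractional coordinates, the sampled values follow the true region instead of the nearest whole cells, which matters most for pixel-level outputs such as masks.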

R-CNN vs Fast R-CNN

Problem: the comparison shows that candidate region generation (region proposal) takes too long; its cost is almost equal to the total detection time for a single image.

Faster R-CNN

Breakthrough: let the convolutional neural network generate the candidate regions itself.
RPN
1. Feed the features extracted from the whole image by the convolutional network into the RPN.
2. The resulting regions are passed to the fully connected neural network, which decides what object each one contains.
Joint training losses
1. RPN classification loss (object / not object)
2. RPN bounding-box coordinate regression loss
3. Candidate region classification loss
4. Final bounding-box coordinate regression loss
Approach: a two-stage object detector.

Region Proposal Network

Background: classical detection methods are all slow at generating detection boxes. For example, OpenCV's AdaBoost detector uses a sliding window plus an image pyramid, and R-CNN uses Selective Search. Faster R-CNN generates detection boxes directly with the RPN, which is its major advantage and greatly speeds up box generation.
Anchor: choose an anchor point, then determine whether the region centered on that anchor contains an object of some category.
Given an anchor, a regression is performed that returns an offset, which corrects the region to make it more accurate.
In practice, for each location on the feature map we usually use k anchor regions of different sizes and aspect ratios, so k possibilities are predicted at each anchor point. This increases the predictive power of each location.
Sort the k * 20 * 15 boxes by score and take the top 300 as the candidate regions.
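
The anchor layout and the top-300 selection described above can be sketched as follows. The 20 x 15 feature map and the cutoff of 300 come from the notes; the stride of 16, the three scales and three aspect ratios (so k = 9), and the random scores standing in for the RPN's outputs are assumptions made for this example. Real pipelines typically also refine the boxes and apply non-maximum suppression, which is omitted here.

```python
import random

def make_anchors(feat_w, feat_h, stride, scales, ratios):
    """Return (cx, cy, w, h) anchors: len(scales) * len(ratios) per location."""
    anchors = []
    for gy in range(feat_h):
        for gx in range(feat_w):
            # Anchor point: center of this feature-map cell in image pixels.
            cx, cy = (gx + 0.5) * stride, (gy + 0.5) * stride
            for s in scales:
                for r in ratios:  # r = height / width; area stays s * s
                    anchors.append((cx, cy, s / r ** 0.5, s * r ** 0.5))
    return anchors

def top_n(anchors, scores, n):
    """Keep the n anchors with the highest scores."""
    order = sorted(range(len(anchors)), key=lambda i: scores[i], reverse=True)
    return [anchors[i] for i in order[:n]]

anchors = make_anchors(20, 15, 16, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0))
random.seed(0)
scores = [random.random() for _ in anchors]  # stand-in for RPN object scores
proposals = top_n(anchors, scores, 300)
```

With k = 9 this gives 9 * 20 * 15 = 2700 anchors, of which only the 300 highest-scoring ones are passed on to the second stage.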

One-Stage Object Detectors

1. YOLO: no region proposals are made.
2. SSD: classification is performed at every layer. Each layer uses anchors, and the features from all layers are combined so that the final prediction uses multi-layer features.

Factors Affecting Detection Accuracy

(Figure: factors that affect detection accuracy.)

Instance Segmentation

Add a mask prediction branch on top of Faster R-CNN.
The mask is obtained by convolution followed by upsampling.
(Figure: instance segmentation results.)

Pose estimation: keypoints can be detected by regression after the first convolution.

Good implementations on GitHub!

TensorFlow Detection API: Faster R-CNN, SSD, R-FCN, Mask R-CNN

Copyright notice
This article was written by [Xiao Liang has to work hard]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/216/202208040525356975.html