[Mask R-CNN] Object detection and recognition based on Mask R-CNN
2022-06-30 06:43:00 【FPGA and MATLAB】
Mask R-CNN is another landmark work from Kaiming He, following Faster R-CNN. It integrates object detection and instance segmentation, and it outperforms Faster R-CNN. Its basic structure is as follows:

Mask R-CNN is an instance segmentation model. It determines the position and category of each target in an image and gives a pixel-level prediction. "Instance segmentation" means segmenting every object of interest in the scene separately, whether or not the objects belong to the same category. For example, the model can pick out individual targets such as vehicles and people in street-view video. The picture below shows a Mask R-CNN trained on the COCO dataset. As the figure shows, for objects as large as a car or as small as a single banana, it marks each object with a window and with the pixel positions the object occupies.
Unlike a classical object detection model such as Faster R-CNN, a distinguishing feature of Mask R-CNN is that it can shade the pixels that belong to the object inside each window. Some people may consider this a feature of marginal value, but it matters a great deal for the control of self-driving cars and robots:
Shading helps the car identify the exact pixel position of each target on the road, so that it can avoid collisions;
If a robot wants to grasp a target object, it needs to know the object's position (as with Amazon's unmanned aerial vehicles).
If you just want to train a Mask R-CNN model on COCO, the easiest way is to use the TensorFlow Object Detection API. The details are available on GitHub and are not repeated here.
How Mask R-CNN works
Before building a Mask R-CNN model, let's first understand how it works.
In fact, Mask R-CNN is a combination of Faster R-CNN and FCN. The former is responsible for object detection (class label + window), and the latter is responsible for determining the object's outline. As shown in the figure below:
The idea is simple: for each target object, Faster R-CNN has two outputs, a class label and a candidate window. To segment the object's pixels, we add a third output to these two: a binary mask that indicates which pixels inside the window belong to the object. Unlike the first two outputs, this new output requires a much finer spatial layout, so Mask R-CNN adds a branch network on top of Faster R-CNN: a Fully Convolutional Network (FCN).
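To make the third output concrete, here is a minimal sketch (not the author's or MathWorks' implementation) of how a small binary mask predicted for one candidate window could be placed back into a full-size image mask. The image size, the box coordinates, and the 4-by-4 mask values are made-up illustration data; the [x y width height] box layout is assumed.

```matlab
% Hypothetical example data: a 6x8 image and one detected window.
imageSize = [6 8];            % [rows cols]
box = [3 2 4 4];              % [x y width height], 1-based pixel coordinates

% Hypothetical binary mask predicted for the window (true = object pixel).
boxMask = logical([0 1 1 0;
                   1 1 1 1;
                   1 1 1 1;
                   0 1 1 0]);

% Paste the window mask into an all-false mask of the full image size.
fullMask = false(imageSize);
rows = box(2):(box(2) + box(4) - 1);
cols = box(1):(box(1) + box(3) - 1);
fullMask(rows, cols) = boxMask;

% Number of pixels assigned to the object.
objectPixels = nnz(fullMask);
disp(fullMask)
fprintf('Object covers %d of %d pixels\n', objectPixels, numel(fullMask));
```

In the real network the mask branch predicts at a fixed low resolution and the result is resized to the window before this step; the sketch skips the resizing to keep the indexing visible.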
FCN is a popular semantic segmentation algorithm. Semantic segmentation means that the machine automatically separates object regions from the image and identifies what they contain. The model first reduces the input image to 1/32 of its original size through convolution and max-pooling layers, then makes a class prediction at this level of granularity. Finally, it uses upsampling and deconvolution layers to restore the map to the original image size.
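The 1/32 factor is what five stacked stride-2 downsampling stages produce (2^5 = 32). Below is a small sketch of the size arithmetic. The 224-by-224 input and the count of five stages are assumptions chosen for illustration (the text above does not state them), and nearest-neighbour block repetition stands in for the learned upsampling and deconvolution layers.

```matlab
inputSize = [224 224];     % assumed input resolution
numStages = 5;             % assumed number of stride-2 downsampling stages

featSize = inputSize;
for k = 1:numStages
    featSize = floor(featSize / 2);   % each stage halves height and width
end
factor = 2^numStages;                 % overall downsampling factor
fprintf('Feature map: %d x %d (1/%d of the input)\n', ...
    featSize(1), featSize(2), factor);

% Placeholder class labels predicted at the reduced resolution.
coarseLabels = randi(3, featSize);

% Restore the input size: kron repeats each cell as a factor-by-factor block.
fullLabels = kron(coarseLabels, ones(factor));
fprintf('Restored map: %d x %d\n', size(fullLabels, 1), size(fullLabels, 2));
```

Every coarse cell decides the label of a 32-by-32 block of input pixels here, which is why a learned upsampling path is needed to recover sharp outlines.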
In short, Mask R-CNN combines two networks, Faster R-CNN and FCN, into a single large architecture. The model's loss function is the total of the classification loss, the window generation loss, and the mask generation loss.
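Written out, the total loss is a sum of three terms. The symbols follow the notation of the Mask R-CNN paper rather than anything defined in this post:

```latex
L = L_{cls} + L_{box} + L_{mask}
```

Here L_cls is the classification loss, L_box is the window (bounding-box) regression loss, and L_mask is the mask loss, which the paper computes as an average per-pixel binary cross-entropy.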
Because Mask R-CNN takes a long time to train, we test with a pretrained Mask R-CNN provided by MATLAB, which can be downloaded from
https://www.mathworks.com/supportfiles/vision/data/maskrcnn_pretrained_person_car.mat
This is a Mask R-CNN model trained to recognize vehicles and pedestrians. The trained model is used as follows:

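% Load the pretrained network.
% Assumption: the downloaded .mat file stores the network in a variable named net.
pretrained = load('maskrcnn_pretrained_person_car.mat');
net = pretrained.net;
% Read a test image. 'test.jpg' is a placeholder name; use your own RGB image.
img = imread('test.jpg');
% Resize so that the longer side matches the target size, keeping the aspect
% ratio (NaN lets imresize compute the other side).
targetSize = [700 700 3];
imgSize = size(img);
[~, maxDim] = max(imgSize(1:2));
resizeSize = [NaN NaN];
resizeSize(maxDim) = targetSize(maxDim);
img = imresize(img, resizeSize);
% Network configuration: two object classes plus background.
% createMaskRCNNConfig, detectMaskRCNN and helper.extractMaskNetwork are not
% built-in functions; they are assumed to be helper files on the MATLAB path.
trainSize = [800 800 3];
classNames = {'person','car','background'};
numClasses = 2;
params = createMaskRCNNConfig(trainSize, numClasses, classNames);
Envs = "cpu";   % execution environment
maskSubnet = helper.extractMaskNetwork(net);
% Run Mask R-CNN detection.
[boxes, scores, labels, masks] = detectMaskRCNN(net, maskSubnet, img, params, Envs);
% Overlay the predicted masks on the image if anything was detected.
if isempty(masks)
    overlayedImage = img;
else
    overlayedImage = insertObjectMask(img, masks);
end
figure
imshow(overlayedImage)
% Draw the windows and class labels on top of the image.
showShape("rectangle", gather(boxes), "Label", labels, "LineColor", 'g')

Running this in MATLAB produces the following results: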


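The four outputs of the detection call can be post-processed with ordinary array operations. The sketch below uses small synthetic stand-ins for boxes, scores, labels, and masks. It assumes that masks is an H-by-W-by-N logical array with one slice per detection and that boxes is N-by-4; these layouts and the 0.5 threshold are assumptions for illustration, not something this post specifies.

```matlab
% Synthetic stand-ins for the detector outputs (N = 3 detections).
boxes  = [10 10 20 30; 40 15 25 25; 5 50 10 10];   % [x y width height]
scores = [0.95; 0.40; 0.80];
labels = categorical({'person'; 'car'; 'car'});
masks  = false(80, 80, 3);
masks(10:39, 10:29, 1) = true;
masks(15:39, 40:64, 2) = true;
masks(50:59,  5:14, 3) = true;

% Keep only detections whose score reaches the assumed threshold.
threshold = 0.5;
keep = scores >= threshold;
boxes  = boxes(keep, :);
scores = scores(keep);
labels = labels(keep);
masks  = masks(:, :, keep);

% Pixel area of each remaining mask.
areas = squeeze(sum(sum(masks, 1), 2));

for k = 1:numel(scores)
    fprintf('%s: score %.2f, %d mask pixels\n', ...
        char(labels(k)), scores(k), areas(k));
end
```

Filtering all four arrays with the same logical index keeps them aligned, so the k-th box, score, label, and mask slice still describe the same object.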