A detailed record of implementing YOLACT instance segmentation with ncnn

2022-06-27 09:42:00 【Xiaobai learns vision】

Link: https://zhuanlan.zhihu.com/p/128974102

This article is reprinted from Zhihu with the author's permission. Do not reprint without permission.

0x0 YOLACT instance segmentation

https://urlify.cn/rURFry

Completes instance segmentation end to end, in a single stage
Fast: claimed to reach 33 FPS on 550x550 images on a Titan Xp
Open-source code, and PyTorch is great!

0x1 Motivation

Looking across GitHub, ncnn and its derivative projects cover classification, detection, localization, feature extraction, OCR, style transfer, and more.

However, no instance segmentation example could be found. Someone even opened an issue asking by name for YOLACT instance segmentation: https://github.com/Tencent/ncnn/issues/1679

So let's write a YOLACT example, and along the way show how to use ncnn for algorithms like this one that need post-processing.

0x2 PyTorch test

The YOLACT project also contains the YOLACT++ model, which is faster and better. However, YOLACT++ uses deformable convolution, a classic operation that is unfriendly to deployment.

Let's pretend we didn't see it and download the YOLACT model instead.

Create a weights folder and download yolact_resnet50_54_800000.pth.

Following the README, run it on a picture to see the result.

$ python eval.py --trained_model=weights/yolact_resnet50_54_800000.pth --score_threshold=0.15 --top_k=15 --image=test.jpg

0x3 Remove post-processing and export ONNX

Modify evalimage in eval.py directly, replacing the result display with an ONNX export.

def evalimage(net:Yolact, path:str, save_path:str=None):
    frame = torch.from_numpy(cv2.imread(path)).cuda().float()
    batch = FastBaseTransform()(frame.unsqueeze(0))
    preds = net(batch)


    torch.onnx._export(net, batch, "yolact.onnx", export_params=True, keep_initializers_as_inputs=True, opset_version=11)

According to the YOLACT issues, the JIT switch at the top of yolact.py has to be turned off before ONNX can be exported.

# As of March 10, 2019, Pytorch DataParallel still doesn't support JIT Script Modules
use_jit = False

YOLACT's post-processing is very Pythonic, so exporting it directly does not work. Remove the post-processing from the model to make export and conversion easier.

Even if ONNX could export the post-processing, doing so is not recommended:

Post-processing is not standardized, and every project author implements the details differently (various nms and bbox calculation methods, for example), so ncnn can hardly cover it with one unified op (caffe-ssd has an implementation only because there is a single version of it)
Post-processing turns into a big lump of trivial glue ops in ONNX, which is inefficient to run inside the framework
ncnn does not support most ONNX glue ops, or has compatibility problems with them (Gather, for example), so they cannot be used directly

Therefore, removing post-processing before exporting ONNX is the right way to convert PyTorch SSD and similar detection models.

Open yolact.py, find the forward method of class Yolact, remove the detect step, and return the model's pred_outs output directly.

            # return self.detect(pred_outs, self)
            return pred_outs

Run the image test again, and yolact.onnx without post-processing is generated.

$ python eval.py --trained_model=weights/yolact_resnet50_54_800000.pth --score_threshold=0.15 --top_k=15 --image=test.jpg

0x4 Simplify ONNX

The directly exported ONNX model contains many glue ops that ncnn does not support, so running onnx-simplifier is routine.

$ pip install -U onnx --user
$ pip install -U onnxruntime --user
$ pip install -U onnx-simplifier --user


$ python -m onnxsim yolact.onnx yolact-sim.onnx

At this point a problem shows up:

Graph must be in single static assignment (SSA) form, however '523' has been used as output names multiple times

Looking through the GitHub issues confirms that this is an ONNX bug.

https://github.com/onnx/onnx/issues/2613

Fortunately, onnx-simplifier already provides a way to work around it.

$ python -m onnxsim --skip-fuse-bn yolact.onnx yolact-sim.onnx

0x5 ncnn model conversion and optimization

When simplifying the ONNX model earlier, --skip-fuse-bn skipped the batchnorm fusion. That's fine, because ncnn can do this too.

The ncnnoptimize tool implements fusion for many operators, such as the common convolution-batchnorm-relu.

The last parameter 0 means an fp32 model, and 65536 means reducing to an fp16 model, which shrinks the model binary.

$ ./onnx2ncnn yolact-sim.onnx yolact.param yolact.bin
$ ./ncnnoptimize yolact.param yolact.bin yolact-opt.param yolact-opt.bin 0

0x6 Fine-tune the model manually

As always, no error does not necessarily mean the model is usable. First open the param file with the netron tool and look at the model structure.

This model has four outputs, framed in red in the figure.

Convolution              Conv_263                 1 1 617 619 0=32 1=1 5=1 6=8192 9=1
Permute                  Transpose_265            1 1 619 620 0=3
UnaryOp                  Tanh_400                 1 1 814 815 0=16
Concat                   Concat_401               5 1 634 673 712 751 790 816 0=-3
Concat                   Concat_402               5 1 646 685 724 763 802 817 0=-3
Concat                   Concat_403               5 1 659 698 737 776 815 818 0=-3
Softmax                  Softmax_405              1 1 817 820 0=1 1=1

YOLACT's post-processing needs loc, conf, prior, mask, and maskdim.

At first it is not obvious what these outputs correspond to, so look at their shapes first.

ncnn::Extractor ex = yolact.create_extractor();


ncnn::Mat in(550, 550, 3);
ex.input("input.1", in);


ncnn::Mat b620;
ncnn::Mat b816;
ncnn::Mat b818;
ncnn::Mat b820;
ex.extract("620", b620);// 32 x 138x138
ex.extract("816", b816);// 4 x 19248
ex.extract("818", b818);// 32 x 19248
ex.extract("820", b820);// 81 x 19248

Compiling and running this directly, the Concat layers crash (the blue box in the figure). The Concat axis parameter is negative, 0=-3, which ncnn does not support yet.

Judging by the shapes of the Concat inputs, these are two-dimensional data concatenated along the h axis, so changing the parameter directly to 0=0 works as a replacement.

Concat                   Concat_401               5 1 634 673 712 751 790 816 0=0
Concat                   Concat_402               5 1 646 685 724 763 802 817 0=0
Concat                   Concat_403               5 1 659 698 737 776 815 818 0=0
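Why 0=0 is a plausible replacement can be checked with a small standalone sketch in plain C++, without ncnn. It treats each Concat input as a row-major two-dimensional matrix of the same width and stacks the inputs along h. The names Mat2D, concat_rows, and rows_per_level are hypothetical, and the per-level row counts assume the five feature-map sizes and three anchors per cell that appear in the prior generation step.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A row-major 2-D matrix: h rows of w floats each.
struct Mat2D
{
    int w;
    int h;
    std::vector<float> data;
};

// Stack matrices of equal width along the h axis (row concatenation).
Mat2D concat_rows(const std::vector<Mat2D>& inputs)
{
    Mat2D out;
    out.w = inputs.empty() ? 0 : inputs[0].w;
    out.h = 0;
    for (const Mat2D& m : inputs)
    {
        assert(m.w == out.w); // all inputs must share the same width
        out.data.insert(out.data.end(), m.data.begin(), m.data.end());
        out.h += m.h;
    }
    return out;
}

// Rows contributed by each of the five prediction heads: conv_w * conv_h * 3 anchors.
std::vector<int> rows_per_level()
{
    const int sizes[5] = {69, 35, 18, 9, 5};
    std::vector<int> rows;
    for (int s : sizes)
        rows.push_back(s * s * 3);
    return rows;
}
```

The five inputs contribute 14283, 3675, 972, 243, and 75 rows, which add up to the 19248 rows seen in the extracted shapes.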

b820 comes after the softmax, so it must be conf. Its shape 81x19248 means 81 classes x 19248 priors.

b816 has shape 4x19248: the bbox offset values for each priorbox.

b818 has shape 32x19248. According to YOLACT's post-processing it is maskdim, i.e. the coefficients of the 32 segmentation heat maps.

b620 has shape 32x138x138, i.e. the 32 segmentation heat maps. The permute layer in front of it is an NCHW->NHWC conversion.

prior is not output by the model.

The NHWC shape of b620 is inconvenient to handle in ncnn, so extract b619, the NCHW data before the permute, instead (the green-box output in the figure).
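The difference between the two layouts comes down to how a (channel, y, x) element is addressed in a flat buffer. A minimal sketch, with hypothetical helper names idx_chw and idx_hwc, assuming dense buffers and ignoring any per-channel padding a real ncnn::Mat may add:

```cpp
#include <cassert>
#include <cstddef>

// Offset of element (c, y, x) in a dense channel-first (CHW) buffer.
inline std::size_t idx_chw(int c, int y, int x, int C, int H, int W)
{
    (void)C; // not needed for CHW addressing, kept for a symmetric signature
    return ((std::size_t)c * H + y) * W + x;
}

// Offset of element (c, y, x) in a dense channel-last (HWC) buffer.
inline std::size_t idx_hwc(int c, int y, int x, int C, int H, int W)
{
    (void)H; // not needed for HWC addressing
    return ((std::size_t)y * W + x) * C + c;
}
```

With CHW, each of the 32 heat maps is one contiguous 138x138 plane, which is what a per-map loop wants. With HWC, the 32 values of one pixel sit next to each other instead.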

ncnn::Extractor ex = yolact.create_extractor();


ncnn::Mat in(550, 550, 3);
ex.input("input.1", in);


ncnn::Mat maskmaps;
ncnn::Mat location;
ncnn::Mat mask;
ncnn::Mat confidence;
ex.extract("619", maskmaps);// 138x138 x 32
ex.extract("816", location);// 4 x 19248
ex.extract("818", mask);// maskdim 32 x 19248
ex.extract("820", confidence);// 81 x 19248

0x7 Generate priors

The original code is make_priors in class PredictionModule in yolact.py. Adding a few prints there reveals all the hyperparameters of the priorbox generation rules.

const int conv_ws[5] = {69, 35, 18, 9, 5};
const int conv_hs[5] = {69, 35, 18, 9, 5};


const float aspect_ratios[3] = {1.f, 0.5f, 2.f};
const float scales[5] = {24.f, 48.f, 96.f, 192.f, 384.f};

The four values of a YOLACT prior are center_x center_y box_w box_h, in the range 0~1.

The author wrote a bug, box_h = box_w, which fixes every box as a square. We need to reproduce this bug as well.

// make priorbox
ncnn::Mat priorbox(4, 19248);
{
    float* pb = priorbox;


    for (int p = 0; p < 5; p++)
    {
        int conv_w = conv_ws[p];
        int conv_h = conv_hs[p];


        float scale = scales[p];


        for (int i = 0; i < conv_h; i++)
        {
            for (int j = 0; j < conv_w; j++)
            {
                // +0.5 because priors are in center-size notation
                float cx = (j + 0.5f) / conv_w;
                float cy = (i + 0.5f) / conv_h;


                for (int k = 0; k < 3; k++)
                {
                    float ar = aspect_ratios[k];


                    ar = sqrt(ar);


                    float w = scale * ar / 550;
                    float h = scale / ar / 550;


                    // This is for backward compatability with a bug where I made everything square by accident
                    // cfg.backbone.use_square_anchors:
                    h = w;


                    pb[0] = cx;
                    pb[1] = cy;
                    pb[2] = w;
                    pb[3] = h;


                    pb += 4;
                }
            }
        }
    }
}
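As a sanity check on the 19248 figure, the same loop can be written against a plain std::vector and its size compared with the expected count: (69*69 + 35*35 + 18*18 + 9*9 + 5*5) * 3 = 6416 * 3 = 19248. This is a sketch under the hyperparameters listed above; make_priors here is a hypothetical standalone helper, not the function from yolact.py.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Generate priors as flat (cx, cy, w, h) quadruples, reproducing the square-anchor behaviour.
std::vector<float> make_priors()
{
    const int conv_sizes[5] = {69, 35, 18, 9, 5};
    const float aspect_ratios[3] = {1.f, 0.5f, 2.f};
    const float scales[5] = {24.f, 48.f, 96.f, 192.f, 384.f};
    const float input_size = 550.f;

    std::vector<float> priors;
    for (int p = 0; p < 5; p++)
    {
        const int n = conv_sizes[p];
        for (int i = 0; i < n; i++)
        {
            for (int j = 0; j < n; j++)
            {
                // cell centre in normalized 0~1 coordinates
                const float cx = (j + 0.5f) / n;
                const float cy = (i + 0.5f) / n;
                for (int k = 0; k < 3; k++)
                {
                    const float w = scales[p] * std::sqrt(aspect_ratios[k]) / input_size;
                    const float h = w; // square anchors, as in the original
                    priors.push_back(cx);
                    priors.push_back(cy);
                    priors.push_back(w);
                    priors.push_back(h);
                }
            }
        }
    }
    return priors;
}
```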

0x8 Full YOLACT pipeline implementation

Preprocessing

data/config.py contains the ImageNet MEAN and STD, in BGR order.

# These are in BGR and are for ImageNet
MEANS = (103.94, 116.78, 123.68)
STD   = (57.38, 57.12, 58.40)

YOLACT's actual input is RGB, so the order has to be swapped.

const int target_size = 550;


int img_w = bgr.cols;
int img_h = bgr.rows;


ncnn::Mat in = ncnn::Mat::from_pixels_resize(bgr.data, ncnn::Mat::PIXEL_BGR2RGB, img_w, img_h, target_size, target_size);


const float mean_vals[3] = {123.68f, 116.78f, 103.94f};
const float norm_vals[3] = {1.0/58.40f, 1.0/57.12f, 1.0/57.38f};
in.substract_mean_normalize(mean_vals, norm_vals);
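What the mean and norm arrays do to a single pixel can be written out by hand. A minimal sketch, assuming the input pixel arrives in OpenCV's BGR order; normalize_bgr_pixel is a hypothetical helper that mirrors the channel swap and the (x - mean) * norm step rather than calling ncnn:

```cpp
#include <cassert>
#include <cmath>

// Convert one BGR pixel to normalized RGB: out[c] = (rgb[c] - mean[c]) * norm[c].
inline void normalize_bgr_pixel(const float bgr[3], float out_rgb[3])
{
    // MEANS/STD from data/config.py, reordered from BGR to RGB.
    const float mean_vals[3] = {123.68f, 116.78f, 103.94f};
    const float norm_vals[3] = {1.f / 58.40f, 1.f / 57.12f, 1.f / 57.38f};

    const float rgb[3] = {bgr[2], bgr[1], bgr[0]}; // swap B and R
    for (int c = 0; c < 3; c++)
        out_rgb[c] = (rgb[c] - mean_vals[c]) * norm_vals[c];
}
```

A pixel equal to the BGR means maps to zero in every channel, which is a quick way to catch a forgotten channel swap.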

Post-processing

This part is very similar to SSD post-processing. The sort and nms code can be taken from ncnn/src/layer/detectionoutput.cpp

The only thing to watch is that bbox generation differs from SSD: it works in center_x center_y box_w box_h form. The original YOLACT code is the decode function in layers/box_utils.py

YOLACT has a fastnms method in layers/functions/detection.py that is faster, but I think ordinary nms, being off-the-shelf code, works perfectly well
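For reference, ordinary greedy NMS fits in a few lines. This is a generic sketch, not the code from detectionoutput.cpp. The Box struct and the names iou and nms_sorted are hypothetical, and the input is assumed to be sorted by descending score already.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Box
{
    float x1, y1, x2, y2;
    float score;
};

// Intersection-over-union of two axis-aligned boxes.
inline float iou(const Box& a, const Box& b)
{
    const float iw = std::min(a.x2, b.x2) - std::max(a.x1, b.x1);
    const float ih = std::min(a.y2, b.y2) - std::max(a.y1, b.y1);
    if (iw <= 0.f || ih <= 0.f)
        return 0.f;
    const float inter = iw * ih;
    const float area_a = (a.x2 - a.x1) * (a.y2 - a.y1);
    const float area_b = (b.x2 - b.x1) * (b.y2 - b.y1);
    return inter / (area_a + area_b - inter);
}

// Greedy NMS over boxes sorted by descending score; returns indices of kept boxes.
std::vector<std::size_t> nms_sorted(const std::vector<Box>& boxes, float nms_threshold)
{
    std::vector<std::size_t> picked;
    for (std::size_t i = 0; i < boxes.size(); i++)
    {
        bool keep = true;
        for (std::size_t j : picked)
        {
            // drop box i if it overlaps too much with a higher-scoring kept box
            if (iou(boxes[i], boxes[j]) > nms_threshold)
            {
                keep = false;
                break;
            }
        }
        if (keep)
            picked.push_back(i);
    }
    return picked;
}
```

Running it once per class, after sorting that class's candidates, matches the "sort + nms" step in the outline.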

// generate all candidates for each class
for (int i=0; i<num_priors; i++)
{
    // find class id with highest score
    // start from 1 to skip background


    // ignore background or low score
    if (label == 0 || score <= confidence_thresh)
        continue;


    // apply center_size to priorbox with loc
    float var[4] = {0.1f, 0.1f, 0.2f, 0.2f};


    float pb_cx = pb[0];
    float pb_cy = pb[1];
    float pb_w = pb[2];
    float pb_h = pb[3];


    float bbox_cx = var[0] * loc[0] * pb_w + pb_cx;
    float bbox_cy = var[1] * loc[1] * pb_h + pb_cy;
    float bbox_w = (float)(exp(var[2] * loc[2]) * pb_w);
    float bbox_h = (float)(exp(var[3] * loc[3]) * pb_h);


    float obj_x1 = bbox_cx - bbox_w * 0.5f;
    float obj_y1 = bbox_cy - bbox_h * 0.5f;
    float obj_x2 = bbox_cx + bbox_w * 0.5f;
    float obj_y2 = bbox_cy + bbox_h * 0.5f;


    // clip inside image


    // append object candidate
}


// merge candidate box for each class
for (int i=0; i<(int)class_candidates.size(); i++)
{
    // sort + nms
}


// sort all result by score


// keep_top_k
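The steps left as comments in the loop above can be sketched as standalone helpers. The names best_class, clip01, and decode_box are hypothetical. The layout assumed for conf is one row of num_class scores per prior with class 0 as background, and boxes are assumed to stay in normalized 0~1 coordinates until they are scaled to the image, which may differ from where the uploaded code does its clipping.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Find the foreground class with the highest score.
// Returns label 0 when no foreground class has a positive score.
inline int best_class(const float* conf, int num_class, float* best_score)
{
    int label = 0;
    float score = 0.f;
    for (int j = 1; j < num_class; j++) // start from 1 to skip background
    {
        if (conf[j] > score)
        {
            label = j;
            score = conf[j];
        }
    }
    *best_score = score;
    return label;
}

// Clamp a normalized coordinate into [0, 1].
inline float clip01(float v)
{
    return std::max(std::min(v, 1.f), 0.f);
}

// Decode loc offsets against a (cx, cy, w, h) prior into clipped corner form (x1, y1, x2, y2).
inline void decode_box(const float pb[4], const float loc[4], float out[4])
{
    const float var[4] = {0.1f, 0.1f, 0.2f, 0.2f};

    const float cx = var[0] * loc[0] * pb[2] + pb[0];
    const float cy = var[1] * loc[1] * pb[3] + pb[1];
    const float w = std::exp(var[2] * loc[2]) * pb[2];
    const float h = std::exp(var[3] * loc[3]) * pb[3];

    out[0] = clip01(cx - w * 0.5f);
    out[1] = clip01(cy - h * 0.5f);
    out[2] = clip01(cx + w * 0.5f);
    out[3] = clip01(cy + h * 0.5f);
}
```

With all-zero offsets the decoded box is just the prior converted from center-size to corner form, which makes a convenient first test.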

Segmentation mask generation

maskmaps is in fact 32 heat maps of size 138x138, and each object output earlier has its own 32 float coefficients.

An object's segmentation mask is obtained by multiplying each heat map by its corresponding coefficient, summing the results, resizing to the original image size, binarizing, and finally cropping to the inside of the output box.
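The combination step can be sketched on flat buffers. This sketch covers the weighted sum, the binarization, and the box crop. The resize to the original image size is left out (an image library would normally do it), and the 0.5 threshold, the function name assemble_mask, and the parameter layout are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// maskmaps: one plane of h*w floats per coefficient, stored plane after plane (CHW).
// coeffs:   the coefficients of one object, one per plane.
// The box (x1, y1, x2, y2) is in pixel coordinates of the h x w grid, inclusive.
// Returns an h*w binary mask: 255 for object pixels inside the box, 0 elsewhere.
std::vector<unsigned char> assemble_mask(const std::vector<float>& maskmaps,
                                         const std::vector<float>& coeffs,
                                         int w, int h,
                                         int x1, int y1, int x2, int y2,
                                         float threshold = 0.5f)
{
    const std::size_t plane = (std::size_t)w * h;
    assert(maskmaps.size() == plane * coeffs.size());

    // weighted sum of the heat maps
    std::vector<float> sum(plane, 0.f);
    for (std::size_t p = 0; p < coeffs.size(); p++)
    {
        const float* m = maskmaps.data() + p * plane;
        for (std::size_t i = 0; i < plane; i++)
            sum[i] += m[i] * coeffs[p];
    }

    // binarize, keeping only pixels inside the box
    std::vector<unsigned char> mask(plane, 0);
    for (int y = 0; y < h; y++)
    {
        if (y < y1 || y > y2)
            continue;
        for (int x = 0; x < w; x++)
        {
            if (x < x1 || x > x2)
                continue;
            const std::size_t i = (std::size_t)y * w + x;
            mask[i] = sum[i] > threshold ? 255 : 0;
        }
    }
    return mask;
}
```

For YOLACT the call would use 32 planes of 138x138, or the resized planes if the resize is done before binarizing as the text describes.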

Unnaturally good-looking!

0x9 Supplementary learning materials

Oh? There are supplementary learning materials too?

The ncnn implementation code and the modified model have been uploaded to GitHub

https://github.com/Tencent/ncnn


Copyright notice
This article was written by [Xiaobai learns vision]. Please include a link to the original when reprinting.
https://yzsam.com/2022/178/202206270937213070.html
