A few lines of code are all it takes to run real-time PaddleOCR inference. Come and get it!

2022-07-28 00:27:00 【Intel edge computing community】

Summer is hot, which makes it the perfect time to study.

In the last issue, teacher Wu Zhuo showed us how to fill in bills with one click.

With both hands now free, Nono couldn't help marveling:

OCR technology is so practical!

So here comes the question:

Can we use this technology at any time on the CPU we have at hand?

Teacher Wu Zhuo gave Nono a confident smile:

It still takes just a few lines of code.

If you don't believe it, read on.

01 Introduction to this course

This course uses PaddleOCR, which is built on Baidu's open-source PaddlePaddle framework, as an example. It shows how Intel's open-source OpenVINO toolkit can run real-time PaddleOCR inference easily, using only the CPU we already have at hand.

Before using PaddleOCR, it helps to understand how it works. The PaddleOCR workflow is shown in the figure below. It consists of three main stages: text detection, direction classification, and text recognition.

02 Inference process

The text detection task is to find where text appears in an image or video. Unlike general object detection, which has to solve both localization and classification, text detection only has to localize. Even so, it faces difficulties of its own: text in natural scenes is highly varied, differing in size, orientation, length, shape, and language, and it may overlap or be densely packed. All of these affect the final detection result. Commonly used text detection methods are currently either regression-based or segmentation-based. PaddleOCR uses the segmentation-based DBNet method.

How DBNet works is shown in the figure above.

In general, the image passes through a feature pyramid network (FPN) to produce 4 feature maps at 1/4, 1/8, 1/16, and 1/32 of the original image size. The 4 feature maps are then upsampled to 1/4 size and concatenated to obtain the feature map F. Next, the feature map F is used to predict the probability map P and the threshold map T. Finally, an approximate binary map is computed from P and T through differentiable binarization.
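The shape bookkeeping behind this fusion step can be sketched in a few lines. This is only an illustration, not code from the course: the function name `fpn_fusion_shapes`, the 640x640 input size, and the 64 channels per feature map are assumptions chosen for the example.

```python
def fpn_fusion_shapes(height, width, strides=(4, 8, 16, 32), channels_per_map=64):
    """List each FPN map's size, its upsampling factor to 1/4 scale, and the fused shape.

    channels_per_map is an illustrative value, not taken from the course.
    """
    target_stride = strides[0]
    maps = []
    for stride in strides:
        maps.append({
            "stride": stride,
            "size": (height // stride, width // stride),
            # How much this map must be enlarged to reach 1/4 of the input size.
            "upsample_factor": stride // target_stride,
        })
    fused_size = (height // target_stride, width // target_stride)
    # Concatenation stacks the maps along the channel axis, so channels add up.
    fused_shape = (channels_per_map * len(strides), *fused_size)
    return maps, fused_shape


# Assumed 640x640 input, purely for illustration.
maps, fused_shape = fpn_fusion_shapes(640, 640)
for feature_map in maps:
    print(feature_map)
print("fused feature map F (C, H, W):", fused_shape)
```

The 1/32 map has to be enlarged the most (by a factor of 8 here) before all four maps share the same spatial size and can be concatenated.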

Conventional segmentation-based methods binarize with a fixed threshold, which makes post-processing slow. To address this, DBNet proposes learning the threshold: it uses a carefully designed binarization function that approximates the step function, so the segmentation network can learn the text segmentation threshold end to end during training. This automatic threshold adjustment improves accuracy and simplifies post-processing, which further improves text detection performance.
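As a per-pixel illustration, the approximate step function from the DBNet paper is B = 1 / (1 + exp(-k (P - T))), where P is the probability, T the learned threshold, and k an amplification factor (the paper uses k = 50). The sketch below compares it with hard thresholding; the function names are made up for this example.

```python
import math


def hard_binarization(p, t):
    """Standard step function: not differentiable, so t cannot be learned through it."""
    return 1.0 if p >= t else 0.0


def differentiable_binarization(p, t, k=50.0):
    """Smooth approximation of the step function: B = 1 / (1 + exp(-k * (p - t)))."""
    return 1.0 / (1.0 + math.exp(-k * (p - t)))


# Illustrative pixel values: probability p against a threshold t = 0.3.
for p in (0.1, 0.29, 0.3, 0.31, 0.9):
    print(p, hard_binarization(p, 0.3), round(differentiable_binarization(p, 0.3), 4))
```

Away from the threshold the two functions agree almost exactly, while near the threshold the smooth version has a usable gradient, which is what allows the threshold map to be trained together with the probability map.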

03 Code implementation

What is learned on paper always feels shallow; to truly understand something, you have to do it yourself. Let's first import all the Python packages we need, then walk through the code to see how OpenVINO runs inference on the PaddleOCR text detection model.

Before running inference, we first need to download the model. To make this easy, we define a model download function and specify the download path. This course uses the small PaddleOCR mobile model.
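A download helper along these lines might look like the sketch below. It is not the course's actual function: the name `download_and_extract` is hypothetical, and the model URL is deliberately left as a parameter because the original address is not given here.

```python
import tarfile
import urllib.request
from pathlib import Path


def download_and_extract(url: str, model_dir: Path) -> Path:
    """Download a .tar archive into model_dir and unpack it there.

    The download is skipped when the archive is already present.
    """
    model_dir.mkdir(parents=True, exist_ok=True)
    archive_path = model_dir / url.rsplit("/", 1)[-1]
    if not archive_path.exists():
        urllib.request.urlretrieve(url, str(archive_path))
    # Only unpack archives from a source you trust.
    with tarfile.open(archive_path) as archive:
        archive.extractall(model_dir)
    return model_dir
```

Calling it with the model's archive URL and a local folder such as `Path("model")` leaves the unpacked model files in that folder, ready to be read in the next step.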

Then we read the model and load it onto the CPU device. Two very simple lines of code, as in the figure below, are enough to read and load the model.
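In the OpenVINO Python API these two steps correspond to `Core.read_model` and `Core.compile_model`. The sketch below wraps them in a small helper; the helper name and the model path in the usage comment are assumptions for illustration, not taken from the course.

```python
def load_compiled_model(core, model_path, device_name="CPU"):
    """Read a model file and compile it for the given device."""
    model = core.read_model(model=model_path)                        # line 1: read the model
    return core.compile_model(model=model, device_name=device_name)  # line 2: load it onto the device


# Usage, assuming OpenVINO is installed and the (hypothetical) path exists:
#
#   from openvino.runtime import Core   # newer releases also expose openvino.Core
#   det_compiled_model = load_compiled_model(Core(), "model/det/inference.pdmodel")
```

Passing `device_name="CPU"` is what keeps the whole pipeline on the CPU we already have at hand.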

Next, we need to define the variable names and the pre-processing and post-processing functions. The text detection task needs variable names for the model's input and output. Before formally running text detection inference, we also need to define the necessary pre-processing and post-processing functions, so that in the end the detected text regions can be visualized on the picture and the conversion process is complete.
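To give a feel for what post-processing involves, here is a heavily simplified, dependency-free sketch: it thresholds a probability map and returns one axis-aligned box per connected region. This is not the course's post-processing function, which is not shown here; real DBNet post-processing is more involved, and both the function name and the 0.3 threshold are assumptions for illustration.

```python
from collections import deque


def boxes_from_probability_map(prob_map, threshold=0.3):
    """Return (x_min, y_min, x_max, y_max) for each 4-connected region above threshold."""
    height, width = len(prob_map), len(prob_map[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if seen[y][x] or prob_map[y][x] <= threshold:
                continue
            # Breadth-first flood fill over one connected text region.
            queue = deque([(x, y)])
            seen[y][x] = True
            x_min = x_max = x
            y_min = y_max = y
            while queue:
                cx, cy = queue.popleft()
                x_min, x_max = min(x_min, cx), max(x_max, cx)
                y_min, y_max = min(y_min, cy), max(y_max, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and not seen[ny][nx] and prob_map[ny][nx] > threshold:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x_min, y_min, x_max, y_max))
    return boxes


# Tiny illustrative probability map with two separate text regions.
example_map = [
    [0.0, 0.9, 0.8, 0.0, 0.0],
    [0.0, 0.7, 0.0, 0.0, 0.6],
    [0.0, 0.0, 0.0, 0.0, 0.9],
]
print(boxes_from_probability_map(example_map))
```

The boxes come back in the coordinates of the model's output map, so a real pipeline still has to scale them back to the original image before drawing them.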

Finally, a single line of code is all we need to run inference on the text detection model with OpenVINO. So how good are the inference results?
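With the OpenVINO Python API, a compiled model can be called directly on a list of inputs, and the result is indexed by an output port. A minimal sketch of that single call, wrapped in a hypothetical helper and assuming the input has already been pre-processed into the tensor layout the model expects:

```python
def infer_text_detection(compiled_model, input_tensor):
    """Run one synchronous inference and return the first output of the model."""
    output_port = compiled_model.output(0)
    # The single line that performs inference: call the model, then pick its output.
    return compiled_model([input_tensor])[output_port]
```

Here `compiled_model` is expected to be the object returned by `Core.compile_model`, and the returned array is the detection output that the post-processing function turns into text boxes.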

Verifying the inference results

The figure above shows the text detection results from this course. The detection and localization of this printed text is clear and accurate. What about handwriting? We can switch to another picture and take a look.