A few lines of code are all it takes to run real-time PaddleOCR inference. Come and get it!
2022-07-28 00:27:00 【Intel Edge Computing Community】
Summer is hot, which makes it the perfect time to study.
In the last episode, Wu Zhuo showed us a trick for filling in invoices with one click.
With both hands freed up, Nono couldn't help marveling:
OCR technology is so practical!
So here comes the question:
can we run this technology at any time on the CPU we already have at hand?

Wu Zhuo gave Nono a confident smile.
It still takes just a few lines of code.
If you don't believe it, read on.

01 Introduction to this course
This course takes PaddleOCR, built on Baidu's open-source PaddlePaddle framework, as its example and shows how to use Intel's open-source OpenVINO toolkit to run real-time PaddleOCR inference with nothing more than the CPU at hand.
Before going further, it is worth covering how PaddleOCR works. Its workflow, shown in the figure below, consists of three main stages: text detection, direction classification, and text recognition.

02 Inference process
The text detection task means finding where text is located in an image or video. Unlike object detection, which has to solve both localization and classification, text detection only has to solve localization. It still faces difficulties of its own: text in natural scenes is highly varied, differing in size, orientation, length, shape, and language, and text may overlap or be densely packed. All of these affect the final detection quality. The text detection methods in common use today are regression-based and segmentation-based. PaddleOCR uses the segmentation-based DBNet method.

How DBNet works is shown in the figure above.
In general, the image passes through a feature pyramid network (FPN), which produces 4 feature maps at 1/4, 1/8, 1/16, and 1/32 of the original image size. These 4 feature maps are then upsampled to 1/4 size and concatenated to give the feature map f. From f the network predicts a probability map p and a threshold map t. Finally, an approximate binary map is computed from p and t by differentiable binarization.
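The upsample-and-concatenate step can be illustrated with a small NumPy sketch. It only demonstrates how the shapes line up: the 640x640 input size, the 64 channels per level, and the helper names upsample_nearest and fuse_pyramid are assumptions made for this example, and nearest-neighbour repetition stands in for whatever upsampling the real network uses.

```python
import numpy as np

def upsample_nearest(feature, factor):
    """Nearest-neighbour upsampling of a (C, H, W) feature map by an integer factor."""
    return feature.repeat(factor, axis=1).repeat(factor, axis=2)

def fuse_pyramid(features):
    """Bring every level to the resolution of the first (1/4) level, then concatenate channels."""
    target_h = features[0].shape[1]
    upsampled = [upsample_nearest(f, target_h // f.shape[1]) for f in features]
    return np.concatenate(upsampled, axis=0)

# Hypothetical sizes: a 640x640 input and 64 channels per pyramid level.
image_size, channels = 640, 64
pyramid = [
    np.random.rand(channels, image_size // s, image_size // s).astype(np.float32)
    for s in (4, 8, 16, 32)  # the 1/4, 1/8, 1/16, 1/32 scales
]
fused = fuse_pyramid(pyramid)
print(fused.shape)  # (256, 160, 160): 4 x 64 channels at 1/4 resolution
```

With these assumed sizes, the fused map f keeps the 1/4 resolution (160x160) and stacks the channels of all four levels.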
Segmentation-based methods normally binarize with a fixed threshold, which makes post-processing time-consuming. To address this, DBNet proposes learning the threshold instead: it designs a binarization function that approximates the step function, so the segmentation network can learn the text segmentation threshold end to end during training. This adaptive threshold improves accuracy and also simplifies post-processing, which further improves text detection performance.
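As a rough illustration of that approximate step function, the DBNet paper defines the approximate binary map as b = 1 / (1 + exp(-k (p - t))), with an amplification factor k that the paper sets to 50. The sketch below evaluates this formula with NumPy on a made-up 2x2 probability map and a constant threshold map; the sample values and the function name are assumptions for the example only.

```python
import numpy as np

def differentiable_binarization(prob_map, thresh_map, k=50.0):
    """Approximate binary map: a steep sigmoid of (p - t) instead of a hard step."""
    return 1.0 / (1.0 + np.exp(-k * (prob_map - thresh_map)))

# Made-up values: a tiny probability map p and a constant threshold map t.
prob = np.array([[0.10, 0.45],
                 [0.55, 0.90]])
thresh = np.full_like(prob, 0.5)

approx = differentiable_binarization(prob, thresh)
hard = (prob > thresh).astype(np.float64)  # ordinary step-function binarization

print(np.round(approx, 3))
print(hard)
```

Far from the threshold the two maps agree almost exactly, while close to it the sigmoid stays smooth. That smoothness is what lets gradients flow to the threshold map during training.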
03 Code implementation
What we learn on paper always feels shallow; to truly understand something, we have to do it ourselves. Let's first import all the Python packages we need, then walk through the code to see how OpenVINO runs inference on PaddleOCR's text detection model.
Before inference, the model has to be downloaded. To make this easy, we define a model download function and then specify the download path. This course uses the small PaddleOCR mobile model.
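The original download function is not reproduced in the text, so the following is only a minimal sketch of what such a helper might look like, using the Python standard library. The function name download_model and the URL in the usage comment are hypothetical placeholders, not the course's actual code or model location.

```python
from pathlib import Path
from urllib.request import urlretrieve

def download_model(url, target_dir):
    """Download `url` into `target_dir`, skipping the download if the file already exists."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target_path = target_dir / url.rsplit("/", 1)[-1]  # keep the remote file name
    if not target_path.exists():
        urlretrieve(url, str(target_path))
    return target_path

# Usage (hypothetical URL, replace with the real model location):
# model_file = download_model("https://example.com/models/det_model.tar", "model")
```

Checking for an existing file first means that rerunning the code does not fetch the model again. If the model ships as an archive, it still has to be unpacked afterwards.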

Then read the model and load it onto the CPU device. As the figure below shows, two very simple lines of code are enough to read and load the model.
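Since the figure is not available here, the two lines probably resemble the sketch below. It assumes the OpenVINO Python API as of the 2022 releases (Core, read_model, compile_model from openvino.runtime); newer releases expose the same class as openvino.Core. The wrapper function and its name are added for this example.

```python
def load_model_on_cpu(model_path):
    """Read a model file with OpenVINO and compile it for the CPU device."""
    # Imported inside the function so the helper can be defined even
    # where OpenVINO is not installed yet.
    from openvino.runtime import Core

    core = Core()
    model = core.read_model(model=model_path)                            # line 1: read the model
    compiled_model = core.compile_model(model=model, device_name="CPU")  # line 2: load it onto the CPU
    return compiled_model

# Usage (hypothetical path to the downloaded detection model):
# det_compiled_model = load_model_on_cpu("model/inference.pdmodel")
```

The device_name argument is where the target hardware is chosen; "CPU" matches the setup described in this course.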

Next, define the variable names and the pre-processing and post-processing functions. The text detection task needs variables that refer to the model's input and output. Before formally running text detection inference, we also need to define the necessary pre-processing and post-processing functions, so that in the end the detected text can be seen directly on the image.
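To give a feel for what these helpers do, here is a simplified NumPy-only sketch, not the course's actual functions. The 640x640 input size, the ImageNet mean and standard deviation, and the 0.3 threshold are assumptions for the example. The post-processing shown here only thresholds the probability map; a real pipeline would go on to extract text boxes from the mask.

```python
import numpy as np

# Assumed normalization statistics (ImageNet mean/std), one value per colour channel.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def resize_nearest(image, height, width):
    """Nearest-neighbour resize of an (H, W, C) image using index arithmetic."""
    rows = np.arange(height) * image.shape[0] // height
    cols = np.arange(width) * image.shape[1] // width
    return image[rows[:, None], cols]

def preprocess(image, size=640):
    """Resize, scale to [0, 1], normalize, and reorder to a (1, C, H, W) float32 tensor."""
    resized = resize_nearest(image, size, size).astype(np.float32) / 255.0
    normalized = (resized - MEAN) / STD
    return normalized.transpose(2, 0, 1)[np.newaxis, ...]

def postprocess(prob_map, threshold=0.3):
    """Turn the predicted probability map into a binary text mask."""
    return (prob_map > threshold).astype(np.uint8)

# Usage with a dummy 480x320 image standing in for a real photo.
dummy_image = np.zeros((480, 320, 3), dtype=np.uint8)
input_tensor = preprocess(dummy_image)
print(input_tensor.shape)  # (1, 3, 640, 640)
```

The channel-first layout with a leading batch dimension is the usual input shape for models of this kind, which is why the transpose and the extra axis are applied at the end.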

Finally, a single line of code is enough to run inference on the text detection model with OpenVINO. So how well does it work?
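The inference call itself is not shown in the text. Assuming the model was compiled with the OpenVINO Python API, it likely looks like the single return line in the sketch below: a compiled model can be called with a list of inputs, and the result is indexed by an output handle. The wrapper function and its name are illustrative.

```python
def run_text_detection(compiled_model, input_tensor):
    """Run one inference request and return the first output of the model."""
    output_layer = compiled_model.output(0)              # handle to the model's first output
    return compiled_model([input_tensor])[output_layer]  # the one-line inference call

# Usage (assumes a compiled detection model and a preprocessed input tensor):
# prob_map = run_text_detection(det_compiled_model, input_tensor)
```

The returned array is the detection model's raw output, which the post-processing step then converts into text regions.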

Verifying the inference results

The figure above shows the text detection results from this course. For this printed text, the detected regions are clear and accurately positioned. What about handwriting? Let's swap in another image and take a look.


The detection results for handwritten text are still clear and accurate.

Have you got the hang of this course?
Either way, Nono already gets it!