Does extracting text from images sound like magic? Implement OCR in three steps!
2022-07-28 00:27:00 【Intel edge computing community】

About OCR
OCR (Optical Character Recognition) is, in short, a system that converts text in images, scanned documents, or natural scenes captured by a camera into machine-encoded digital text. Extracting and converting the text makes it easier to store and search digitally, shortens data entry, and spares people the pain of searching and checking by hand. The automatic extraction and entry of bank account numbers and delivery addresses, mentioned at the beginning of this article, is a typical application.

With the rapid development of deep learning, OCR built on neural networks has become more robust, more accurate, and easier to use.
So next we will focus on how to implement OCR with deep learning. We hope today's introduction gives you a rough idea of what OCR is and of how to run OCR inference quickly on the CPU of your own desktop or laptop, using deep learning to recognize handwritten digits.
Implementing OCR with deep learning
In this course, we provide a simple demo that performs OCR with a deep learning model, and we use OpenVINO, Intel's open-source toolkit, to optimize the model and accelerate its performance. With the source code provided in this course and the three simple steps below, you can easily recognize handwritten digits from an open-source dataset such as MNIST in a Jupyter Notebook.

In the demo, each image has to be examined to decide which of 10 classes, the digits 0 to 9, its handwritten digit belongs to. The OCR demo therefore builds and trains a neural network model for image classification. The model returns a probability for each of the 10 classes, and the digit of the class with the highest probability is taken as the handwritten digit recognized in the image.
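The "pick the most probable class" rule described above can be sketched in a few lines of NumPy. This is only an illustration: the raw scores below are made up, and the `softmax` helper is written by hand here rather than taken from the course code.

```python
import numpy as np

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

# Hypothetical raw scores for the digits 0..9 (made up for illustration).
logits = np.array([0.1, 0.3, 0.2, 4.0, 0.5, 0.1, 0.0, 1.2, 0.4, 0.3])

probs = softmax(logits)                  # one probability per digit class
predicted_digit = int(np.argmax(probs))  # index of the largest probability

print(predicted_digit)  # prints 3
```

Because the class indices are the digits themselves, the index returned by `argmax` is directly the recognized digit.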
Running the course in three steps
Next, let's look at the actual code.

Step 1: Set up the environment. We need to install the OpenVINO development toolkit and the corresponding Python packages.

Step 2: Build and train the neural network model. As the figure above shows, a few lines of code are enough to build a simple neural network model.

We also need to define the model's final output. Adding a softmax layer, as in the figure above, yields the probability of each class, and the digit of the class with the highest probability becomes the final recognition result for the image. Next, you can see the model training already running, and its speed is not to be underestimated.
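To make the idea of "a layer followed by softmax, trained on labeled data" concrete, here is a minimal sketch in plain NumPy. It is not the course's code, which builds the model in a deep learning framework: it trains softmax regression (a single dense layer plus softmax) with gradient descent on a tiny synthetic dataset, and the 3 classes, 2 features, learning rate, and epoch count are all assumptions chosen to keep the example small.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for image data: 3 well-separated clusters in 2-D.
num_classes, per_class = 3, 100
centers = np.array([[5.0, 0.0], [-2.5, 4.33], [-2.5, -4.33]])
X = np.vstack([c + rng.normal(scale=0.5, size=(per_class, 2)) for c in centers])
y = np.repeat(np.arange(num_classes), per_class)

def softmax(z):
    """Row-wise softmax for a batch of logits."""
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Parameters of the single dense layer.
W = np.zeros((2, num_classes))
b = np.zeros(num_classes)
one_hot = np.eye(num_classes)[y]
learning_rate = 0.1

for epoch in range(100):
    probs = softmax(X @ W + b)
    # Gradient of the mean cross-entropy loss with respect to the logits.
    grad_logits = (probs - one_hot) / len(X)
    W -= learning_rate * (X.T @ grad_logits)
    b -= learning_rate * grad_logits.sum(axis=0)

# The predicted class is the one with the highest probability.
predictions = np.argmax(softmax(X @ W + b), axis=1)
train_accuracy = float((predictions == y).mean())
print(f"training accuracy: {train_accuracy:.2f}")
```

A real MNIST model would take 784 pixel values per image, output 10 classes, and usually add hidden or convolutional layers, but the softmax output and the highest-probability decision work the same way.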
Step 3: Optimize the whole neural network model with Model Optimizer (mo), the model optimization tool provided by OpenVINO. The figure below shows the optimization process running.

After mo finishes, we get the model saved in the intermediate representation (IR) format as an xml file and a bin file, which store the model's structure and the model's weights, respectively.
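Since the xml and bin files belong together, a quick check that both are present can save a confusing error at load time. The sketch below assumes the two files share a base name and sit in the same directory, which is how Model Optimizer normally writes them; the helper names and the file name `mnist_model` are hypothetical.

```python
from pathlib import Path
import tempfile

def ir_weights_path(xml_path):
    """Return the .bin path expected to sit next to an IR .xml file."""
    return Path(xml_path).with_suffix(".bin")

def ir_pair_exists(xml_path):
    """True only if both the structure (.xml) and weights (.bin) files exist."""
    xml_path = Path(xml_path)
    return xml_path.is_file() and ir_weights_path(xml_path).is_file()

# Demonstration with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    xml_file = Path(tmp) / "mnist_model.xml"
    xml_file.touch()
    missing_weights = ir_pair_exists(xml_file)  # only the .xml exists so far
    ir_weights_path(xml_file).touch()
    complete_pair = ir_pair_exists(xml_file)    # now both files exist

print(missing_weights, complete_pair)  # prints: False True
```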
Next, we can run inference with OpenVINO. The inference code is also quite simple. The complete code for the whole OCR implementation can be downloaded here (https://www.kaggle.com/code/raymondlo84/mnist-with-openvino-and-tensorflow-on-kaggle).
Finally, let's see how the whole model performs by printing some handwritten digit recognition results on the test set of the open-source MNIST dataset.
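Printing results and measuring accuracy comes down to comparing the predicted digits with the true labels. The sketch below uses made-up probabilities and labels for five images, not real model output, so the 80% it reports says nothing about the actual model.

```python
import numpy as np

# Hypothetical model output: one row of 10 class probabilities per test image.
probs = np.full((5, 10), 0.01)
winning_classes = [7, 2, 1, 0, 9]  # made-up most probable digit for each image
for row, digit in enumerate(winning_classes):
    probs[row, digit] = 0.91       # each row now sums to 1.0

labels = np.array([7, 2, 1, 0, 4])  # made-up ground-truth digits

predictions = probs.argmax(axis=1)  # most probable class for each image
accuracy = float((predictions == labels).mean())

for pred, label in zip(predictions, labels):
    print(f"predicted {pred}, actual {label}")
print(f"accuracy: {accuracy:.0%}")  # prints "accuracy: 80%"
```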

As you can see, the recognition results are impressive, and the accuracy can even exceed 99%. What are you waiting for? Take the source code we provide, follow along with the course video, and try it out together with Nono!

Full code download: https://www.kaggle.com/code/raymondlo84/mnist-with-openvino-and-tensorflow-on-kaggle