Python image recognition OCR
2020-11-07 20:56:00 【Coxhuang】
Contents
- Python image recognition OCR
- #1 Requirements
- #2 Environment
- #3 Installation
- #3.1 macOS
- #3.2 Linux(CentOS)
- #4 Usage
- #4.1 Install the pytesseract library
- #4.2 Python code
- #5 Online example
Python Image recognition OCR
#1 Requirements
- Recognize the information in an image, such as a QR code
#2 Environment
macOS / Linux, Python 3.7.6
#3 Installation
#3.1 macOS
1. Install tesseract
// Install only tesseract, without the training tools
brew install tesseract
// Install tesseract together with the training tools
brew install --with-training-tools tesseract
// Install tesseract together with all languages. The language packs are large and take a long time to install, so this is not recommended; install languages on demand instead
brew install --all-languages tesseract
// Install tesseract together with the training tools and all languages
brew install --all-languages --with-training-tools tesseract
2. Download the language pack
Address: https://github.com/tesseract-ocr/tessdata
I installed the Simplified Chinese language pack here
Chinese language pack: https://github.com/tesseract-ocr/tessdata/blob/master/chi_sim.traineddata
Then copy the downloaded language pack (chi_sim.traineddata) to the following path:
/usr/local/Cellar/tesseract/4.0.0_1/share/tessdata
3. List the locally installed language packs
tesseract --list-langs
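The same check can be scripted from Python before running OCR. This is only a minimal sketch, assuming the tesseract binary is on PATH; the helper names parse_list_langs and installed_languages are hypothetical and are not part of the original post or of any library.

```python
import subprocess


def parse_list_langs(output):
    """Parse the text printed by `tesseract --list-langs` into language codes."""
    langs = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and the header line, which ends with a colon
        # (for example: List of available languages (2):).
        if not line or line.endswith(":"):
            continue
        langs.append(line)
    return langs


def installed_languages():
    """Run `tesseract --list-langs` and return the installed language codes.

    Assumption: the `tesseract` executable can be found on PATH.
    """
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # capture the list whichever stream it is printed to
        universal_newlines=True,
        check=True,
    )
    return parse_list_langs(result.stdout)
```

Checking that "chi_sim" is in the list returned by installed_languages() is a quick way to confirm the Chinese language pack was copied to the right directory.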
#3.2 Linux(CentOS)
1. Install the dependencies
yum install autoconf automake libtool libjpeg-devel libpng-devel libtiff-devel zlib-devel
2. Install leptonica
Download:
wget http://www.leptonica.org/source/leptonica-1.74.4.tar.gz
Unpack and install:
tar -xzvf leptonica-1.74.4.tar.gz
cd leptonica-1.74.4
./configure --prefix=/usr/local/leptonica
make
sudo make install
3. Install tesseract-ocr
wget https://github.com/tesseract-ocr/tesseract/archive/3.04.zip
unzip 3.04.zip
cd tesseract-3.04/
./autogen.sh
./configure
make && make install
sudo ldconfig
I installed the Simplified Chinese language pack here
Chinese language pack: https://github.com/tesseract-ocr/tessdata/blob/master/chi_sim.traineddata
Then copy the downloaded language pack (chi_sim.traineddata) to the following path:
/usr/local/share/tessdata
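The copy step can also be scripted. The following is a rough sketch only, assuming chi_sim.traineddata has already been downloaded as the raw file (the blob/master link above opens a GitHub web page, not the file itself) and that the target tessdata directory exists and is writable; install_traineddata is a hypothetical helper name.

```python
import shutil
from pathlib import Path


def install_traineddata(src, tessdata_dir):
    """Copy a downloaded .traineddata file into a tessdata directory.

    Returns the path of the copied file.
    """
    src = Path(src)
    tessdata_dir = Path(tessdata_dir)
    # Guard against copying the wrong file, e.g. a saved HTML page.
    if src.suffix != ".traineddata":
        raise ValueError("expected a .traineddata file, got: %s" % src.name)
    # The tessdata directory is created by the tesseract installation;
    # this sketch does not create it.
    if not tessdata_dir.is_dir():
        raise FileNotFoundError("tessdata directory not found: %s" % tessdata_dir)
    return Path(shutil.copy(str(src), str(tessdata_dir)))
```

For the Linux installation above, the call would be install_traineddata("chi_sim.traineddata", "/usr/local/share/tessdata"), run with sufficient permissions to write there.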
#4 Usage
#4.1 Install the pytesseract library
pip install pytesseract
pip install Pillow
#4.2 Python Code
from PIL import Image
import pytesseract
# Specify the image path and the recognition language
data = pytesseract.image_to_string(Image.open('/Users/Documents/1.png'), lang='chi_sim')
print(data)
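When an image mixes Chinese and English text, tesseract accepts several language codes joined with "+" (for example chi_sim+eng). The sketch below wraps the call above in a small function; build_lang_arg and recognize are hypothetical names, and it assumes every requested language pack is already installed.

```python
def build_lang_arg(langs):
    """Join language codes with '+', the form tesseract accepts for several languages."""
    langs = [code.strip() for code in langs if code.strip()]
    if not langs:
        raise ValueError("at least one language code is required")
    return "+".join(langs)


def recognize(image_path, langs=("chi_sim",)):
    """Run OCR on one image and return the text without surrounding whitespace."""
    # Imported here so build_lang_arg can be used without the OCR stack installed.
    from PIL import Image
    import pytesseract

    # The with-block closes the image file once recognition has finished.
    with Image.open(image_path) as image:
        text = pytesseract.image_to_string(image, lang=build_lang_arg(langs))
    return text.strip()
```

For example, recognize('/Users/Documents/1.png', langs=('chi_sim', 'eng')) would run the same image through both language packs, provided both appear in the output of tesseract --list-langs.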
#5 Online example
Address :
Copyright notice
This article was written by [Coxhuang]. Please include a link to the original when reposting. Thank you.