当前位置:网站首页>有了这款工具,自动化识别验证码再也不是问题
有了这款工具,自动化识别验证码再也不是问题
2022-06-29 01:06:00 【小梧敲代码】
环境准备
1、windows 环境下载 exe
http://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-setup-4.00.00dev.exe
双击 exe,一路 next 完成 Tesseract-OCR 安装


2、配置环境变量
PATH 增加 D:\ProgramFiles\Tesseract-OCR
新建环境变量 TESSDATA_PREFIX 值为
D:\ProgramFiles\Tesseract-OCR\tessdata
这是将语言字库文件夹添加到环境变量 TESSDATA_PREFIX 中
CMD 命令行窗口输入如下命令:
查看版本号
C:\Users\18611>tesseract -v
tesseract 4.00.00alpha
leptonica-1.74.1
libgif 4.1.6(?) : libjpeg 8d (libjpeg-turbo 1.5.0) : libpng 1.6.20: libtiff 4.0.6 : zlib 1.2.8 :
libwebp 0.4.3 : libopenjp2 2.1.0
查看支持的语言包
C:\Users\18611>tesseract --list-langs
List of available languages (2):
eng
osd
C:\Users\18611>
命令识别图片
识别如下图片验证码

使用 tesseract 命令识别图片中的内容
C:\Users\18611>cd Desktop
C:\Users\18611\Desktop>tesseract test2.png output
Tesseract Open Source OCR Engine v4.00.00alpha with Leptonica
C:\Users\18611\Desktop>
【语法】:tesseract imagename outputbase [-l lang] [-psm pagesegmode] [configfile…]
imagename 为目标图片文件名,需加格式后缀;
outputbase 是转换结果文件名;
lang 是语言名称(在 Tesseract-OCR 中 tessdata 文件夹可看到以 eng 开头的语言文件 eng.traineddata),如不标-l eng 则默认为 eng。
java自动识别图片
将 tesseract.exe 命令保存为 bat 文件,bat 内容为:
//图片路径 D:\Tesseract-OCR\test.png 生成 txt 文件存放路径及文件名 result
代码实现如下:
package com.mtx.util;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStreamReader;
/** * @ClassName ReadCpacha * @Description TODO * @Author 彩虹 rainbow QQ3130978832 * @Date-Time 2022/6/9 13:55 * @ProjectName MtxPublic * @Copyright 北京码同学网络科技有限公司 **/
public class ReadCpacha{
public static String readPic(){
String cmd= "cmd /c start D:\\Tesseract-OCR\\tesseract.bat";
try {
Runtime.getRuntime().exec(cmd);
} catch(Exception e) {
e.printStackTrace();
}
try {
//线程阻塞 3 秒等待 tesseract.exe 执行完成
Thread.sleep(3000);
}catch (InterruptedException e) {
e.printStackTrace();
}
//执行 tesseract.exe 识别图片后生成 result.txt 文件中保存识别后验证码
//读取 result.txt 文件获取验证码
// ReadTxt
BufferedReader bufferedReader = new BufferedReader(inputStreamReader);
StringBuffer sb= new StringBuffer();
String text = null;
while((text = bufferedReader.readLine()) != null){
//逐行读取到的字符串存到 StringBuffer 对象
sb.append(text);
}
return sb.toString();
}catch (Exception e) {
e.printStackTrace();
}
}
return null;
}
public static void main(String[] args) {
String str = readPic();//调用封装方法测试
System.out.println(str);
}
}
C:\Users\18611\IdeaProjects\MtxPublic>tesseract --help-psm
Page segmentation modes:
0 Orientation and script detection (OSD) only.
1 Automatic page segmentation with OSD.
2 Automatic page segmentation, but no OSD, or OCR.
3 Fully automatic page segmentation, but no OSD. (Default)
4 Assume a single column of text of variable sizes.
5 Assume a single uniform block of vertically aligned text.
6 Assume a single uniform block of text.
7 Treat the image as a single text line.
8 Treat the image as a single word.
9 Treat the image as a single word in a circle.
10 Treat the image as a single character.
11 Sparse text. Find as much text as possible in no particular order.
12 Sparse text with OSD.
13 Raw line. Treat the image as a single text line, bypassing hacks that are Tesseract-specific.
C:\Users\18611\IdeaProjects\MtxPublic>
最后感谢每一个认真阅读我文章的人,下面这个网盘链接也是我费了几天时间整理的非常全面的,希望也能帮助到有需要的你!

这些资料,对于想转行做【软件测试】的朋友来说应该是最全面最完整的备战仓库,这个仓库也陪伴我走过了最艰难的路程,希望也能帮助到你!凡事要趁早,特别是技术行业,一定要提升技术功底。希望对大家有所帮助……
如果你不想一个人野蛮生长,找不到系统的资料,问题得不到帮助,坚持几天便放弃的感受的话,可以点击下方小卡片加入我们群,大家可以一起讨论交流,里面会有各种软件测试资料和技术交流。
| 点击文末小卡片领取 |
敲字不易,如果此文章对你有帮助的话,点个赞收个藏来个关注,给作者一个鼓励。也方便你下次能够快速查找。
自学推荐B站视频:
零基础转行软件测试:25天从零基础转行到入职软件测试岗,今天学完,明天就业。【包括功能/接口/自动化/python自动化测试/性能/测试开发】
边栏推荐
- 学习通否认 QQ 号被盗与其有关:已报案;iPhone 14 量产工作就绪:四款齐发;简洁优雅的软件早已是明日黄花|极客头条
- Is it safe to open a securities account at qiniu business school in 2022?
- Blazor University (34)表单 —— 获得表单状态
- 独家分析 | 关于简历和面试
- Depth first search to realize the problem of catching cattle
- 不同的子序列问题I
- IT治理方面的七个错误,以及如何避免
- 【火灾检测】基于matlab GUI森林火灾检测系统(带面板)【含Matlab源码 1921期】
- Uvm:field automation mechanism
- [staff] pedal mark (step on pedal ped mark | release pedal * mark | corresponding pedal command in MIDI | continuous control signal | switch control signal)
猜你喜欢

SRAM和DRAM之间的异同

Reasons for high price of optical fiber slip ring

学习通否认 QQ 号被盗与其有关:已报案;iPhone 14 量产工作就绪:四款齐发;简洁优雅的软件早已是明日黄花|极客头条

Xuetong denies that the theft of QQ number is related to it: it has been reported; IPhone 14 is ready for mass production: four models are launched simultaneously; Simple and elegant software has long

Successfully solved (machine learning data segmentation problem): modulenotfounderror: no module named 'sklearn cross_ validation‘

What is the difference between the history and Western blotting
![[Fire Detection] forest fire detection system based on Matlab GUI (with panel) [including Matlab source code phase 1921]](/img/fa/17731d46a70112faadbfd0e61be143.png)
[Fire Detection] forest fire detection system based on Matlab GUI (with panel) [including Matlab source code phase 1921]

Précautions d'installation et d'utilisation des joints rotatifs

统计学习方法(2/22)感知机

BMFONT制作位图字体并在CocosCreator中使用
随机推荐
Cocoscrreator dynamically switches skeletondata to update bones
FSS object storage how to access the Intranet
Jbridge bridging frame technology for AI computing power landing
What is the reason for the service crash caused by replacing the easycvr cluster version with the old database?
Redis common command manual
2022_ 2_ 16 the second day of learning C language_ Constant, variable
EasyCVR集群版本替换成老数据库造成的服务崩溃是什么原因?
Daily English articles, reading accumulation
[staff] pedal mark (step on pedal ped mark | release pedal * mark | corresponding pedal command in MIDI | continuous control signal | switch control signal)
同期群分析是什么?教你用 SQL 来搞定
浏览器缓存库设计总结(localStorage/indexedDB)
Mapbox GL loading local publishing DEM data
使用.Net驱动Jetson Nano的OLED显示屏
Redis常用命令手册
【温度检测】基于matlab GUI热红外图像温度检测系统【含Matlab源码 1920期】
How to solve the problem of Caton screen when easycvr plays video?
Bmfont make bitmap font and use it in cocoscreator
深度优先搜索实现抓牛问题
统计学习方法(4/22)朴素贝叶斯
Precautions for installation and use of rotary joint