[OpenCV learning] Receipt recognition based on perspective transformation and OCR
2022-06-12 23:15:00 【A sea of stars】
Building on the perspective transformation and OCR recognition covered earlier, this article puts together a simple receipt recognizer. The full script is as follows:
import cv2
import numpy as np
from PIL import Image
import pytesseract as tess
dsize = (55, 88)  # Uniform target size (not used in this script)
# Display an image; wrapped in a function for reuse
def cv_show_image(name, img):
    cv2.imshow(name, img)
    cv2.waitKey(0)  # Wait time in milliseconds; 0 waits until any key is pressed
    cv2.destroyAllWindows()
# =========================================================
# ================ Read and preprocess the image =========================
# =========================================================
# Read the original color image
ocr_img = cv2.imread('images/ocr_qr_code.PNG')
if ocr_img is None:
    raise FileNotFoundError('Could not read images/ocr_qr_code.PNG')
h_src, w_src, c_src = ocr_img.shape
# Convert to grayscale
ocr_img_gray = cv2.cvtColor(ocr_img, cv2.COLOR_BGR2GRAY)
# cv_show_image('ocr_img_gray', ocr_img_gray)
# Gaussian blur to suppress noise
ocr_img_gray = cv2.GaussianBlur(ocr_img_gray, (3, 3), 1)
# Binarize: pixels brighter than 200 become white (255), the rest black (0)
ret, ocr_img_thresh = cv2.threshold(ocr_img_gray, 200, 255, cv2.THRESH_BINARY)
cv_show_image('template_thresh', ocr_img_thresh)
# Find the contours; RETR_EXTERNAL keeps only the outermost ones
ocr_img_contours, hierarchy = cv2.findContours(ocr_img_thresh,
                                               cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
# =========================================================
# ================ Find the contour with the largest area =========================
# =========================================================
# Find the contour with the largest area
draw_img = ocr_img.copy()  # Copy used only for intermediate visual checks
# Sort by area in descending order; index [0] takes the largest contour
# (slicing with [:3] instead would keep the three largest).
cont_max = sorted(ocr_img_contours, key=cv2.contourArea, reverse=True)[0]
# Bounding rectangle of the largest contour; w and h set the target size later
x, y, w, h = cv2.boundingRect(cont_max)
# Draw the largest contour in red (drawContours draws onto the image passed in)
draw_img = cv2.drawContours(draw_img, [cont_max], -1, color=(0, 0, 255), thickness=2)
arcLength = cv2.arcLength(cont_max, True)  # Perimeter of the largest contour
# The raw contour may contain many points along curved edges, but we only need a
# quadrilateral with four vertices, so the contour has to be approximated.
# Keep raising the threshold, which loosens the approximation and reduces the number of edges.
rate = 0.01
approx_max = None
while rate < 1.0:  # Bounded, so the loop cannot run forever
    # epsilon is the maximum distance from the original contour to the approximated one,
    # i.e. the approximation threshold. closed=True means the contour is closed.
    approx_max = cv2.approxPolyDP(cont_max, epsilon=rate * arcLength, closed=True)
    if len(approx_max) == 4:
        print("rate={}, epsilon={}".format(rate, rate * arcLength))
        break
    rate += 0.01
if len(approx_max) != 4:
    raise RuntimeError("Could not approximate the largest contour with four vertices")
print("approx: ", approx_max)
# Draw the approximated contour in green
draw_img = cv2.drawContours(draw_img, [approx_max], -1, color=(0, 255, 0), thickness=2)
cv_show_image('rectangle_contours_img', draw_img)
del draw_img
# =========================================================
# ================ With the four vertices, apply the perspective transformation =========================
# =========================================================
# Sort the four vertices first, in the order (top left, top right, bottom right, bottom left).
# These four points will eventually map to ((0,0), (w,0), (w,h), (0,h)), translated by the top-left vertex.
def sort_dotCnt(kps):
    rect = np.zeros((4, 2), dtype='float32')
    s = kps.sum(axis=1)
    # Top left has the smallest x + y, bottom right the largest
    rect[0] = kps[np.argmin(s)]
    rect[2] = kps[np.argmax(s)]
    # Top right has the smallest y - x, bottom left the largest
    diff = np.diff(kps, axis=1)
    rect[1] = kps[np.argmin(diff)]
    rect[3] = kps[np.argmax(diff)]
    return rect
print(approx_max.shape)
print(approx_max.reshape(4, 2))
rect_ordered = sort_dotCnt(approx_max.reshape(4, 2))
(top_left, top_right, bottom_right, bottom_left) = rect_ordered
# The four vertices of the object in the source image
pts_src = np.array([top_left, top_right, bottom_right, bottom_left], dtype="float32")
# The four vertices of the object in the destination image
pts_dst = np.array([(0 + top_left[0], 0 + top_left[1]),
                    (w + top_left[0], 0 + top_left[1]),
                    (w + top_left[0], h + top_left[1]),
                    (0 + top_left[0], h + top_left[1])], dtype="float32")
# Compute the 3x3 transformation matrix from the four point correspondences
M = cv2.getPerspectiveTransform(pts_src, pts_dst)
# Apply the homography to warp the source image into the destination image
im_out = cv2.warpPerspective(ocr_img_thresh, M, (w_src, h_src))
cv_show_image('im_out', im_out)
# =========================================================
# ================ Recognize the digits =========================
# =========================================================
textInImage = Image.fromarray(im_out)
text = tess.image_to_string(textInImage)
print("\nocr detect result:%s" % text)
The original image after preprocessing.
After contour detection, the four-vertex outline is obtained and drawn with green lines.
After the perspective transformation.
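The vertex ordering used by `sort_dotCnt` can be checked without any image. The sketch below orders four shuffled corners and, as a variation on the script (which uses the bounding rectangle's `w` and `h` and keeps the full source canvas), derives the output size from the quadrilateral's own edge lengths. `order_corners` and `destination_rect` are hypothetical names, and the sample points are made up.

```python
import numpy as np


def order_corners(pts):
    """Order 4 points as top-left, top-right, bottom-right, bottom-left."""
    pts = np.asarray(pts, dtype="float32").reshape(4, 2)
    s = pts.sum(axis=1)        # x + y: smallest at top-left, largest at bottom-right
    d = pts[:, 1] - pts[:, 0]  # y - x: smallest at top-right, largest at bottom-left
    return np.array([pts[np.argmin(s)], pts[np.argmin(d)],
                     pts[np.argmax(s)], pts[np.argmax(d)]], dtype="float32")


def destination_rect(ordered):
    """Build an upright rectangle sized from the longer opposite edges."""
    tl, tr, br, bl = ordered
    width = int(round(max(np.linalg.norm(tr - tl), np.linalg.norm(br - bl))))
    height = int(round(max(np.linalg.norm(bl - tl), np.linalg.norm(br - tr))))
    dst = np.array([[0, 0], [width - 1, 0],
                    [width - 1, height - 1], [0, height - 1]], dtype="float32")
    return dst, (width, height)


shuffled = [(230, 300), (50, 40), (30, 280), (250, 60)]
ordered = order_corners(shuffled)
dst, size = destination_rect(ordered)
print(ordered)
print(size)
```

With these values, `cv2.getPerspectiveTransform(ordered, dst)` followed by `cv2.warpPerspective(img, M, size)` would give an output cropped to the receipt alone.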
Finally, the OCR recognition result.
For now, the digits can be recognized. Next time, I will learn how to recognize simplified Chinese characters.
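For both digits and the planned simplified Chinese step, `pytesseract.image_to_string` accepts a `lang` argument and a `config` string that is passed to Tesseract. The sketch below is one possible way to wire that up; `build_tesseract_config` and `recognize` are hypothetical helpers. It assumes the matching language data (for example `chi_sim`) is installed, and whether the character whitelist is honored depends on the Tesseract version and engine.

```python
def build_tesseract_config(psm=6, whitelist=None):
    """Assemble a Tesseract config string (hypothetical helper)."""
    parts = ["--psm {}".format(psm)]  # page segmentation mode
    if whitelist:
        # Restrict the output alphabet, e.g. to digits only
        parts.append("-c tessedit_char_whitelist={}".format(whitelist))
    return " ".join(parts)


def recognize(image, lang="eng", psm=6, whitelist=None):
    """Run OCR on a PIL image; needs pytesseract and the Tesseract binary."""
    import pytesseract  # imported lazily so the config helper works without it
    return pytesseract.image_to_string(
        image, lang=lang, config=build_tesseract_config(psm, whitelist))


print(build_tesseract_config(6, "0123456789"))
```

In the script above this could be called as `recognize(Image.fromarray(im_out), whitelist="0123456789")` for digits, or with `lang="chi_sim"` for simplified Chinese.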