An expandable and scalable OCR pipeline

Related tags

Computer Visionnidaba
Overview

Overview

Nidaba is the central controller for the entire OGL OCR pipeline. It oversees and automates the process of converting raw images into citable collections of digitized texts.

It offers the following functionality:

  • Grayscale Conversion
  • Binarization utilizing Sauvola adaptive thresholding, Otsu, or ocropus's nlbin algorithm
  • Deskewing
  • Dewarping
  • Integration of tesseract, kraken, and ocropus OCR engines
  • Page segmentation from the aforementioned OCR packages
  • Various postprocessing utilities like spell-checking, merging of multiple results, and ground truth comparison.

As it is designed to use a common storage medium on network attached storage and the celery distributed task queue it scales nicely to multi-machine clusters.

Build

To easiest way to install the latest stable(-ish) nidaba is from PyPi:

$ pip install nidaba

or run:

$ pip install .

in the git repository for the bleeding edge development version.

Some useful tasks have external dependencies. A good start is:

# apt-get install libtesseract3 tesseract-ocr-eng libleptonica-dev liblept

Tests

Per default no dictionaries and OCR models necessary to runs the tests are installed. To download the necessary files run:

$ python setup.py download
$ python setup.py nosetests

Tests for modules that call external programs, at the time only tesseract, ocropus, and kraken, will be skipped if these aren't installed.

Running

First edit (the installed) nidaba.yaml and celery.yaml to fit your needs. Have a look at the docs if you haven't set up a celery-based application before.

Then start up the celery daemon with something like:

$ celery -A nidaba worker

Next jobs can be added to the pipeline using the nidaba executable:

$ nidaba batch -b otsu -l tesseract -o tesseract:eng -- ./input.tiff
Preparing filestore             [✓]
Building batch                  [✓]
951c57e5-f8a0-432d-8d77-8a2e27fff53c

Using the return code the current state of the job can be retrieved:

$ nidaba status 25d79a54-9d4a-4939-acb6-8e168d6dbc7c
PENDING

When the job has been processed the status command will return a list of paths containing the final output:

$ nidaba status 951c57e5-f8a0-432d-8d77-8a2e27fff53c
SUCCESS
14.tif → .../input_img.rgb_to_gray_binarize.otsu_ocr.tesseract_grc.tif.hocr

Documentation

Want to learn more? Read the Docs

Tensorflow-based CNN+LSTM trained with CTC-loss for OCR

Overview This collection demonstrates how to construct and train a deep, bidirectional stacked LSTM using CNN features as input with CTC loss to perfo

Jerod Weinman 489 Dec 21, 2022
This repo contains several opencv projects done while learning opencv in python.

opencv-projects-python This repo contains both several opencv projects done while learning opencv by python and opencv learning resources [Basic conce

Fatin Shadab 2 Nov 03, 2022
Recognizing cropped text in natural images.

ASTER: Attentional Scene Text Recognizer with Flexible Rectification ASTER is an accurate scene text recognizer with flexible rectification mechanism.

Baoguang Shi 681 Jan 02, 2023
The code for “Oriented RepPoints for Aerail Object Detection”

Oriented RepPoints for Aerial Object Detection The code for the implementation of “Oriented RepPoints”, Under review. (arXiv preprint) Introduction Or

WentongLi 207 Dec 24, 2022
Python rubik's cube solver

This program makes a 3D representation of a rubiks cube and solves it step by step.

Pablo QB 4 May 29, 2022
天池2021"全球人工智能技术创新大赛"【赛道一】:医学影像报告异常检测 - 第三名解决方案

天池2021"全球人工智能技术创新大赛"【赛道一】:医学影像报告异常检测 比赛链接 个人博客记录 目录结构 ├── final------------------------------------决赛方案PPT ├── preliminary_contest--------------------

19 Aug 17, 2022
Histogram specification using openCV in python .

histogram specification using openCV in python . Have to input miu and sigma to draw gausssian distribution which will be used to map the input image . Example input can be miu = 128 sigma = 30

Tamzid hasan 6 Nov 17, 2021
基于图像识别的开源RPA工具,理论上可以支持所有windows软件和网页的自动化

SimpleRPA 基于图像识别的开源RPA工具,理论上可以支持所有windows软件和网页的自动化 简介 SimpleRPA是一款python语言编写的开源RPA工具(桌面自动控制工具),用户可以通过配置yaml格式的文件,来实现桌面软件的自动化控制,简化繁杂重复的工作,比如运营人员给用户发消息,

Song Hui 7 Jun 26, 2022
利用Paddle框架复现CRAFT

CRAFT-Paddle 利用Paddle框架复现CRAFT CRAFT 本项目基于paddlepaddle框架复现CRAFT,并参加百度第三届论文复现赛,将在2021年5月15日比赛完后提供AIStudio链接~敬请期待 参考项目: CRAFT: Character-Region Awarenes

QuanHao Guo 2 Mar 07, 2022
Generates a message from the infamous Jerma Impostor image

Generate your very own jerma sus imposter message. Modes: Default Mode: Only supports the characters " ", !, a, b, c, d, e, h, i, m, n, o, p, q, r, s,

Giorno420 1 Oct 27, 2022
huoyijie 1.2k Dec 29, 2022
This repository summarized computer vision theories.

This repository summarized computer vision theories.

3 Feb 04, 2022
A semi-automatic open-source tool for Layout Analysis and Region EXtraction on early printed books.

LAREX LAREX is a semi-automatic open-source tool for layout analysis on early printed books. It uses a rule based connected components approach which

162 Jan 05, 2023
Repository for Scene Text Detection with Supervised Pyramid Context Network with tensorflow.

Scene-Text-Detection-with-SPCNET Unofficial repository for [Scene Text Detection with Supervised Pyramid Context Network][https://arxiv.org/abs/1811.0

121 Oct 15, 2021
CNN+LSTM+CTC based OCR implemented using tensorflow.

CNN_LSTM_CTC_Tensorflow CNN+LSTM+CTC based OCR(Optical Character Recognition) implemented using tensorflow. Note: there is No restriction on the numbe

Watson Yang 356 Dec 08, 2022
OCR-D-compliant page segmentation

ocrd_segment This repository aims to provide a number of OCR-D-compliant processors for layout analysis and evaluation. Installation In your virtual e

OCR-D 59 Sep 10, 2022
A webcam-based 3x3x3 rubik's cube solver written in Python 3 and OpenCV.

Qbr Qbr, pronounced as Cuber, is a webcam-based 3x3x3 rubik's cube solver written in Python 3 and OpenCV. 🌈 Accurate color detection 🔍 Accurate 3x3x

Kim 金可明 502 Dec 29, 2022
make a better chinese character recognition OCR than tesseract

deep ocr See README_en.md for English installation documentation. 只在ubuntu下面测试通过,需要virtualenv安装,安装路径可自行调整: git clone https://github.com/JinpengLI/deep

Jinpeng 1.5k Dec 28, 2022
Official implementation of "An Image is Worth 16x16 Words, What is a Video Worth?" (2021 paper)

An Image is Worth 16x16 Words, What is a Video Worth? paper Official PyTorch Implementation Gilad Sharir, Asaf Noy, Lihi Zelnik-Manor DAMO Academy, Al

213 Nov 12, 2022
Handwritten_Text_Recognition

Deep Learning framework for Line-level Handwritten Text Recognition Short presentation of our project Introduction Installation 2.a Install conda envi

24 Jul 15, 2022