Code for the paper "DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks" (ICCV '19)

Overview

DewarpNet

This repository contains the codes for DewarpNet training.

Recent Updates

  • [May, 2020] Added evaluation images and an important note about Matlab SSIM.
  • [Dec, 2020] Added OCR evaluation details.

Training

  • Prepare Data: train.txt & val.txt. Contents should be like:
1/824_8-cp_Page_0503-7Ns0001
1/824_1-cp_Page_0504-2Cw0001
  • Train Shape Network: python trainwc.py --arch unetnc --data_path ./data/DewarpNet/doc3d/ --batch_size 50 --tboard
  • Train Texture Mapping Network: python trainbm.py --arch dnetccnl --img_rows 128 --img_cols 128 --img_norm --n_epoch 250 --batch_size 50 --l_rate 0.0001 --tboard --data_path ./DewarpNet/doc3d

Inference:

  • Run: python infer.py --wc_model_path ./eval/models/unetnc_doc3d.pkl --bm_model_path ./eval/models/dnetccnl_doc3d.pkl --show

Evaluation (Image Metrics):

  • We use the same evaluation code as DocUNet. To reproduce the quantitative results reported in the paper use the images available here.

  • [Important note about Matlab version] We noticed that Matlab 2020a uses a different SSIM implementation which gives a better MS-SSIM score (0.5623). Whereas we have used Matlab 2018b. Please compare the scores according to your Matlab version.

Evaluation (OCR Metrics):

  • The 25 images used for OCR evaluation is /eval/ocr_eval/ocr_files.txt
  • The corresponding ground-truth text is given in /eval/ocr_eval/tess_gt.json
  • For the OCR errors reported in the paper we had used cv2.blur as pre-processing which gives higher error in all the cases. For convenience, we provide the updated numbers (without using blur) in the following table:
Method ED CER ED (no blur) CER (no blur)
DocUNet 1975.86 0.4656(0.263) 1671.80 0.403 (0.256)
DocUNet on Doc3D 1684.34 0.3955 (0.272) 1296.00 0.294 (0.235)
DewarpNet 1288.60 0.3136 (0.248) 1007.28 0.249 (0.236)
DewarpNet (ref) 1114.40 0.2692 (0.234) 812.48 0.204 (0.228)
  • We had used the Tesseract (v4.1.0) default configuration for evaluation with PyTesseract (v0.2.6).

Models:

  • Pre-trained models are available here. These models are captured prior to end-to-end training, thus won't give you the end-to-end results reported in Table 2 of the paper. Use the images provided above to get the exact numbers as Table 2.

Dataset:

  • The doc3D dataset can be downloaded using the scripts here.

More Stuff:

Citation:

If you use the dataset or this code, please consider citing our work-

@inproceedings{SagnikKeICCV2019, 
Author = {Sagnik Das*, Ke Ma*, Zhixin Shu, Dimitris Samaras, Roy Shilkrot}, 
Booktitle = {Proceedings of International Conference on Computer Vision}, 
Title = {DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks}, 
Year = {2019}}   

Acknowledgements:

Owner
[email protected]
Computer Vision Lab at Stony Brook University
<a href=[email protected]">
Document manipulation detection with python

image manipulation detection task: -- tianchi function image segmentation salie

JiaKui Hu 3 Aug 22, 2022
7th place solution

SIIM-FISABIO-RSNA-COVID-19-Detection 7th place solution Validation: We used iterative-stratification with 5 folds (https://github.com/trent-b/iterativ

11 Jul 17, 2022
Simple SDF mesh generation in Python

Generate 3D meshes based on SDFs (signed distance functions) with a dirt simple Python API.

Michael Fogleman 1.1k Jan 08, 2023
Satoshi is a discord bot template in python using discord.py that allow you to track some live crypto prices with your own discord bot.

Satoshi ~ DiscordCryptoBot Satoshi is a simple python discord bot using discord.py that allow you to track your favorites cryptos prices with your own

Théo 2 Sep 15, 2022
A facial recognition program that plays a alarm (mp3 file) when a person i seen in the room. A basic theif using Python and OpenCV

Home-Security-Demo A facial recognition program that plays a alarm (mp3 file) when a person is seen in the room. A basic theif using Python and OpenCV

SysKey 4 Nov 02, 2021
This is an API written in python that uses FastAPI. It is a simple API that can detect discord tokens in Images.

Welcome This is an API written in python that uses FastAPI. It is a simple API that can detect discord tokens in Images. Installation There are curren

8 Jul 29, 2022
The project is an official implementation of our paper "3D Human Pose Estimation with Spatial and Temporal Transformers".

3D Human Pose Estimation with Spatial and Temporal Transformers This repo is the official implementation for 3D Human Pose Estimation with Spatial and

Ce Zheng 363 Dec 28, 2022
A tensorflow implementation of EAST text detector

EAST: An Efficient and Accurate Scene Text Detector Introduction This is a tensorflow re-implementation of EAST: An Efficient and Accurate Scene Text

2.9k Jan 02, 2023
A simple Security Camera created using Opencv in Python where images gets saved in realtime in your Dropbox account at every 5 seconds

Security Camera using Opencv & Dropbox This is a simple Security Camera created using Opencv in Python where images gets saved in realtime in your Dro

Arpit Rath 1 Jan 31, 2022
Generate a list of papers with publicly available source code in the daily arxiv

2021-06-08 paper code optimal network slicing for service-oriented networks with flexible routing and guaranteed e2e latency networkslicing multi-moda

79 Jan 03, 2023
Line based ATR Engine based on OCRopy

OCR Engine based on OCRopy and Kraken using python3. It is designed to both be easy to use from the command line but also be modular to be integrated

948 Dec 23, 2022
POT : Python Optimal Transport

This open source Python library provide several solvers for optimization problems related to Optimal Transport for signal, image processing and machine learning.

Python Optimal Transport 1.7k Jan 04, 2023
The code of "Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes"

Mask TextSpotter A Pytorch implementation of Mask TextSpotter along with its extension can be find here Introduction This is the official implementati

Pengyuan Lyu 261 Nov 21, 2022
OCR system for Arabic language that converts images of typed text to machine-encoded text.

Arabic OCR OCR system for Arabic language that converts images of typed text to machine-encoded text. The system currently supports only letters (29 l

Hussein Youssef 144 Jan 05, 2023
Handwritten Number Recognition using CNN and Character Segmentation

Handwritten-Number-Recognition-With-Image-Segmentation Info About this repository This Repository is aimed at reading handwritten images of numbers an

Sparsha Saha 17 Aug 25, 2022
It is a image ocr tool using the Tesseract-OCR engine with the pytesseract package and has a GUI.

OCR-Tool It is a image ocr tool made in Python using the Tesseract-OCR engine with the pytesseract package and has a GUI. This is my second ever pytho

Khant Htet Aung 4 Jul 11, 2022
Papers, Datasets, Algorithms, SOTA for STR. Long-time Maintaining

Scene Text Recognition Recommendations Everythin about Scene Text Recognition SOTA • Papers • Datasets • Code Contents 1. Papers 2. Datasets 2.1 Synth

Deep Learning and Vision Computing Lab, SCUT 197 Jan 05, 2023
A python screen recorder for low-end computers, provides high quality video output.

RecorderX - v1.0 A screen recorder made in Python with the help of OpenCv, it has ability to record your screen in high quality. No matter what your P

Priyanshu Jindal 4 Nov 10, 2021
Camera Intrinsic Calibration and Hand-Eye Calibration in Pybullet

This repository is mainly for camera intrinsic calibration and hand-eye calibration. Synthetic experiments are conducted in PyBullet simulator. 1. Tes

CAI Junhao 7 Oct 03, 2022
SRA's seminar on Introduction to Computer Vision Fundamentals

Introduction to Computer Vision This repository includes basics to : Python Numpy: A python library Git Computer Vision. The aim of this repository is

Society of Robotics and Automation 147 Dec 04, 2022