Complete portable pipeline for masking of Aadhaar Number adhering to Govt. Privacy Guidelines.

Overview

Aadhaar Number Masking Pipeline

Implementation of a complete pipeline that masks the Aadhaar number in given images, in line with the Govt. of India's privacy guidelines for storing Aadhaar card images in a digital repository. This project was carried out as an internship at Muthoot Finance. We use the open-source CRAFT text detector (Paper | Pretrained Model | Github Repo) provided by Clova AI Research for OSD, and combine a heuristic model with pytesseract OCR for masking.
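The masking step can be illustrated with a minimal sketch. It assumes that text detection and OCR have already produced word boxes with recognised text, that an Aadhaar number is printed as three 4-digit groups on one text line, and that the first eight digits are the ones to hide. The names `WordBox` and `boxes_to_mask` are hypothetical and are not taken from masker.py; the heuristics in the repository may differ.

```python
import re
from typing import List, NamedTuple


class WordBox(NamedTuple):
    """Hypothetical container for one OCR'd word and its bounding box."""
    text: str
    x: int  # left edge in pixels
    y: int  # top edge in pixels
    w: int  # width in pixels
    h: int  # height in pixels


FOUR_DIGITS = re.compile(r"\d{4}")


def boxes_to_mask(words: List[WordBox], line_tol: int = 10) -> List[WordBox]:
    """Return the boxes covering the first 8 digits of each Aadhaar-like number.

    Assumption: the number appears as three 4-digit words on one text line.
    Words whose top edges lie within `line_tol` pixels of the first word of a
    line are treated as belonging to that line.
    """
    # Keep only words that are exactly four digits, ordered top to bottom.
    groups = sorted(
        (w for w in words if FOUR_DIGITS.fullmatch(w.text)),
        key=lambda w: w.y,
    )

    # Cluster the digit groups into text lines by vertical position.
    lines: List[List[WordBox]] = []
    for box in groups:
        if lines and abs(box.y - lines[-1][0].y) <= line_tol:
            lines[-1].append(box)
        else:
            lines.append([box])

    # On each line with at least three groups, mask the two leftmost groups.
    to_mask: List[WordBox] = []
    for line in lines:
        line.sort(key=lambda w: w.x)
        if len(line) >= 3:
            to_mask.extend(line[:2])
    return to_mask
```

Each returned box could then be painted over in the image, for example with OpenCV's `cv2.rectangle(image, (x, y), (x + w, y + h), color, -1)`, where a thickness of -1 fills the rectangle.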

Rohit Ranjan, Ram Sundaram.

Sample Results

[Image: teaser showing sample masking results]

Versions

The search for the best masking pipeline led us to experiment with several different approaches. We have documented our experiments in other branches.

Branch (-> model) | Speed/Performance | Pipeline
main | Best performing | CRAFT + pytesseract + dimensional heuristics
CNN_OCR -> cnn_model | Fastest masking | CRAFT + LeNet trained by us
CNN_OCR -> cnn_model_2 | Fastest masking | CRAFT + LeNet trained by us
UNET_OCDR | Theoretically fastest, but trained model unavailable** | UNet

**We proposed and implemented a pipeline that uses a single UNet model to produce the desired mask. A single model would have made inference very fast and capable of real-time use on mobile devices. Training it meant creating our own dataset, since the company could not legally provide us with the data we needed. After several trials we halted work on this model: with barely 150 unique data points available, there is not enough data to train a data-hungry UNet for now.

Datasets

LeNet was trained on our self-created labelled dataset and its labels.

Getting started

Install dependencies

Requirements

  • torch
  • opencv-python
  • tesseract-ocr
  • see requirements.txt for the full list
pip install -r requirements.txt
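After installing, it can help to confirm that the environment is complete. pytesseract is only a wrapper and needs the `tesseract` executable on the PATH, which pip may not provide. The following is a minimal sketch of such a check; the function name `check_environment` is hypothetical and not part of this repository, and the module names are assumed from the requirements listed above (`cv2` is the module installed by opencv-python).

```python
import importlib.util
import shutil
from typing import Dict, Sequence

# Import names assumed from the requirements above.
PYTHON_MODULES = ("torch", "cv2", "pytesseract")


def check_environment(
    modules: Sequence[str] = PYTHON_MODULES,
    binary: str = "tesseract",
) -> Dict[str, bool]:
    """Report which modules are importable and whether the binary is on PATH."""
    status = {
        name: importlib.util.find_spec(name) is not None for name in modules
    }
    status[binary + " (binary)"] = shutil.which(binary) is not None
    return status


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

Anything reported as MISSING should be installed before running the masker.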

Test instructions

  • Clone this repository
git clone https://github.com/thefurorjuror/Aadhaar_Masker.git
  • Run on an image folder
python [folder path to the cloned repo]/masker.py --test_folder=[folder path to test images] --output_folder=[folder path to output images] --cuda=[True/False]
# Example: when run from inside the cloned repo
python masker.py --test_folder=./images/ --output_folder=./output/ --cuda=True

cuda is set to False by default.
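How masker.py parses these flags is not shown here. As an illustration only, flags of this shape could be parsed with argparse as sketched below. The `str2bool` helper and `build_parser` function are hypothetical names; the helper is needed because `type=bool` would treat the string "False" as True.

```python
import argparse


def str2bool(value: str) -> bool:
    """Convert 'True'/'False'-style strings to a bool."""
    lowered = value.strip().lower()
    if lowered in ("true", "yes", "1"):
        return True
    if lowered in ("false", "no", "0"):
        return False
    raise argparse.ArgumentTypeError(f"expected True or False, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the flags used in the commands above."""
    parser = argparse.ArgumentParser(description="Aadhaar masking (illustrative sketch)")
    parser.add_argument("--test_folder", required=True, help="folder of input images")
    parser.add_argument("--output_folder", required=True, help="folder for masked images")
    parser.add_argument("--cuda", type=str2bool, default=False, help="run inference on the GPU")
    return parser


# Parse the same arguments as the example command above.
args = build_parser().parse_args(
    ["--test_folder=./images/", "--output_folder=./output/", "--cuda=True"]
)
print(args.test_folder, args.output_folder, args.cuda)
```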

Citation

@inproceedings{baek2019character,
  title={Character Region Awareness for Text Detection},
  author={Baek, Youngmin and Lee, Bado and Han, Dongyoon and Yun, Sangdoo and Lee, Hwalsuk},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={9365--9374},
  year={2019}
}

License

Copyright (c) 2021-present Rohit Ranjan & Ram Sundaram.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.