Image transformations designed for Scene Text Recognition (STR) data augmentation. Published at the ICCV 2021 Workshop on Interactive Labeling and Data Augmentation for Vision.

Last update: Dec 28, 2022

Overview

Data Augmentation for Scene Text Recognition (ICCV 2021 Workshop)

(Pronounced as "strog")

Paper

arXiv

Why does it matter?

Scene Text Recognition (STR) requires data augmentation functions that differ from those used for object recognition. STRAug provides data augmentation designed specifically for STR. It offers 36 augmentation functions sorted into 8 groups. Each function supports 3 levels (magnitudes) of severity.

Given a source image, it can be transformed as follows:

warp.py - to generate Curve, Distort, Stretch (or Elastic) deformations

`Curve`	`Distort`	`Stretch`

geometry.py - to generate Perspective, Rotation, Shrink deformations

`Perspective`	`Rotation`	`Shrink`

pattern.py - to create different grids: Grid, VGrid, HGrid, RectGrid, EllipseGrid

`Grid`	`VGrid`	`HGrid`	`RectGrid`	`EllipseGrid`

blur.py - to generate synthetic blur: GaussianBlur, DefocusBlur, MotionBlur, GlassBlur, ZoomBlur

`GaussianBlur`	`DefocusBlur`	`MotionBlur`	`GlassBlur`	`ZoomBlur`

noise.py - to add noise: GaussianNoise, ShotNoise, ImpulseNoise, SpeckleNoise

`GaussianNoise`	`ShotNoise`	`ImpulseNoise`	`SpeckleNoise`

weather.py - to simulate certain weather conditions: Fog, Snow, Frost, Rain, Shadow

`Fog`	`Snow`	`Frost`	`Rain`	`Shadow`

camera.py - to simulate camera sensor tuning and image compression/resizing: Contrast, Brightness, JpegCompression, Pixelate

`Contrast`	`Brightness`	`JpegCompression`	`Pixelate`

process.py - all other image processing issues: Posterize, Solarize, Invert, Equalize, AutoContrast, Sharpness, Color

`Posterize`	`Solarize`	`Invert`	`Equalize`

`AutoContrast`	`Sharpness`	`Color`
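The grouping above can be written down as a small lookup table, which is handy for checking the count of 36 functions or for picking an augmentation by name. This is a minimal sketch: the module and class names are copied from the list above, while `STRAUG_GROUPS`, `count_functions`, and `load_augmentation` are hypothetical helper names, not part of STRAug. It also assumes that every class can be imported from `straug.<module>` and constructed without arguments, as `Curve` is in the usage example.

```python
import importlib

# Module name -> class names, copied from the groups listed above.
STRAUG_GROUPS = {
    "warp": ["Curve", "Distort", "Stretch"],
    "geometry": ["Perspective", "Rotation", "Shrink"],
    "pattern": ["Grid", "VGrid", "HGrid", "RectGrid", "EllipseGrid"],
    "blur": ["GaussianBlur", "DefocusBlur", "MotionBlur", "GlassBlur", "ZoomBlur"],
    "noise": ["GaussianNoise", "ShotNoise", "ImpulseNoise", "SpeckleNoise"],
    "weather": ["Fog", "Snow", "Frost", "Rain", "Shadow"],
    "camera": ["Contrast", "Brightness", "JpegCompression", "Pixelate"],
    "process": ["Posterize", "Solarize", "Invert", "Equalize",
                "AutoContrast", "Sharpness", "Color"],
}


def count_functions(groups=STRAUG_GROUPS):
    """Return the total number of augmentation names across all groups."""
    return sum(len(names) for names in groups.values())


def load_augmentation(module, name):
    """Import straug.<module> and return an instance of <name>.

    Assumption: the class takes no required constructor arguments.
    The name is validated against the table before anything is imported.
    """
    if name not in STRAUG_GROUPS.get(module, []):
        raise KeyError(f"{name!r} is not listed under {module!r}")
    mod = importlib.import_module(f"straug.{module}")
    return getattr(mod, name)()


if __name__ == "__main__":
    print(len(STRAUG_GROUPS), "groups,", count_functions(), "functions")
```

With the package installed, `load_augmentation("warp", "Curve")` should give the same kind of object as `Curve()` in the usage example, provided the assumptions above hold.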

Pip install

pip3 install straug

How to use

Python interactive shell (e.g. input image is nokia.png):

>>> from straug.warp import Curve
>>> from PIL import Image
>>> img = Image.open("nokia.png")
>>> img = Curve()(img, mag=3)
>>> img.save("curved_nokia.png")
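The same call pattern extends to more than one augmentation. The sketch below shows two common ways to combine them: chaining every op in order, or picking one op at random per image. It assumes that every STRAug class accepts the `op(img, mag=...)` call form shown for `Curve`. The helpers `augment_chain`, `augment_random`, and `make_recording_op` are hypothetical names for illustration. The stand-in ops only record what was applied, so the sketch runs without STRAug or Pillow installed.

```python
import random


def augment_chain(img, ops, mag):
    """Apply every op in order, feeding each result into the next."""
    for op in ops:
        img = op(img, mag=mag)
    return img


def augment_random(img, ops, mag, rng=None):
    """Apply a single op picked at random from ops."""
    rng = rng or random.Random()
    op = rng.choice(ops)
    return op(img, mag=mag)


def make_recording_op(name):
    """Stand-in for an augmentation: appends (name, mag) to a list 'image'."""
    def op(img, mag):
        return img + [(name, mag)]
    return op


if __name__ == "__main__":
    ops = [make_recording_op("Curve"), make_recording_op("GaussianBlur")]
    print(augment_chain([], ops, mag=1))
    print(augment_random([], ops, mag=2, rng=random.Random(0)))
```

In real use, `ops` would hold STRAug instances such as `Curve()` and `img` would be a PIL image. Passing a seeded `random.Random` makes the random choice reproducible across runs.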

Python script (see test.py):

python3 test.py --image=<target image>

For example:

python3 test.py --image=images/telekom.png

The corrupted images are saved in the results directory.

Reference

Image corruptions (e.g., blur, noise, camera effects, fog, frost) are based on the work of Hendrycks et al.

Citation

If you find this work useful, please cite:

@inproceedings{atienza2021data,
  title={Data Augmentation for Scene Text Recognition},
  author={Atienza, Rowel},
  booktitle = {IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)},
  year={2021},
  pubstate={published},
  tppubtype={inproceedings}
}

Owner

Rowel Atienza
