Tightness-aware Evaluation Protocol for Scene Text Detection

Overview

TIoU-metric

Released on 27/03/2019. This repository is built on the ICDAR 2015 evaluation code.

State-of-the-art Results on Total-Text and CTW1500 (TIoU)

We sincerely thank the authors of recent and previous state-of-the-art methods for providing their results so that the TIoU metric could be evaluated on curved text benchmarks. The results are listed below:

Total-Text

Methods on Total-Text | TIoU-Recall (%) | TIoU-Precision (%) | TIoU-Hmean (%) | Publication
LSN+CC [paper] | 48.4 | 59.8 | 53.5 | arXiv 1903
Polygon-FRCNN-3 [paper] | 47.9 | 61.9 | 54.0 | IJDAR 2019
CTD+TLOC [paper][code] | 50.8 | 62.0 | 55.8 | arXiv 1712
ATRR [paper] | 53.7 | 63.5 | 58.2 | CVPR 2019
PSENet [paper][code] | 53.3 | 66.9 | 59.3 | CVPR 2019
CRAFT [paper] | 54.1 | 65.5 | 59.3 | CVPR 2019
TextField [paper] | 58.0 | 63.0 | 60.4 | TIP 2019
Mask TextSpotter [paper] | 54.5 | 68.0 | 60.5 | ECCV 2018
SPCNet [paper][code] | 61.8 | 69.4 | 65.4 | AAAI 2019

CTW1500

Methods on CTW1500 | TIoU-Recall (%) | TIoU-Precision (%) | TIoU-Hmean (%) | Publication
CTD+TLOC [paper][code] | 42.5 | 53.9 | 47.5 | arXiv 1712
ATRR [paper] | 54.9 | 61.6 | 58.0 | CVPR 2019
LSN+CC [paper] | 55.9 | 64.8 | 60.0 | arXiv 1903
PSENet [paper][code] | 54.9 | 67.6 | 60.6 | CVPR 2019
CRAFT [paper] | 56.4 | 66.3 | 61.0 | CVPR 2019
MSR [paper] | 56.3 | 67.3 | 61.3 | arXiv 1901
TextField [paper] | 57.2 | 66.2 | 61.4 | TIP 2019
TextMountain [paper] | 60.7 | 68.1 | 64.2 | arXiv 1811
PAN Mask R-CNN [paper] | 61.0 | 70.0 | 65.2 | WACV 2019
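
The TIoU-Hmean values listed for both benchmarks are consistent with the harmonic mean of the corresponding TIoU-Recall and TIoU-Precision values. The snippet below is a minimal check of that relationship on a few rows; the helper name `hmean` is our own, and the evaluation script itself may aggregate unrounded values, so small rounding differences are possible.

```python
def hmean(recall, precision):
    """Harmonic mean of recall and precision (both given in percent)."""
    total = recall + precision
    # Guard against division by zero when both scores are zero.
    return 2.0 * recall * precision / total if total > 0 else 0.0


# (recall, precision) pairs copied from the result listings above.
rows = {
    "LSN+CC (Total-Text)": (48.4, 59.8),
    "SPCNet (Total-Text)": (61.8, 69.4),
    "CTD+TLOC (CTW1500)": (42.5, 53.9),
}

for name, (recall, precision) in rows.items():
    print(f"{name}: Hmean = {hmean(recall, precision):.1f}")
```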

Description

Evaluation protocols play a key role in the development of text detection methods. They must meet strict requirements to ensure that evaluation is fair, objective and reasonable. However, existing metrics exhibit some obvious drawbacks:

*Unreasonable cases obtained using recent evaluation metrics. (a), (b), (c), and (d) all have the same IoU of 0.66 against the GT. Red: GT. Blue: detection.
  • As shown in (a), previous metrics consider that the GT has been entirely recalled.

  • As shown in (b), (c), and (d), even though the detections contain background noise, previous metrics consider them to have 100% precision.

  • Previous metrics consider detections (a), (b), (c), and (d) to be equivalent perfect detections.

  • Previous metrics rely heavily on an IoU threshold. A high IoU threshold may discard some satisfactory bounding boxes, while a low IoU threshold may include several inexact bounding boxes.
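
To make the drawbacks above concrete, here is a small sketch showing how two very different detections can receive the same IoU. It uses axis-aligned boxes given as `(x1, y1, x2, y2)` and made-up coordinates; the boxes and helper names are illustrative assumptions, not data or code from this repository.

```python
def area(box):
    """Area of an axis-aligned box (x1, y1, x2, y2); empty boxes give 0."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def intersection(a, b):
    """Overlap box of a and b (may be empty, in which case area() is 0)."""
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))


def iou(a, b):
    """Plain Intersection-over-Union of two axis-aligned boxes."""
    inter = area(intersection(a, b))
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


gt = (0, 0, 6, 2)            # hypothetical ground-truth text box
det_partial = (0, 0, 4, 2)   # lies inside the GT but misses a third of it
det_loose = (0, 0, 9, 2)     # covers the whole GT plus extra background

# Both detections have the same IoU against the GT, so an IoU threshold
# of 0.5 accepts both as equally good matches.
print(iou(gt, det_partial), iou(gt, det_loose))
```

One detection cuts off part of the text and the other includes background, yet a threshold-based IoU metric cannot tell them apart.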

To address many of the issues with previous evaluation metrics, we propose an improved evaluation protocol called the Tightness-aware Intersect-over-Union (TIoU) metric, which can quantify:

  • Completeness of ground truth

  • Compactness of detection

  • Tightness of matching degree
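
The three quantities above can be illustrated with a simplified sketch. Here "completeness" is taken as the fraction of the ground truth covered by the detection, "compactness" as the fraction of the detection that lies on the ground truth, and each is used to scale the plain IoU. These definitions are assumptions chosen for illustration on axis-aligned boxes; they are not the exact formulas of the paper or the code in `script.py`, so please refer to the paper for the actual definitions.

```python
def area(box):
    """Area of an axis-aligned box (x1, y1, x2, y2); empty boxes give 0."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def overlap_area(a, b):
    """Area shared by two axis-aligned boxes."""
    return area((max(a[0], b[0]), max(a[1], b[1]),
                 min(a[2], b[2]), min(a[3], b[3])))


def tightness_scores(gt, det):
    """Illustrative (not official) tightness-weighted recall and precision."""
    inter = overlap_area(gt, det)
    union = area(gt) + area(det) - inter
    iou = inter / union if union > 0 else 0.0
    completeness = inter / area(gt) if area(gt) > 0 else 0.0   # GT coverage
    compactness = inter / area(det) if area(det) > 0 else 0.0  # detection purity
    return {
        "iou": iou,
        "weighted_recall": iou * completeness,
        "weighted_precision": iou * compactness,
    }


gt = (0, 0, 6, 2)                              # hypothetical ground truth
print(tightness_scores(gt, (0, 0, 4, 2)))      # detection cuts off part of the text
print(tightness_scores(gt, (0, 0, 9, 2)))      # detection includes background
```

With this weighting, the detection that misses part of the text is penalized on the recall side, and the detection that includes background is penalized on the precision side, even though both have the same plain IoU.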

We hope this work draws attention to text detection evaluation metrics and serves as a modest spur to more valuable contributions. More details can be found in our paper.

Clone the TIoU repository

Clone the TIoU-metric repository

git clone https://github.com/Yuliang-Liu/TIoU-metric.git --recursive

Getting Started

Install required module

pip install Polygon2

Then run

python script.py -g=gt.zip -s=pixellinkch4.zip

After that you can see the evaluation results.

You can simply replace pixellinkch4.zip with your own detection results; make sure your detection format follows that of ICDAR 2015.
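
As a rough guide to packaging your own results, the sketch below builds a zip archive in what we understand to be the ICDAR 2015 (Challenge 4) detection layout: one text file per image, one detection per line, written as eight comma-separated integer coordinates `x1,y1,x2,y2,x3,y3,x4,y4`. The file naming pattern `res_img_<id>.txt`, the function name `build_submission`, and the sample coordinates are assumptions for illustration; check them against the provided pixellinkch4.zip before relying on them.

```python
import io
import zipfile


def build_submission(detections):
    """Return zip bytes holding one result file per image.

    detections maps an image id to a list of quadrilaterals, each given as
    eight numbers (x1, y1, x2, y2, x3, y3, x4, y4).
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for image_id, quads in sorted(detections.items()):
            lines = []
            for quad in quads:
                if len(quad) != 8:
                    raise ValueError("each detection needs 8 coordinates")
                lines.append(",".join(str(int(v)) for v in quad))
            # Assumed naming convention; verify against the sample zip.
            archive.writestr(f"res_img_{image_id}.txt", "\n".join(lines))
    return buffer.getvalue()


# Hypothetical detections for two images.
sample = {
    1: [(10, 10, 110, 10, 110, 40, 10, 40)],
    2: [(5, 5, 50, 5, 50, 20, 5, 20), (60, 8, 120, 8, 120, 30, 60, 30)],
}
payload = build_submission(sample)
```

The resulting bytes can be written to disk, for example with `open("my_results.zip", "wb").write(payload)`, and the file passed to the `-s` option in place of pixellinkch4.zip.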

Joint Word&Text-Line Evaluation

To test your detection with our joint Word&Text-Line solution, simply

cd Word_Text-Line

Then run the code

python script.py -g=gt.zip -gl=gt_textline.zip -s=pixellinkch4.zip

Support Curved Text Evaluation

Curved text requires polygonal input with a variable number of points. To evaluate your results on the recent curved text benchmarks Total-Text or SCUT-CTW1500, refer to curved-tiou/readme.md.

Example Results

Qualitative results:

*Qualitative visualization of TIoU metric. Blue: Detection. Bold red: Target GT region. Light red: Other GT regions. Rec.: Recognition results by CRNN [24]. NED: Normalized edit distance. Previous metrics evaluate all detection results and target GTs as 100% precision and recall, respectively, while in TIoU metric, all matching pairs are penalized by different degrees. Ct is defined in Eq. 10. Ot is defined in Eq. 13. Please refer to our paper for all the references.

ICDAR 2013 results:

*Comparison of evaluation methods on ICDAR 2013 for general detection frameworks and previous state-of-the-art methods. det: DetEval. i: IoU. e1: End-to-end recognition results by using CRNN [24]. e2: End-to-end recognition results by using RARE [25]. t: TIoU.

Line chart:

*(a) X-axis represents the detection methods listed in the Table above, and Y-axis represents the values of the F-measures.

ICDAR 2015 results:

*Comparison of metrics on the ICDAR 2015 challenge 4. Word&Text-Line Annotations use our new solution to address OM and MO issues. i: IoU. s: SIoU. t: TIoU.

Citation

If you find our metric useful for your research, please cite

@article{liu2019tightness,
  title={Tightness-aware Evaluation Protocol for Scene Text Detection},
  author={Liu, Yuliang and Jin, Lianwen and Xie, Zecheng and Luo, Canjie and Zhang, Shuaitao and Xie, Lele},
  journal={CVPR},
  year={2019}
}

References

If you are interested in developing better scene text detection metrics, the references recommended here might be useful.

[1] Wolf, Christian, and Jean-Michel Jolion. "Object count/area graphs for the evaluation of object detection and segmentation algorithms." International Journal of Document Analysis and Recognition (IJDAR) 8.4 (2006): 280-296.

[2] Calarasanu, Stefania, Jonathan Fabrizio, and Severine Dubuisson. "What is a good evaluation protocol for text localization systems? Concerns, arguments, comparisons and solutions." Image and Vision Computing 46 (2016): 1-17.

[3] Dangla, Aliona, et al. "A first step toward a fair comparison of evaluation protocols for text detection algorithms." 2018 13th IAPR International Workshop on Document Analysis Systems (DAS). IEEE, 2018.

[4] Shi, Baoguang, et al. "ICDAR2017 competition on reading chinese text in the wild (RCTW-17)." 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). Vol. 1. IEEE, 2017.

Feedback

Suggestions and opinions on this metric (both positive and negative) are very welcome. Please contact the authors by email at [email protected] or [email protected].

Owner
Yuliang Liu
MMLab; South China University of Technology; University of Adelaide