A unofficial pytorch implementation of PAN(PSENet2): Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network

Overview

Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network

Requirements

  • pytorch 1.1+
  • torchvision 0.3+
  • pyclipper
  • opencv3
  • gcc 4.9+

Download

PAN_resnet18_FPEM_FFM and PAN_resnet18_FPEM_FFM on icdar2015:

the updated model(resnet18:78.8,shufflenetv2: 72.4,lr:le-3) is not the best model

google drive

Data Preparation

train: prepare a text in the following format, use '\t' as a separator

/path/to/img.jpg path/to/label.txt
...

val: use a folder

img/ store img
gt/ store gt file

Train

  1. config the train_data_path,val_data_pathin config.json
  2. use following script to run
python3 train.py

Test

eval.py is used to test model on test dataset

  1. config model_path, img_path, gt_path, save_path in eval.py
  2. use following script to test
python3 eval.py

Predict

predict.py is used to inference on single image

  1. config model_path, img_path, in predict.py
  2. use following script to predict
python3 predict.py

The project is still under development.

Performance

ICDAR 2015

only train on ICDAR2015 dataset

Method image size (short size) learning rate Precision (%) Recall (%) F-measure (%) FPS
paper(resnet18) 736 x x x 80.4 26.1
my (ShuffleNetV2+FPEM_FFM+pse扩张) 736 1e-3 81.72 66.73 73.47 24.71 (P100)
my (resnet18+FPEM_FFM+pse扩张) 736 1e-3 84.93 74.09 79.14 21.31 (P100)
my (resnet50+FPEM_FFM+pse扩张) 736 1e-3 84.23 76.12 79.96 14.22 (P100)
my (ShuffleNetV2+FPEM_FFM+pse扩张) 736 1e-4 75.14 57.34 65.04 24.71 (P100)
my (resnet18+FPEM_FFM+pse扩张) 736 1e-4 83.89 69.23 75.86 21.31 (P100)
my (resnet50+FPEM_FFM+pse扩张) 736 1e-4 85.29 75.1 79.87 14.22 (P100)
my (resnet18+FPN+pse扩张) 736 1e-3 76.50 74.70 75.59 14.47 (P100)
my (resnet50+FPN+pse扩张) 736 1e-3 71.82 75.73 73.72 10.67 (P100)
my (resnet18+FPN+pse扩张) 736 1e-4 74.19 72.34 73.25 14.47 (P100)
my (resnet50+FPN+pse扩张) 736 1e-4 78.96 76.27 77.59 10.67 (P100)

examples

todo

  • MobileNet backbone

  • ShuffleNet backbone

reference

  1. https://arxiv.org/pdf/1908.05900.pdf
  2. https://github.com/WenmuZhou/PSENet.pytorch

If this repository helps you,please star it. Thanks.

Owner
zhoujun
深度学习工程师,最近准备做端侧
zhoujun
DeepSpamReview: Detection of Fake Reviews on Online Review Platforms using Deep Learning Architectures. Summer Internship project at CoreView Systems.

Detection of Fake Reviews on Online Review Platforms using Deep Learning Architectures Dataset: https://s3.amazonaws.com/fast-ai-nlp/yelp_review_polar

Ashish Salunkhe 37 Dec 17, 2022
covid question answering datasets and fine tuned models

Covid-QA Fine tuned models for question answering on Covid-19 data. Hosted Inference This model has been contributed to huggingface.Click here to see

Abhijith Neil Abraham 19 Sep 09, 2021
YOLOX-Paddle - A reproduction of YOLOX by PaddlePaddle

YOLOX-Paddle A reproduction of YOLOX by PaddlePaddle 数据集准备 下载COCO数据集,准备为如下路径 /ho

QuanHao Guo 6 Dec 18, 2022
This is the official implement of paper "ActionCLIP: A New Paradigm for Action Recognition"

This is an official pytorch implementation of ActionCLIP: A New Paradigm for Video Action Recognition [arXiv] Overview Content Prerequisites Data Prep

268 Jan 09, 2023
FTIR-Deep Learning - FTIR Deep Learning With Python

CANDIY-spectrum Human analyis of chemical spectra such as Mass Spectra (MS), Inf

Wei Mei 1 Jan 03, 2022
PyTorch module to use OpenFace's nn4.small2.v1.t7 model

OpenFace for Pytorch Disclaimer: This codes require the input face-images that are aligned and cropped in the same way of the original OpenFace. * I m

Pete Tae-hoon Kim 176 Dec 12, 2022
HMLET (Hybrid-Method-of-Linear-and-non-linEar-collaborative-filTering-method)

Methods HMLET (Hybrid-Method-of-Linear-and-non-linEar-collaborative-filTering-method) Dynamically selecting the best propagation method for each node

Yong 7 Dec 18, 2022
Code release for ConvNeXt model

A ConvNet for the 2020s Official PyTorch implementation of ConvNeXt, from the following paper: A ConvNet for the 2020s. arXiv 2022. Zhuang Liu, Hanzi

Meta Research 4.6k Jan 08, 2023
🔥 Cogitare - A Modern, Fast, and Modular Deep Learning and Machine Learning framework for Python

Cogitare is a Modern, Fast, and Modular Deep Learning and Machine Learning framework for Python. A friendly interface for beginners and a powerful too

Cogitare - Modern and Easy Deep Learning with Python 76 Sep 30, 2022
ilpyt: imitation learning library with modular, baseline implementations in Pytorch

ilpyt The imitation learning toolbox (ilpyt) contains modular implementations of common deep imitation learning algorithms in PyTorch, with unified in

The MITRE Corporation 11 Nov 17, 2022
PyTorch implementation of residual gated graph ConvNets, ICLR’18

Residual Gated Graph ConvNets April 24, 2018 Xavier Bresson http://www.ntu.edu.sg/home/xbresson https://github.com/xbresson https://twitter.com/xbress

Xavier Bresson 112 Aug 10, 2022
TANL: Structured Prediction as Translation between Augmented Natural Languages

TANL: Structured Prediction as Translation between Augmented Natural Languages Code for the paper "Structured Prediction as Translation between Augmen

98 Dec 15, 2022
Piotr - IoT firmware emulation instrumentation for training and research

Piotr: Pythonic IoT exploitation and Research Introduction to Piotr Piotr is an emulation helper for Qemu that provides a convenient way to create, sh

Damien Cauquil 51 Nov 09, 2022
MERLOT: Multimodal Neural Script Knowledge Models

merlot MERLOT: Multimodal Neural Script Knowledge Models MERLOT is a model for learning what we are calling "neural script knowledge" -- representatio

Rowan Zellers 190 Dec 22, 2022
This repo contains the code and data used in the paper "Wizard of Search Engine: Access to Information Through Conversations with Search Engines"

Wizard of Search Engine: Access to Information Through Conversations with Search Engines by Pengjie Ren, Zhongkun Liu, Xiaomeng Song, Hongtao Tian, Zh

19 Oct 27, 2022
Band-Adaptive Spectral-Spatial Feature Learning Neural Network for Hyperspectral Image Classification

Band-Adaptive Spectral-Spatial Feature Learning Neural Network for Hyperspectral Image Classification

258 Dec 29, 2022
Setup freqtrade/freqUI on Heroku

UNMAINTAINED - REPO MOVED TO https://github.com/p-zombie/freqtrade Creating the app git clone https://github.com/joaorafaelm/freqtrade.git && cd freqt

João 51 Aug 29, 2022
Multi-resolution SeqMatch based long-term Place Recognition

MRS-SLAM for long-term place recognition In this work, we imply an multi-resolution sambling based visual place recognition method. This work is based

METASLAM 6 Dec 06, 2022
PhysCap: Physically Plausible Monocular 3D Motion Capture in Real Time

PhysCap: Physically Plausible Monocular 3D Motion Capture in Real Time The implementation is based on SIGGRAPH Aisa'20. Dependencies Python 3.7 Ubuntu

soratobtai 124 Dec 08, 2022
Implementation for the EMNLP 2021 paper "Interactive Machine Comprehension with Dynamic Knowledge Graphs".

Interactive Machine Comprehension with Dynamic Knowledge Graphs Implementation for the EMNLP 2021 paper. Dependencies apt-get -y update apt-get instal

Xingdi (Eric) Yuan 19 Aug 23, 2022