Sample and Computation Redistribution for Efficient Face Detection

Last update: Mar 05, 2022

Related tags

Deep Learning SCR-Face-Detection

Overview

Introduction

SCRFD is an efficient, high-accuracy face detection approach that was initially described in a paper on arXiv.

Performance

Precision, FLOPs, and inference time are all evaluated at VGA resolution (640x480).

ResNet family

Method	Backbone	Easy	Medium	Hard	#Params(M)	#Flops(G)	Infer(ms)
DSFD (CVPR19)	ResNet152	94.29	91.47	71.39	120.06	259.55	55.6
RetinaFace (CVPR20)	ResNet50	94.92	91.90	64.17	29.50	37.59	21.7
HAMBox (CVPR20)	ResNet50	95.27	93.76	76.75	30.24	43.28	25.9
TinaFace (Arxiv20)	ResNet50	95.61	94.25	81.43	37.98	172.95	38.9
-	-	-	-	-	-	-	-
ResNet-34GF	ResNet50	95.64	94.22	84.02	24.81	34.16	11.8
SCRFD-34GF	Bottleneck Res	96.06	94.92	85.29	9.80	34.13	11.7
ResNet-10GF	ResNet34x0.5	94.69	92.90	80.42	6.85	10.18	6.3
SCRFD-10GF	Basic Res	95.16	93.87	83.05	3.86	9.98	4.9
ResNet-2.5GF	ResNet34x0.25	93.21	91.11	74.47	1.62	2.57	5.4
SCRFD-2.5GF	Basic Res	93.78	92.16	77.87	0.67	2.53	4.2
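To make the trade-off in the table concrete, the short sketch below compares two of its rows, SCRFD-34GF and TinaFace. The numbers are copied from the table; the `compare` helper and the dictionary keys are illustrative names, not part of the project.

```python
# Rows copied from the ResNet-family table above:
# hard-set AP (%), FLOPs (G), inference time (ms).
tinaface = {"hard": 81.43, "gflops": 172.95, "ms": 38.9}
scrfd_34gf = {"hard": 85.29, "gflops": 34.13, "ms": 11.7}


def compare(ours, baseline):
    """Return the AP gain (points), FLOPs ratio, and speed-up of `ours` over `baseline`."""
    return {
        "hard_ap_gain": round(ours["hard"] - baseline["hard"], 2),
        "flops_ratio": round(baseline["gflops"] / ours["gflops"], 2),
        "speedup": round(baseline["ms"] / ours["ms"], 2),
    }


summary = compare(scrfd_34gf, tinaface)
print(summary)
```

On these table values, SCRFD-34GF scores 3.86 points higher on the hard set while using roughly 5x fewer FLOPs and running a little over 3x faster.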

Mobile family

Method	Backbone	Easy	Medium	Hard	#Params(M)	#Flops(G)	Infer(ms)
RetinaFace (CVPR20)	MobileNet0.25	87.78	81.16	47.32	0.44	0.802	7.9
FaceBoxes (IJCB17)	-	76.17	57.17	24.18	1.01	0.275	2.5
-	-	-	-	-	-	-	-
MobileNet-0.5GF	MobileNetx0.25	90.38	87.05	66.68	0.37	0.507	3.7
SCRFD-0.5GF	Depth-wise Conv	90.57	88.12	68.51	0.57	0.508	3.6

X64 CPU Performance of SCRFD-0.5GF:

Test-Input-Size	CPU Single-Thread	Easy	Medium	Hard
Original-Size(scale1.0)	-	90.91	89.49	82.03
640x480	28.3ms	90.57	88.12	68.51
320x240	11.4ms	-	-	-

Precision and inference time are evaluated on an AMD Ryzen 9 3950X, using plain PyTorch CPU inference with OMP_NUM_THREADS=1 (no mkldnn).
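A single-threaded latency measurement of this kind could be set up roughly as follows. This is a minimal sketch using only the standard library: `measure_latency_ms` and `dummy_forward` are hypothetical names, the dummy workload stands in for a real model forward pass, and reporting the median over several runs is a choice made here, not something the benchmark above specifies.

```python
import os

# Limit OpenMP to one thread. Set this before importing the numerical
# library (for example torch) so its thread pool picks the value up.
os.environ["OMP_NUM_THREADS"] = "1"

import statistics
import time


def measure_latency_ms(run_once, warmup=5, runs=20):
    """Return the median wall-clock time of run_once() in milliseconds."""
    for _ in range(warmup):  # warm-up iterations are not timed
        run_once()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        run_once()
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


def dummy_forward():
    # Hypothetical stand-in for a model forward pass on a 640x480 input.
    return sum(i * i for i in range(10_000))


latency = measure_latency_ms(dummy_forward)
print(f"median latency: {latency:.2f} ms")
```

To time a real detector, `dummy_forward` would be replaced by a function that runs the model once on a fixed input tensor.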

Installation

Please refer to mmdetection for installation.

Install mmcv (mmcv-full==1.2.6 and 1.3.3 were tested).

Install build requirements and then install mmdet.

pip install -r requirements/build.txt
pip install -v -e .  # or "python setup.py develop"

Pretrained-Models

Name	Easy	Medium	Hard	FLOPs	Params(M)	Infer(ms)	Link
SCRFD_500M	90.57	88.12	68.51	500M	0.57	3.6	download
SCRFD_1G	92.38	90.57	74.80	1G	0.64	4.1	download
SCRFD_2.5G	93.78	92.16	77.87	2.5G	0.67	4.2	download
SCRFD_10G	95.16	93.87	83.05	10G	3.86	4.9	download
SCRFD_34G	96.06	94.92	85.29	34G	9.80	11.7	download
SCRFD_500M_KPS	90.97	88.44	69.49	500M	0.57	3.6	download
SCRFD_2.5G_KPS	93.80	92.02	77.13	2.5G	0.82	4.3	download
SCRFD_10G_KPS	95.40	94.01	82.80	10G	4.23	5.0	download

mAP, FLOPs, and inference latency are all evaluated at VGA resolution. The _KPS suffix means the model also predicts 5 keypoints.

Convert to ONNX

Please refer to tools/scrfd2onnx.py

By default, the generated ONNX model accepts dynamic input shapes.

You can also fix the input shape by passing --shape 640 640; the resulting ONNX model can then be optimized with onnx-simplifier.
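When a model is exported with a fixed shape such as 640 640, input images of other sizes have to be fitted to that shape before inference. The sketch below shows one common way to do this: resize while keeping the aspect ratio, pad the remainder on the right and bottom, and divide detected coordinates by the scale to return to the original image. This is an assumption about preprocessing, not something this README specifies, and `fit_to_input` and `box_to_original` are hypothetical helper names.

```python
def fit_to_input(width, height, target_w=640, target_h=640):
    """Scale (width, height) to fit inside the target, keeping the aspect ratio."""
    scale = min(target_w / width, target_h / height)
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    return new_w, new_h, scale


def box_to_original(box, scale):
    """Map an (x1, y1, x2, y2) box from resized-image coordinates to the original image."""
    return tuple(coord / scale for coord in box)


# Example: a 1280x720 frame fitted into a 640x640 model input.
new_w, new_h, scale = fit_to_input(1280, 720)
padding = (640 - new_w, 640 - new_h)  # padding on the right and bottom, in pixels
original_box = box_to_original((100.0, 50.0, 200.0, 150.0), scale)
print(new_w, new_h, scale, padding, original_box)
```

Because the padding is placed only on the right and bottom in this sketch, no offset is needed when mapping boxes back; centered padding would require subtracting the offset first.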

Inference

Put your input images or videos in the ./input directory. The output will be saved in the ./output directory. From the root directory of the project, run the following command for an image:

python inference_image.py --input "./input/test.jpg"

and for video:

python inference_video.py --input "./input/obama.mp4"

Pass -sh to display the results while the code is running.

Note that you can pass other arguments as well. See inference_video.py for the full list.
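As a rough illustration of how a script with this interface might parse its arguments, here is a minimal argparse sketch. Only --input and -sh come from the commands above; the `show` attribute name, the `output_path_for` helper, and the rule that the result keeps the input file name are assumptions, and the actual scripts may differ.

```python
import argparse
from pathlib import Path


def build_parser():
    """Hypothetical parser mirroring the two documented options."""
    parser = argparse.ArgumentParser(description="Inference CLI sketch")
    parser.add_argument("--input", required=True,
                        help="path to the input image or video")
    parser.add_argument("-sh", dest="show", action="store_true",
                        help="display results while running")
    return parser


def output_path_for(input_path, output_dir="./output"):
    # Assumption: the result keeps the input file name inside ./output.
    return Path(output_dir) / Path(input_path).name


args = build_parser().parse_args(["--input", "./input/test.jpg", "-sh"])
result_path = output_path_for(args.input)
print(args.show, result_path)
```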

Owner

Sajjad Aemmi
