Benchmarking the robustness of Spatial-Temporal Models

Overview

Benchmarking the robustness of Spatial-Temporal Models

This repositery contains the code for the paper Benchmarking the Robustness of Spatial-Temporal Models Against Corruptions.

Python 2.7 and 3.7, Pytorch 1.7+, FFmpeg are required.

Requirements

pip3 install - requirements.txt

Mini Kinetics-C

image info

Download original Kinetics400 from link.

The Mini Kinetics-C contains half of the classes in Kinetics400. All the classes can be found in mini-kinetics-200-classes.txt.

Mini Kinetics-C Leaderboard

Corruption robustness of spatial-temporal models trained on clean Mini Kinetics and evaluated on Mini Kinetics-C.

Approach Reference Backbone Input Length Sampling Method Clean Accuracy mPC rPC
TimeSformer Gedas et al. Transformer 32 Uniform 82.2 71.4 86.9
3D ResNet K. Hara et al. ResNet-50 32 Uniform 73.0 59.2 81.1
I3D J. Carreira et al. InceptionV1 32 Uniform 70.5 57.7 81.8
SlowFast 8x4 C. Feichtenhofer at al. ResNet-50 32 Uniform 69.2 54.3 78.5
3D ResNet K. Hara et al. ResNet-18 32 Uniform 66.2 53.3 80.5
TAM Q.Fan et al. ResNet-50 32 Uniform 66.9 50.8 75.9
X3D-M C. Feichtenhofer ResNet-50 32 Uniform 62.6 48.6 77.6

For fair comparison, it is recommended to submit the result of approach which follows the following settings: Backbone of ResNet-50, Input Length of 32, Uniform Sampling at Clip Level. Any result on our benchmark can be submitted via pull request.

Mini SSV2-C

image info

Download original Something-Something-V2 datset from link.

The Mini SSV2-C contains half of the classes in Something-Something-V2. All the classes can be found in mini-ssv2-87-classes.txt.

Mini SSV2-C Leaderboard

Corruption robustness of spatial-temporal models trained on clean Mini SSV2 and evaluated on Mini SSV2-C.

Approach Reference Backbone Input Length Sampling Method Clean Accuracy mPC rPC
TimeSformer Gedas et al. Transformer 16 Uniform 60.5 49.7 82.1
I3D J. Carreira et al. InceptionV1 32 Uniform 58.5 47.8 81.7
3D ResNet K. Hara et al. ResNet-50 32 Uniform 57.4 46.6 81.2
TAM Q.Fan et al. ResNet-50 32 Uniform 61.8 45.7 73.9
3D ResNet K. Hara et al. ResNet-18 32 Uniform 53.0 42.6 80.3
X3D-M C. Feichtenhofer ResNet-50 32 Uniform 49.9 40.7 81.6
SlowFast 8x4 C. Feichtenhofer at al. ResNet-50 32 Uniform 48.7 38.4 78.8

For fair comparison, it is recommended to submit the result of approach which follows the following settings: Backbone of ResNet-50, Input Length of 32, Uniform Sampling at Clip Level. Any result on our benchmark can be submitted via pull request.

Training and Evaluation

To help researchers reproduce the benchmark results provided in our leaderboard, we include a simple framework for training and evaluating the spatial-temporal models in the folder: benchmark_framework.

Running the code

Assume the structure of data directories is the following:

~/
  datadir/
    mini_kinetics/
      train/
        .../ (directories of class names)
          ...(hdf5 file containing video frames)
    mini_kinetics-c/
      .../ (directories of corruption names)
        .../ (directories of severity level)
          .../ (directories of class names)
            ...(hdf5 file containing video frames)

Train I3D on the Mini Kinetics dataset with 4 GPUs and 16 CPU threads (for data loading). The input lenght is 32, the batch size is 32 and learning rate is 0.01.

python3 train.py --threed_data --dataset mini_kinetics400 --frames_per_group 1 --groups 32 --logdir snapshots/ \
--lr 0.01 --backbone_net i3d -b 32 -j 16 --cuda 0,1,2,3

Test I3D on the Mini Kinetics-C dataset (pretrained model is loaded)

python3 test_corruption.py --threed_data --dataset mini_kinetics400 --frames_per_group 1 --groups 32 --logdir snapshots/ \
--pretrained snapshots/mini_kinetics400-rgb-i3d_v2-ts-max-f32-cosine-bs32-e50-v1/model_best.pth.tar --backbone_net i3d -b 32 -j 16 -e --cuda 0,1,2,3

Owner
Yi Chenyu Ian
Yi Chenyu Ian
Code release for NeX: Real-time View Synthesis with Neural Basis Expansion

NeX: Real-time View Synthesis with Neural Basis Expansion Project Page | Video | Paper | COLAB | Shiny Dataset We present NeX, a new approach to novel

538 Jan 09, 2023
DiffQ performs differentiable quantization using pseudo quantization noise. It can automatically tune the number of bits used per weight or group of weights, in order to achieve a given trade-off between model size and accuracy.

Differentiable Model Compression via Pseudo Quantization Noise DiffQ performs differentiable quantization using pseudo quantization noise. It can auto

Facebook Research 145 Dec 30, 2022
Annotated notes and summaries of the TensorFlow white paper, along with SVG figures and links to documentation

TensorFlow White Paper Notes Features Notes broken down section by section, as well as subsection by subsection Relevant links to documentation, resou

Sam Abrahams 437 Oct 09, 2022
An end-to-end library for editing and rendering motion of 3D characters with deep learning [SIGGRAPH 2020]

Deep-motion-editing This library provides fundamental and advanced functions to work with 3D character animation in deep learning with Pytorch. The co

1.2k Dec 29, 2022
A Confidence-based Iterative Solver of Depths and Surface Normals for Deep Multi-view Stereo

idn-solver Paper | Project Page This repository contains the code release of our ICCV 2021 paper: A Confidence-based Iterative Solver of Depths and Su

zhaowang 43 Nov 17, 2022
Just Go with the Flow: Self-Supervised Scene Flow Estimation

Just Go with the Flow: Self-Supervised Scene Flow Estimation Code release for the paper Just Go with the Flow: Self-Supervised Scene Flow Estimation,

Himangi Mittal 50 Nov 22, 2022
Spatially-Adaptive Pixelwise Networks for Fast Image Translation, CVPR 2021

Image Translation with ASAPNets Spatially-Adaptive Pixelwise Networks for Fast Image Translation, CVPR 2021 Webpage | Paper | Video Installation insta

Tamar Rott Shaham 100 Dec 28, 2022
Exploring Cross-Image Pixel Contrast for Semantic Segmentation

Exploring Cross-Image Pixel Contrast for Semantic Segmentation Exploring Cross-Image Pixel Contrast for Semantic Segmentation, Wenguan Wang, Tianfei Z

Tianfei Zhou 510 Jan 02, 2023
Pytorch and Keras Implementations of Hyperspectral Image Classification -- Traditional to Deep Models: A Survey for Future Prospects.

The repository contains the implementations for Hyperspectral Image Classification -- Traditional to Deep Models: A Survey for Future Prospects. Model

Ankur Deria 115 Jan 06, 2023
PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis

WaveGrad2 - PyTorch Implementation PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis. Status (202

Keon Lee 59 Dec 06, 2022
Python scripts form performing stereo depth estimation using the HITNET model in ONNX.

ONNX-HITNET-Stereo-Depth-estimation Python scripts form performing stereo depth estimation using the HITNET model in ONNX. Stereo depth estimation on

Ibai Gorordo 30 Nov 08, 2022
A python library to build Model Trees with Linear Models at the leaves.

A python library to build Model Trees with Linear Models at the leaves.

Marco Cerliani 212 Dec 30, 2022
InsCLR: Improving Instance Retrieval with Self-Supervision

InsCLR: Improving Instance Retrieval with Self-Supervision This is an official PyTorch implementation of the InsCLR paper. Download Dataset Dataset Im

Zelu Deng 25 Aug 30, 2022
Single cell current best practices tutorial case study for the paper:Luecken and Theis, "Current best practices in single-cell RNA-seq analysis: a tutorial"

Scripts for "Current best-practices in single-cell RNA-seq: a tutorial" This repository is complementary to the publication: M.D. Luecken, F.J. Theis,

Theis Lab 968 Dec 28, 2022
Real-time object detection on Android using the YOLO network with TensorFlow

TensorFlow YOLO object detection on Android Source project android-yolo is the first implementation of YOLO for TensorFlow on an Android device. It is

Nataniel Ruiz 624 Jan 03, 2023
FCOS: Fully Convolutional One-Stage Object Detection (ICCV'19)

FCOS: Fully Convolutional One-Stage Object Detection This project hosts the code for implementing the FCOS algorithm for object detection, as presente

Tian Zhi 3.1k Jan 05, 2023
Code for paper "Multi-level Disentanglement Graph Neural Network"

Multi-level Disentanglement Graph Neural Network (MD-GNN) This is a PyTorch implementation of the MD-GNN, and the code includes the following modules:

Lirong Wu 6 Dec 29, 2022
3DMV jointly combines RGB color and geometric information to perform 3D semantic segmentation of RGB-D scans.

3DMV 3DMV jointly combines RGB color and geometric information to perform 3D semantic segmentation of RGB-D scans. This work is based on our ECCV'18 p

Владислав Молодцов 0 Feb 06, 2022
EgoNN: Egocentric Neural Network for Point Cloud Based 6DoF Relocalization at the City Scale

EgonNN: Egocentric Neural Network for Point Cloud Based 6DoF Relocalization at the City Scale Paper: EgoNN: Egocentric Neural Network for Point Cloud

19 Sep 20, 2022
A Python implementation of the Locality Preserving Matching (LPM) method for pruning outliers in image matching.

LPM_Python A Python implementation of the Locality Preserving Matching (LPM) method for pruning outliers in image matching. The code is established ac

AoxiangFan 11 Nov 07, 2022