PyTorch implementation for RelTransformer

Overview

RelTransformer

Our Architecture

[Figure: RelTransformer architecture]

This is a PyTorch implementation of RelTransformer.

The implementation for evaluating on VG200 can be found here.

Requirements

conda env create -f reltransformer_env.yml

Compilation

Compile the CUDA code in the Detectron submodule and in the repo:

cd $ROOT/lib
sh make.sh

Annotations

Create a data folder at the top-level directory of the repository:

# ROOT = path/to/cloned/repository
cd $ROOT
mkdir data

GQA

Download it here. Unzip it under the data folder. You should see a gvqa folder unzipped there. It contains a seed folder called seed0, which holds the .json annotations in the format expected by the dataloader used in this repo.

Visual Genome

Download it here. Unzip it under the data folder. You should see a vg8k folder unzipped there. It contains a seed folder called seed3, which holds the .json annotations in the format expected by the dataloader used in this repo.

Word2Vec Vocabulary

Create a folder named word2vec_model under data. Download the Google word2vec vocabulary from here. Unzip it under the word2vec_model folder and you should see GoogleNews-vectors-negative300.bin there.
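A quick sanity check on the downloaded vectors can catch a truncated or wrongly unpacked file before training starts. The sketch below reads only the text header that the word2vec binary format places at the start of the file (vocabulary size and vector dimension, separated by a space and terminated by a newline). The helper name read_word2vec_header is hypothetical and not part of this repo; a dimension of 300 is only what the file name suggests.

```python
from pathlib import Path


def read_word2vec_header(path):
    """Return (vocab_size, vector_dim) from a word2vec binary file.

    Hypothetical helper, not part of the RelTransformer code base.
    """
    with open(path, "rb") as f:
        # The binary format starts with an ASCII line: "<vocab_size> <dim>\n".
        header = f.readline(64)
    parts = header.decode("ascii").split()
    if len(parts) != 2:
        raise ValueError(f"unexpected word2vec header: {header!r}")
    return int(parts[0]), int(parts[1])


if __name__ == "__main__":
    # Path follows the layout described above, relative to the repo root.
    model = Path("data/word2vec_model/GoogleNews-vectors-negative300.bin")
    if model.is_file():
        vocab_size, dim = read_word2vec_header(model)
        print(f"{vocab_size} words, {dim} dimensions")
    else:
        print(f"missing: {model}")
```

Only the first line of the file is read, so the check is fast even though the full model is several gigabytes.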

Images

GQA

Create a folder for all images:

# ROOT=path/to/cloned/repository
cd $ROOT/data/gvqa
mkdir images

Download the GQA images from here.

Visual Genome

Create a folder for all images:

# ROOT=path/to/cloned/repository
cd $ROOT/data/vg8k
mkdir VG_100K

Download Visual Genome images from the official page. Unzip all images (part 1 and part 2) into VG_100K/. There should be a total of 108249 files.
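Since the data setup spans several downloads, it can help to verify the resulting layout in one pass. The following is a minimal sketch, assuming it is run from the repository root; the helper names missing_paths and count_files are hypothetical, while the paths and the expected file count of 108249 are taken from the steps above.

```python
from pathlib import Path

# Paths taken from the setup steps above, relative to the repository root.
EXPECTED_PATHS = [
    "data/gvqa/seed0",
    "data/gvqa/images",
    "data/vg8k/seed3",
    "data/vg8k/VG_100K",
    "data/word2vec_model/GoogleNews-vectors-negative300.bin",
]
EXPECTED_VG_FILES = 108249  # file count stated above for VG_100K


def missing_paths(root):
    """Return the expected paths that do not exist under `root`."""
    root = Path(root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


def count_files(folder):
    """Count regular files directly inside `folder` (non-recursive)."""
    return sum(1 for entry in Path(folder).iterdir() if entry.is_file())


if __name__ == "__main__":
    root = Path(".")  # assumption: run from the repository root
    for p in missing_paths(root):
        print(f"missing: {p}")
    vg_dir = root / "data/vg8k/VG_100K"
    if vg_dir.is_dir():
        n = count_files(vg_dir)
        status = "ok" if n == EXPECTED_VG_FILES else "unexpected count"
        print(f"VG_100K: {n} files ({status})")
```

The count is non-recursive on purpose: if the archives unpack into nested subfolders instead of directly into VG_100K/, the number will come out lower than expected.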

Pre-trained Object Detection Models

Download the pre-trained object detection models here. Unzip the archive under the root directory and you should see a detection_models folder there.

Evaluating Pre-trained Relationship Detection models

DO NOT CHANGE anything in the provided config files (configs/xx/xxxx.yaml), even if you want to test with fewer or more than 8 GPUs. Use the environment variable CUDA_VISIBLE_DEVICES to control how many and which GPUs to use. Remove the --multi-gpu-test flag for single-GPU inference.
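The GPU selection rule above can be sketched as a small launcher that sets CUDA_VISIBLE_DEVICES and adds --multi-gpu-test only when more than one GPU is requested. The helper build_test_command is hypothetical; the script name, config path and remaining flags are copied from the GVQA test command in this README, and appending --multi-gpu-test at the end of the argument list is an assumption about where the flag goes.

```python
import os
import shlex


def build_test_command(gpu_ids, ckpt_path):
    """Return (env, argv) for evaluating on the given GPU ids.

    Hypothetical helper; script name and flags are copied from the
    GVQA test command in this README.
    """
    if not gpu_ids:
        raise ValueError("at least one GPU id is required")
    env = dict(os.environ)
    # CUDA_VISIBLE_DEVICES takes a comma-separated list of device indices.
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(g) for g in gpu_ids)
    argv = [
        "python", "tools/test_net_reltransformer.py",
        "--dataset", "gvqa",
        "--cfg", "configs/gvqa/e2e_relcnn_VGG16_8_epochs_gvqa_reltransformer.yaml",
        "--load_ckpt", str(ckpt_path),
        "--use_gt_boxes", "--use_gt_labels", "--do_val",
    ]
    if len(gpu_ids) > 1:
        # Assumption: the flag is simply appended for multi-GPU runs.
        argv.append("--multi-gpu-test")
    return env, argv


if __name__ == "__main__":
    # "model-path" is the same placeholder used in the commands in this README.
    env, argv = build_test_command([0, 1], "model-path")
    print(f"CUDA_VISIBLE_DEVICES={env['CUDA_VISIBLE_DEVICES']} {shlex.join(argv)}")
```

The returned pair can be passed to subprocess.run(argv, env=env, check=True) from the repository root; the config file itself is never touched.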

Training Relationship Detection Models

Training requires 8 GPUs.

GVQA

To train our relationship network with a VGG16 backbone, run

python -u tools/train_net_reltransformer.py --dataset gvqa --cfg configs/gvqa/e2e_relcnn_VGG16_8_epochs_gvqa_reltransformer.yaml --nw 8 --use_tfboard --seed 1 

To train our relationship network with a VGG16 backbone and WCE loss, run

python -u tools/train_net_reltransformer_WCE.py --dataset gvqa --cfg configs/gvqa/e2e_relcnn_VGG16_8_epochs_gvqa_reltransformer_WCE.yaml --nw 8 --use_tfboard --seed 1

To test the trained networks, run

python tools/test_net_reltransformer.py --dataset gvqa --cfg configs/gvqa/e2e_relcnn_VGG16_8_epochs_gvqa_reltransformer.yaml --load_ckpt  model-path  --use_gt_boxes --use_gt_labels --do_val

To test the trained model with WCE loss function, run

python tools/test_net_reltransformer_WCE.py --dataset gvqa --cfg configs/gvqa/e2e_relcnn_VGG16_8_epochs_gvqa_reltransformer_WCE.yaml --load_ckpt  model-path  --use_gt_boxes --use_gt_labels --do_val

VG8K

To train our relationship network with a VGG16 backbone, run

python -u tools/train_net_reltransformer.py --dataset vg8k --cfg configs/vg8k/e2e_relcnn_VGG16_8_epochs_vg8k_reltransformer.yaml  --nw 8 --use_tfboard --seed 3

To train our relationship network with a VGG16 backbone and WCE loss, run

python -u tools/train_net_reltransformer_wce.py --dataset vg8k --cfg configs/vg8k/e2e_relcnn_VGG16_8_epochs_vg8k_reltransformer_wce.yaml --nw 8 --use_tfboard --seed 3

To test the trained networks, run

python tools/test_net_reltransformer.py --dataset vg8k --cfg configs/vg8k/e2e_relcnn_VGG16_8_epochs_vg8k_reltransformer.yaml --load_ckpt  model-path  --use_gt_boxes --use_gt_labels --do_val

To test the trained model with WCE loss function, run

python tools/test_net_reltransformer_wce.py --dataset vg8k --cfg configs/vg8k/e2e_relcnn_VGG16_8_epochs_vg8k_reltransformer_wce.yaml --load_ckpt  model-path  --use_gt_boxes --use_gt_labels --do_val

Acknowledgements

This repository uses code based on the LTVRD source code by Sherif, as well as code from the Detectron.pytorch repository by Roy Tseng.

Citing

If you use this code in your research, please cite it with the following BibTeX entry:

@article{chen2021reltransformer,
  title={RelTransformer: Balancing the Visual Relationship Detection from Local Context, Scene and Memory},
  author={Chen, Jun and Agarwal, Aniket and Abdelkarim, Sherif and Zhu, Deyao and Elhoseiny, Mohamed},
  journal={arXiv preprint arXiv:2104.11934},
  year={2021}
}

Owner

Vision CAIR Research Group, KAUST, supported by Mohamed Elhoseiny