Code for the ACM MM 2021 paper "Multi-Level Counterfactual Contrast for Visual Commonsense Reasoning"

Last update: Apr 20, 2022

Setting up and using the repo

Get the dataset. Follow the steps in data/README.md. This includes the steps to get the pretrained BERT embeddings and visual representations.
Install CUDA 11.0 if it is not already installed.
Install Anaconda if it is not already installed, and create a new environment. The environment needs PyTorch 1.7.1, torchvision, and AllenNLP, along with the supporting packages listed below.

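wget https://repo.anaconda.com/archive/Anaconda3-5.2.0-Linux-x86_64.sh
# run the installer, then open a new shell so that conda is on the PATH
bash Anaconda3-5.2.0-Linux-x86_64.sh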
conda update -n base -c defaults conda
conda create --name MCC python=3.6
source activate MCC

conda install numpy pyyaml setuptools cmake cffi tqdm scipy ipython mkl mkl-include cython typing h5py pandas nltk spacy numpydoc scikit-learn jpeg

conda install pytorch==1.7.1 torchvision==0.8.2 cudatoolkit=11.0 -c pytorch

pip install -r allennlp-requirements.txt
pip install --no-deps allennlp==0.8.0
python -m spacy download en_core_web_sm


# this one is optional but it should help make things faster
pip uninstall pillow && CC="cc -mavx2" pip install -U --force-reinstall pillow-simd

That's it! To use the environment in a new shell, run source activate MCC.
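Before training, it can help to confirm that the pinned versions actually ended up in the environment. The following is a minimal sketch, not part of this repo: the helper names `installed_version` and `check_environment` are hypothetical, and the expected versions are taken from the conda command above. It only inspects each module's `__version__` attribute, so a package that does not expose one is reported as missing.

```python
import importlib

# Versions pinned by the conda install command above.
EXPECTED = {"torch": "1.7.1", "torchvision": "0.8.2"}


def installed_version(name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(name)
    except Exception:
        # Any import failure (missing package, broken binary) counts as "not found".
        return None
    return getattr(module, "__version__", None)


def check_environment(expected=EXPECTED, lookup=installed_version):
    """Map each package name to a tuple (expected, found, ok)."""
    report = {}
    for name, want in expected.items():
        found = lookup(name)
        # Builds may append a local tag such as "+cu110", so compare the part before "+".
        ok = found is not None and found.split("+")[0] == want
        report[name] = (want, found, ok)
    return report


if __name__ == "__main__":
    for name, (want, found, ok) in check_environment().items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: expected {want}, found {found} [{status}]")
```

Run it inside the activated MCC environment. A MISMATCH line usually means the install commands were run in a different environment or a later pip install replaced a pinned package.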

Train/Evaluate models

Please refer to models/README.md.

Acknowledgement

The preprocessing code is adapted from the r2c and tab-vcr repositories.

Cite

@inproceedings{zhang2021multi,
  title={Multi-Level Counterfactual Contrast for Visual Commonsense Reasoning},
  author={Zhang, Xi and Zhang, Feifei and Xu, Changsheng},
  booktitle={Proceedings of the 29th ACM International Conference on Multimedia},
  pages={1793--1802},
  year={2021}
}
