The official implementation of CVPR 2021 Paper: Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation.

Last update: Nov 14, 2022

Related tags

Deep Learning weak-sup-visual-grounding

Overview

Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation

This repository is the official implementation of CVPR 2021 paper: Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation.

Requirements

Tensorflow-1-15

Training

To train the NCE model(s) in the paper, run this command:

python train_nce_distill_model.py \
  --region_feat_path=region_features.hdf5 \
  --phrase_feat_path=phrase_features.hdf5 \
  --glove_path=glove.hdf5

To train the NCE+Distill model(s) in the paper, run this command:

python train_nce_distill_model.py \
  --region_feat_path=region_features.hdf5 \
  --phrase_feat_path=phrase_features.hdf5 \
  --glove_path=glove.hdf5 \
  --phrase_to_label_json=phrase_to_label.json

Evaluation

To evaluate the model on Flickr30K, run:

python eval_model.py \
  --region_feat_path=region_features_test.hdf5 \
  --phrase_feat_path=phrase_features_test.hdf5 \
  --glove_path=glove.hdf5 \
  --restore_path=checkpoint.meta

Pre-trained Models

You can download pretrained models using Res101 VG features here:

You can also find the features on Flickr30K test split here.

The pretrained models achieve the following performance on Flickr30K test split:

Model Name	[email protected]	[email protected]	[email protected]
NCE+Distill	0.5310	0.7394	0.7875
NCE	0.5135	0.7338	0.7833

Citation

If you use our implementation in your research or wish to refer to the results published in our paper, please use the following BibTeX entry.

@InProceedings{Wang_2021_CVPR,
    author    = {Wang, Liwei and Huang, Jing and Li, Yin and Xu, Kun and Yang, Zhengyuan and Yu, Dong},
    title     = {Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {14090-14100}
}

The official implementation of CVPR 2021 Paper: Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation.

Related tags

Overview

Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation

Requirements

Training

Evaluation

Pre-trained Models

Citation

Owner

Text mining project; Using distilBERT to predict authors in the classification task authorship attribution.

RM Operation can equivalently convert ResNet to VGG, which is better for pruning; and can help RepVGG perform better when the depth is large.

A complete, self-contained example for training ImageNet at state-of-the-art speed with FFCV

Contrastive Learning of Image Representations with Cross-Video Cycle-Consistency

Inference code for "StylePeople: A Generative Model of Fullbody Human Avatars" paper. This code is for the part of the paper describing video-based avatars.

Graph Convolutional Networks in PyTorch

Gin provides a lightweight configuration framework for Python

Multi-objective constrained optimization for energy applications via tree ensembles

Google Landmark Recogntion and Retrieval 2021 Solutions

Lucid library adapted for PyTorch

DTCN IJCAI - Sequential prediction learning framework and algorithm

Simple implementation of Mobile-Former on Pytorch

Learning Dense Representations of Phrases at Scale (Lee et al., 2020)

Locally cache assets that are normally streamed in POPULATION: ONE

Multilingual Image Captioning

Get started learning C# with C# notebooks powered by .NET Interactive and VS Code.

Two types of Recommender System : Content-based Recommender System and Colaborating filtering based recommender system

A PyTorch Implementation of Single Shot MultiBox Detector

A developer interface for creating Chat AIs for the Chai app.

Finetune alexnet with tensorflow - Code for finetuning AlexNet in TensorFlow >= 1.2rc0