Exploring Visual Engagement Signals for Representation Learning

Related tags

Deep Learningvise
Overview

Exploring Visual Engagement Signals for Representation Learning

Menglin Jia, Zuxuan Wu, Austin Reiter, Claire Cardie, Serge Belongie and Ser-Nam Lim
Cornell University, Facebook AI


arXiv: https://arxiv.org/abs/2104.07767

common supervisory signals
VisE as supervisory signals.

VisE is a pretraining approach which leverages Visual Engagement clues as supervisory signals. Given the same image, visual engagement provide semantically and contextually richer information than conventional recognition and captioning tasks. VisE transfers well to subjective downstream computer vision tasks like emotion recognition or political bias classification.

💬 Loading pretrained models

NOTE: This is a torchvision-like model (all the layers before the last global average-pooling layer.). Given a batch of image tensors with size (B, 3, 224, 224), the provided models produce spatial image features of shape (B, 2048, 7, 7), where B is the batch size.

Loading models with torch.hub

Get the pretrained ResNet-50 models from VisE in one line!

VisE-250M (ResNet-50): this model is pretrained with 250 million public image posts.

import torch
model = torch.hub.load("KMnP/vise", "resnet50_250m", pretrained=True)

VisE-1.2M (ResNet-50): This model is pretrained with 1.23 million public image posts.

import torch
model = torch.hub.load("KMnP/vise", "resnet50_1m", pretrained=True)

Loading models manually

Arch Size Model
VisE-250M ResNet-50 94.3 MB download
VisE-1.2M ResNet-50 94.3 MB download

If you encounter any issues with torch.hub, alternatively you can download the model checkpoints manually, and then following the script below.

import torch
import torchvision

# Create a torchvision resnet50 with randomly initialized weights.
model = torchvision.models.resnet50(pretrained=False)

# Get the model before the global aver-pooling layer.
model = torch.nn.Sequential(*list(model.children())[:-2])

# load the pretrained model from a local path: <CHECKPOINT_PATH>:
model.load_state_dict(torch.load(CHECKPOINT_PATH))

💬 Citing VisE

If you find VisE useful in your research, please cite the following publication.

@misc{jia2021vise,
      title={Exploring Visual Engagement Signals for Representation Learning}, 
      author={Menglin Jia and Zuxuan Wu and Austin Reiter and Claire Cardie and Serge Belongie and Ser-Nam Lim},
      year={2021},
      eprint={2104.07767},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

💬 Acknowledgments

We thank Marseille who was featured in the teaser photo.

💬 License

VisE models are released under the CC-BY-NC 4.0 license. See LICENSE for additional details.

Owner
Menglin Jia
K-Mn-P: "jia meng lin" (mandarin pronunciation of those chemical elements)
Menglin Jia
Experimental Python implementation of OpenVINO Inference Engine (very slow, limited functionality). All codes are written in Python. Easy to read and modify.

PyOpenVINO - An Experimental Python Implementation of OpenVINO Inference Engine (minimum-set) Description The PyOpenVINO is a spin-off product from my

Yasunori Shimura 7 Oct 31, 2022
PSTR: End-to-End One-Step Person Search With Transformers (CVPR2022)

PSTR (CVPR2022) This code is an official implementation of "PSTR: End-to-End One-Step Person Search With Transformers (CVPR2022)". End-to-end one-step

Jiale Cao 28 Dec 13, 2022
A supplementary code for Editable Neural Networks, an ICLR 2020 submission.

Editable neural networks A supplementary code for Editable Neural Networks, an ICLR 2020 submission by Anton Sinitsin, Vsevolod Plokhotnyuk, Dmitry Py

Anton Sinitsin 32 Nov 29, 2022
LWCC: A LightWeight Crowd Counting library for Python that includes several pretrained state-of-the-art models.

LWCC: A LightWeight Crowd Counting library for Python LWCC is a lightweight crowd counting framework for Python. It wraps four state-of-the-art models

Matija Teršek 39 Dec 28, 2022
Godot RL Agents is a fully Open Source packages that allows video game creators

Godot RL Agents The Godot RL Agents is a fully Open Source packages that allows video game creators, AI researchers and hobbiest the opportunity to le

Edward Beeching 326 Dec 30, 2022
Official PyTorch code for the paper: "Point-Based Modeling of Human Clothing" (ICCV 2021)

Point-Based Modeling of Human Clothing Paper | Project page | Video This is an official PyTorch code repository of the paper "Point-Based Modeling of

Visual Understanding Lab @ Samsung AI Center Moscow 64 Nov 22, 2022
[ICLR 2021] Is Attention Better Than Matrix Decomposition?

Enjoy-Hamburger 🍔 Official implementation of Hamburger, Is Attention Better Than Matrix Decomposition? (ICLR 2021) Under construction. Introduction T

Gsunshine 271 Dec 29, 2022
Self-labelling via simultaneous clustering and representation learning. (ICLR 2020)

Self-labelling via simultaneous clustering and representation learning 🆗 🆗 🎉 NEW models (20th August 2020): Added standard SeLa pretrained torchvis

Yuki M. Asano 469 Jan 02, 2023
Codebase for the self-supervised goal reaching benchmark introduced in the LEXA paper

LEXA Benchmark Codebase for the self-supervised goal reaching benchmark introduced in the LEXA paper (Discovering and Achieving Goals via World Models

Oleg Rybkin 36 Dec 22, 2022
3DIAS: 3D Shape Reconstruction with Implicit Algebraic Surfaces (ICCV 2021)

3DIAS_Pytorch This repository contains the official code to reproduce the results from the paper: 3DIAS: 3D Shape Reconstruction with Implicit Algebra

Mohsen Yavartanoo 21 Dec 12, 2022
Official implementation of "Towards Good Practices for Efficiently Annotating Large-Scale Image Classification Datasets" (CVPR2021)

Towards Good Practices for Efficiently Annotating Large-Scale Image Classification Datasets This is the official implementation of "Towards Good Pract

Sanja Fidler's Lab 52 Nov 22, 2022
[NeurIPS 2021]: Are Transformers More Robust Than CNNs? (Pytorch implementation & checkpoints)

Are Transformers More Robust Than CNNs? Pytorch implementation for NeurIPS 2021 Paper: Are Transformers More Robust Than CNNs? Our implementation is b

Yutong Bai 145 Dec 01, 2022
Band-Adaptive Spectral-Spatial Feature Learning Neural Network for Hyperspectral Image Classification

Band-Adaptive Spectral-Spatial Feature Learning Neural Network for Hyperspectral Image Classification

258 Dec 29, 2022
Regression Metrics Calculation Made easy for tensorflow2 and scikit-learn

Regression Metrics Installation To install the package from the PyPi repository you can execute the following command: pip install regressionmetrics I

Ashish Patel 11 Dec 16, 2022
Codes for paper "Towards Diverse Paragraph Captioning for Untrimmed Videos". CVPR 2021

Towards Diverse Paragraph Captioning for Untrimmed Videos This repository contains PyTorch implementation of our paper Towards Diverse Paragraph Capti

Yuqing Song 61 Oct 11, 2022
本项目是一个带有前端界面的垃圾分类项目,加载了训练好的模型参数,模型为efficientnetb4,暂时为40分类问题。

说明 本项目是一个带有前端界面的垃圾分类项目,加载了训练好的模型参数,模型为efficientnetb4,暂时为40分类问题。 python依赖 tf2.3 、cv2、numpy、pyqt5 pyqt5安装 pip install PyQt5 pip install PyQt5-tools 使用 程

4 May 04, 2022
Dynamica causal Bayesian optimisation

Dynamic Causal Bayesian Optimization This is a Python implementation of Dynamic Causal Bayesian Optimization as presented at NeurIPS 2021. Abstract Th

nd308 18 Nov 22, 2022
Implementation of character based convolutional neural network

Character Based CNN This repo contains a PyTorch implementation of a character-level convolutional neural network for text classification. The model a

Ahmed BESBES 248 Nov 21, 2022
PyTorch Implementation for AAAI'21 "Do Response Selection Models Really Know What's Next? Utterance Manipulation Strategies for Multi-turn Response Selection"

UMS for Multi-turn Response Selection Implements the model described in the following paper Do Response Selection Models Really Know What's Next? Utte

Taesun Whang 47 Nov 22, 2022
This is the official Pytorch-version code of FlatGCN (Flattened Graph Convolutional Networks for Recommendation).

FlatGCN This is the official Pytorch-version code of FlatGCN (Flattened Graph Convolutional Networks for Recommendation, submitted to ICASSP2022). Req

Dreamer 2 Aug 09, 2022