Locally Enhanced Self-Attention: Rethinking Self-Attention as Local and Context Terms

Last update: Dec 31, 2021

LESA

Introduction

This repository contains the official implementation of "Locally Enhanced Self-Attention: Rethinking Self-Attention as Local and Context Terms". The image classification code is based on axial-deeplab, and the object detection code is based on mmdetection.
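The title frames self-attention as the sum of a local term and a context term. The sketch below only illustrates that split and is not the repository's code: it assumes a 1D sequence of scalar features, and the function names and the `radius` parameter are hypothetical. It checks that the attention output at one position equals the contribution from a small neighbourhood plus the contribution from all remaining positions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def full_attention(queries, keys, values, i):
    """Standard dot-product attention output at position i (scalar features)."""
    weights = softmax([queries[i] * k for k in keys])
    return sum(w * v for w, v in zip(weights, values))


def attention_terms(queries, keys, values, i, radius=1):
    """Split the output at position i into (local, context) terms.

    `radius` is a hypothetical neighbourhood size chosen for illustration.
    """
    weights = softmax([queries[i] * k for k in keys])
    local = 0.0
    context = 0.0
    for j, (w, v) in enumerate(zip(weights, values)):
        if abs(j - i) <= radius:
            local += w * v      # positions inside the local window
        else:
            context += w * v    # all other positions
    return local, context


# Toy data, made up for the example.
q = [0.2, -0.5, 1.0, 0.3, -0.1]
k = [0.4, 0.1, -0.3, 0.8, 0.5]
v = [1.0, 2.0, 3.0, 4.0, 5.0]

local_term, context_term = attention_terms(q, k, v, i=2, radius=1)
total = full_attention(q, k, v, i=2)
```

Here `local_term + context_term` matches `total` up to floating-point rounding, because both terms share the same softmax weights. How LESA treats each term is described in the paper and in LESA_classification.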

Citing LESA

If you find LESA helpful in your project, please consider citing our paper.

@article{yang2021locally,
  title={Locally Enhanced Self-Attention: Rethinking Self-Attention as Local and Context Terms},
  author={Yang, Chenglin and Qiao, Siyuan and Kortylewski, Adam and Yuille, Alan},
  journal={arXiv preprint arXiv:2107.05637},
  year={2021}
}

Main Results on ImageNet

Please refer to LESA_classification for details.

Method	Model	Top-1 Acc. (%)	Top-5 Acc. (%)
LESA_ResNet50	Download	79.55	94.79
LESA_WRN50	Download	80.18	95.07

Main Results on COCO test-dev

Please refer to LESA_detection for details.

Method	Backbone	Pretrained	Model	Box AP	Mask AP
Mask-RCNN	LESA_ResNet50	Download	Download	44.2	39.6
HTC	LESA_WRN50	Download	Download	50.5	44.4

Credits

This project is based on axial-deeplab and mmdetection.

Relative position embedding is based on bottleneck-transformer-pytorch.

ResNet is based on pytorch/vision. Classification helper functions are based on pytorch-classification.

Owner

Chenglin Yang
