Summary of related papers on visual attention

Overview

This repo accompanies the paper Attention Mechanisms in Computer Vision: A Survey.


🔥 (citations > 200)

  • TODO: Code for the different attention mechanisms will come soon.
  • TODO: Code link will come soon.
  • TODO: Collect more related papers. Contributions are welcome.

Channel attention

  • Squeeze-and-Excitation Networks(CVPR2018) pdf, (PAMI2019 version) pdf 🔥
  • Image super-resolution using very deep residual channel attention networks(ECCV2018) pdf 🔥
  • Context encoding for semantic segmentation(CVPR2018) pdf 🔥
  • Spatio-temporal channel correlation networks for action classification(ECCV2018) pdf
  • Global second-order pooling convolutional networks(CVPR2019) pdf
  • SRM: A style-based recalibration module for convolutional neural networks(ICCV2019) pdf
  • You look twice: Gaternet for dynamic filter selection in cnns(CVPR2019) pdf
  • Second-order attention network for single image super-resolution(CVPR2019) pdf 🔥
  • Spsequencenet: Semantic segmentation network on 4d point clouds(CVPR2020) pdf
  • Ecanet: Efficient channel attention for deep convolutional neural networks (CVPR2020) pdf 🔥
  • Gated channel transformation for visual recognition(CVPR2020) pdf
  • Fcanet: Frequency channel attention networks(ICCV2021) pdf

Spatial attention

  • Recurrent models of visual attention(NeurIPS2014) pdf 🔥
  • Show, attend and tell: Neural image caption generation with visual attention(PMLR2015) pdf 🔥
  • Draw: A recurrent neural network for image generation(ICML2015) pdf 🔥
  • Spatial transformer networks(NeurIPS2015) pdf 🔥
  • Multiple object recognition with visual attention(ICLR2015) pdf 🔥
  • Action recognition using visual attention(arXiv2015) pdf 🔥
  • Videolstm convolves, attends and flows for action recognition(arXiv2016) pdf 🔥
  • Look closer to see better: Recurrent attention convolutional neural network for fine-grained image recognition(CVPR2017) pdf 🔥
  • Learning multi-attention convolutional neural network for fine-grained image recognition(ICCV2017) pdf 🔥
  • Diversified visual attention networks for fine-grained object classification(TMM2017) pdf 🔥
  • Attentional pooling for action recognition(NeurIPS2017) pdf 🔥
  • Non-local neural networks(CVPR2018) pdf 🔥
  • Attentional shapecontextnet for point cloud recognition(CVPR2018) pdf
  • Relation networks for object detection(CVPR2018) pdf 🔥
  • a2-nets: Double attention networks(NeurIPS2018) pdf 🔥
  • Attention-aware compositional network for person re-identification(CVPR2018) pdf 🔥
  • Tell me where to look: Guided attention inference network(CVPR2018) pdf 🔥
  • Pedestrian alignment network for large-scale person re-identification(TCSVT2018) pdf 🔥
  • Learn to pay attention(ICLR2018) pdf 🔥
  • Attention U-Net: Learning Where to Look for the Pancreas(MIDL2018) pdf 🔥
  • Psanet: Point-wise spatial attention network for scene parsing(ECCV2018) pdf 🔥
  • Self-attention generative adversarial networks(ICML2019) pdf 🔥
  • Attentional pointnet for 3d-object detection in point clouds(CVPRW2019) pdf
  • Co-occurrent features in semantic segmentation(CVPR2019) pdf
  • Attention augmented convolutional networks(ICCV2019) pdf 🔥
  • Local relation networks for image recognition(ICCV2019) pdf
  • Latentgnn: Learning efficient nonlocal relations for visual recognition(ICML2019) pdf
  • Graph-based global reasoning networks(CVPR2019) pdf 🔥
  • Gcnet: Non-local networks meet squeeze-excitation networks and beyond(ICCVW2019) pdf 🔥
  • Asymmetric non-local neural networks for semantic segmentation(ICCV2019) pdf 🔥
  • Looking for the devil in the details: Learning trilinear attention sampling network for fine-grained image recognition(CVPR2019) pdf
  • Second-order non-local attention networks for person re-identification(ICCV2019) pdf 🔥
  • End-to-end comparative attention networks for person re-identification(ICCV2019) pdf 🔥
  • Modeling point clouds with self-attention and gumbel subset sampling(CVPR2019) pdf
  • Diagnose like a radiologist: Attention guided convolutional neural network for thorax disease classification(arXiv 2019) pdf
  • L2g autoencoder: Understanding point clouds by local-to-global reconstruction with hierarchical self-attention(arXiv 2019) pdf
  • Generative pretraining from pixels(PMLR2020) pdf
  • Exploring self-attention for image recognition(CVPR2020) pdf
  • Cf-sis: Semantic-instance segmentation of 3d point clouds by context fusion with self attention(MM20) pdf
  • Disentangled non-local neural networks(ECCV2020) pdf
  • Relation-aware global attention for person re-identification(CVPR2020) pdf
  • Segmentation transformer: Object-contextual representations for semantic segmentation(ECCV2020) pdf 🔥
  • Spatial pyramid based graph reasoning for semantic segmentation(CVPR2020) pdf
  • Self-supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation(CVPR2020) pdf
  • End-to-end object detection with transformers(ECCV2020) pdf 🔥
  • Pointasnl: Robust point clouds processing using nonlocal neural networks with adaptive sampling(CVPR2020) pdf
  • Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers(CVPR2021) pdf
  • An image is worth 16x16 words: Transformers for image recognition at scale(ICLR2021) pdf 🔥
  • An empirical study of training self-supervised vision transformers(CVPR2021) pdf
  • Ocnet: Object context network for scene parsing(IJCV 2021) pdf 🔥
  • Point transformer(ICCV 2021) pdf
  • PCT: Point Cloud Transformer (CVMJ 2021) pdf
  • Pre-trained image processing transformer(CVPR 2021) pdf
  • An empirical study of training self-supervised vision transformers(ICCV 2021) pdf
  • Segformer: Simple and efficient design for semantic segmentation with transformers(arXiv 2021) pdf
  • Beit: Bert pre-training of image transformers(arXiv 2021) pdf
  • Beyond self-attention: External attention using two linear layers for visual tasks(arXiv 2021) pdf
  • Query2label: A simple transformer way to multi-label classification(arXiv 2021) pdf
  • Transformer in transformer(arXiv 2021) pdf

Temporal attention

  • Jointly attentive spatial-temporal pooling networks for video-based person re-identification (ICCV 2017) pdf 🔥
  • Video person re-identification with competitive snippet-similarity aggregation and co-attentive snippet embedding(CVPR 2018) pdf
  • Scan: Self-and-collaborative attention network for video person re-identification (TIP 2019) pdf

Branch attention

  • Training very deep networks (NeurIPS 2015) pdf 🔥
  • Selective kernel networks (CVPR 2019) pdf 🔥
  • CondConv: Conditionally Parameterized Convolutions for Efficient Inference (NeurIPS 2019) pdf
  • Dynamic convolution: Attention over convolution kernels (CVPR 2020) pdf
  • ResNeSt: Split-attention networks (arXiv 2020) pdf 🔥

Channel & Spatial attention

  • Residual attention network for image classification (CVPR 2017) pdf 🔥
  • SCA-CNN: spatial and channel-wise attention in convolutional networks for image captioning (CVPR 2017) pdf 🔥
  • CBAM: convolutional block attention module (ECCV 2018) pdf 🔥
  • Harmonious attention network for person re-identification (CVPR 2018) pdf 🔥
  • Recalibrating fully convolutional networks with spatial and channel “squeeze and excitation” blocks (TMI 2018) pdf
  • Mancs: A multi-task attentional network with curriculum sampling for person re-identification (ECCV 2018) pdf 🔥
  • Bam: Bottleneck attention module(BMVC 2018) pdf 🔥
  • Pvnet: A joint convolutional network of point cloud and multi-view for 3d shape recognition (ACM MM 2018) pdf
  • Learning what and where to attend (ICLR 2019) pdf
  • Dual attention network for scene segmentation (CVPR 2019) pdf 🔥
  • Abd-net: Attentive but diverse person re-identification (ICCV 2019) pdf
  • Mixed high-order attention network for person re-identification (ICCV 2019) pdf
  • Mlcvnet: Multi-level context votenet for 3d object detection (CVPR 2020) pdf
  • Improving convolutional networks with self-calibrated convolutions (CVPR 2020) pdf
  • Relation-aware global attention for person re-identification (CVPR 2020) pdf
  • Strip Pooling: Rethinking spatial pooling for scene parsing (CVPR 2020) pdf
  • Rotate to attend: Convolutional triplet attention module (WACV 2021) pdf
  • Coordinate attention for efficient mobile network design (CVPR 2021) pdf
  • Simam: A simple, parameter-free attention module for convolutional neural networks (ICML 2021) pdf

Spatial & Temporal attention

  • An end-to-end spatio-temporal attention model for human action recognition from skeleton data(AAAI 2017) pdf 🔥
  • Diversity regularized spatiotemporal attention for video-based person re-identification (arXiv 2018) 🔥
  • Interpretable spatio-temporal attention for video action recognition (ICCVW 2019) pdf
  • Hierarchical lstms with adaptive attention for visual captioning (TPAMI 2020) pdf
  • Stat: Spatial-temporal attention mechanism for video captioning (TMM 2020) pdf_link
  • Gta: Global temporal attention for video action understanding (arXiv 2020) pdf
  • Multi-granularity reference-aided attentive feature aggregation for video-based person re-identification (CVPR 2020) pdf
  • Read: Reciprocal attention discriminator for image-to-video re-identification (ECCV 2020) pdf
  • Decoupled spatial-temporal transformer for video inpainting (arXiv 2021) pdf
Owner
MenghaoGuo
Second-year Ph.D candidate at G2 group, Tsinghua University.