This repository builds a basic vision transformer from scratch so that a beginner can understand the theory behind vision transformers.

Last update: Dec 24, 2021

Overview

vision-transformer-from-scratch

This repository implements several kinds of vision transformer from scratch so that a beginner can easily understand the theory behind them. The basic transformer, the Linformer and the Swin Transformer have all been trained and tested.

Requirements: PyTorch (>= 1.6.0); Python 3.6.9; NumPy (1.18.2); OpenCV; Linformer
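The PyTorch requirement above is a minimum version, and comparing version strings directly can give the wrong answer (for example, "1.10.0" sorts before "1.6.0" as text). The following is a minimal sketch of a numeric comparison; the helper names `version_tuple` and `meets_minimum` are hypothetical and are not part of this repository.

```python
def version_tuple(version):
    """Turn a string such as '1.10.0+cu113' into (1, 10, 0).

    Anything after '+' is ignored, parsing stops at the first
    non-numeric piece, and the result is padded to three fields.
    """
    core = version.split("+")[0]
    parts = []
    for piece in core.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum`, compared numerically."""
    return version_tuple(installed) >= version_tuple(minimum)


# Numeric comparison handles two-digit components correctly.
print(meets_minimum("1.10.0", "1.6.0"))  # True
# Plain string comparison does not.
print("1.10.0" >= "1.6.0")  # False
```

In practice the installed version string would come from the package itself (for PyTorch, `torch.__version__`).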

Train the model: python main_train.py. In main_train.py, either the basic transformer or the Linformer can be selected.

Test the model: python test.py. In main_train.py, either the basic transformer or the Linformer can be selected.

For the theory of vision transformers, see the following references: https://towardsdatascience.com/implementing-visualttransformer-in-pytorch-184f9f16f632; https://www.kaggle.com/hannes82/vision-transformer-trained-from-scratch-pytorch
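As a small taste of that theory, the sketch below works through the token arithmetic of a vision transformer: an image is cut into non-overlapping patches, each patch becomes one token, and attention compares tokens with each other. It is an illustration under stated assumptions, not the code in this repository: the 224x224 image, 16x16 patch size, extra class token and Linformer projection length of 64 are example values chosen here, and the function names are hypothetical.

```python
def num_patches(image_size, patch_size):
    """Number of non-overlapping square patches in a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side


def patch_dim(patch_size, channels=3):
    """Length of one flattened patch (before any linear projection)."""
    return patch_size * patch_size * channels


def attention_entries(seq_len, projected_len=None):
    """Entries in one attention score matrix.

    Standard self-attention: seq_len x seq_len.
    Linformer-style attention: keys and values are projected along the
    sequence axis to projected_len, giving seq_len x projected_len.
    """
    if projected_len is None:
        return seq_len * seq_len
    return seq_len * projected_len


# Assumed example values, not taken from the repository.
image_size, patch_size = 224, 16
n = num_patches(image_size, patch_size)
seq_len = n + 1  # assumes one extra class token is prepended

print(n, patch_dim(patch_size), seq_len)             # 196 768 197
print(attention_entries(seq_len))                    # 38809
print(attention_entries(seq_len, projected_len=64))  # 12608
```

The last two numbers show why a Linformer-style variant is attractive: the score matrix grows linearly with the number of patches instead of quadratically.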

Point detection through multi-instance deep heatmap regression for sutures in endoscopy

Suture detection PyTorch This repo contains the reference implementation of suture detection model in PyTorch for the paper Point detection through mu

3 Jul 16, 2022

Image Restoration Toolbox (PyTorch). Training and testing codes for DPIR, USRNet, DnCNN, FFDNet, SRMD, DPSR, BSRGAN, SwinIR

2k Dec 31, 2022

Official codes for the paper "Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech"

AnimationKit: AI Upscaling & Interpolation using Real-ESRGAN+RIFE

Implementation of UNET architecture for Image Segmentation.

Pytorch implementation of Supporting Clustering with Contrastive Learning, NAACL 2021

The Adapter-Bot: All-In-One Controllable Conversational Model

Repo for EMNLP 2021 paper "Beyond Preserved Accuracy: Evaluating Loyalty and Robustness of BERT Compression"

Learning to Identify Top Elo Ratings with A Dueling Bandits Approach

An open-source Kazakh named entity recognition dataset (KazNERD), annotation guidelines, and baseline NER models.

Pyramid Scene Parsing Network, CVPR2017.

A Japanese Medical Information Extraction Toolkit

All materials of Cassandra Event, Udyam'22

A bare-bones Python library for quality diversity optimization.

QuickAI is a Python library that makes it extremely easy to experiment with state-of-the-art Machine Learning models.

Code for layerwise detection of linguistic anomaly paper (ACL 2021)

Repository for "Exploring Sparsity in Image Super-Resolution for Efficient Inference", CVPR 2021

Fast and customizable reconnaissance workflow tool based on simple YAML based DSL.

A simple image/video to Desmos graph converter run locally

YOLOX + ROS(1, 2) object detection package