Implementation of Pix2Seq in PyTorch

Overview

pix2seq-pytorch

Implementation of Pix2Seq paper

Different from the paper

  • image input size 1280
  • bin size 1280
  • LambdaLR scheduler used instead of LinearLR
  • resnet50 instead of resnet50d or resnet101
  • etc.

Dataset

Download first coco2017 dataset and put it under dataset folder.

- dataset
  - annotations
    - instances_train2017.json
    - instances_val2017.json
  - train2017
    - 000000000000.jpg
    - ...
  - val2017

Train

python train.py --config configs/pix2seq.yaml
Owner
Tony Shin
University of Oxford MSc Computer Science Graduate
Tony Shin
Implementing Vision Transformer (ViT) in PyTorch

Lightning-Hydra-Template A clean and scalable template to kickstart your deep learning project 🚀 ⚡ 🔥 Click on Use this template to initialize new re

2 Dec 24, 2021
Code for Talk-to-Edit (ICCV2021). Paper: Talk-to-Edit: Fine-Grained Facial Editing via Dialog.

Talk-to-Edit (ICCV2021) This repository contains the implementation of the following paper: Talk-to-Edit: Fine-Grained Facial Editing via Dialog Yumin

Yuming Jiang 221 Jan 07, 2023
Vehicles Counting using YOLOv4 + DeepSORT + Flask + Ngrok

A project for counting vehicles using YOLOv4 + DeepSORT + Flask + Ngrok

Duong Tran Thanh 37 Dec 16, 2022
Code for the Paper: Alexandra Lindt and Emiel Hoogeboom.

Discrete Denoising Flows This repository contains the code for the experiments presented in the paper Discrete Denoising Flows [1]. To give a short ov

Alexandra Lindt 3 Oct 09, 2022
A simplistic and efficient pure-python neural network library from Phys Whiz with CPU and GPU support.

A simplistic and efficient pure-python neural network library from Phys Whiz with CPU and GPU support.

Manas Sharma 19 Feb 28, 2022
Classification Modeling: Probability of Default

Credit Risk Modeling in Python Introduction: If you've ever applied for a credit card or loan, you know that financial firms process your information

Aktham Momani 2 Nov 07, 2022
Gesture recognition on Event Data

Event based Gesture Recognition Gesture recognition on Event Data usually involv

2 Feb 14, 2022
Federated Learning - Including common test models for federated learning, like CNN, Resnet18 and lstm, controlled by different parser

Federated_Learning 💻 This projest include common test models for federated lear

TianyuQi 10 Dec 11, 2022
Final term project for Bayesian Machine Learning Lecture (XAI-623)

Mixquality_AL Final Term Project For Bayesian Machine Learning Lecture (XAI-623) Youtube Link The presentation is given in YoutubeLink Problem Formula

JeongEun Park 3 Jan 18, 2022
A production-ready, scalable Indexer for the Jina neural search framework, based on HNSW and PSQL

🌟 HNSW + PostgreSQL Indexer HNSWPostgreSQLIndexer Jina is a production-ready, scalable Indexer for the Jina neural search framework. It combines the

Jina AI 25 Oct 14, 2022
Sign Language Transformers (CVPR'20)

Sign Language Transformers (CVPR'20) This repo contains the training and evaluation code for the paper Sign Language Transformers: Sign Language Trans

Necati Cihan Camgoz 164 Dec 30, 2022
Source code for our paper "Molecular Mechanics-Driven Graph Neural Network with Multiplex Graph for Molecular Structures"

Molecular Mechanics-Driven Graph Neural Network with Multiplex Graph for Molecular Structures Code for the Multiplex Molecular Graph Neural Network (M

shzhang 59 Dec 10, 2022
Human Activity Recognition example using TensorFlow on smartphone sensors dataset and an LSTM RNN. Classifying the type of movement amongst six activity categories - Guillaume Chevalier

LSTMs for Human Activity Recognition Human Activity Recognition (HAR) using smartphones dataset and an LSTM RNN. Classifying the type of movement amon

Guillaume Chevalier 3.1k Dec 30, 2022
When in Doubt: Improving Classification Performance with Alternating Normalization

When in Doubt: Improving Classification Performance with Alternating Normalization Findings of EMNLP 2021 Menglin Jia, Austin Reiter, Ser-Nam Lim, Yoa

Menglin Jia 13 Nov 06, 2022
gym-anm is a framework for designing reinforcement learning (RL) environments that model Active Network Management (ANM) tasks in electricity distribution networks.

gym-anm is a framework for designing reinforcement learning (RL) environments that model Active Network Management (ANM) tasks in electricity distribution networks. It is built on top of the OpenAI G

Robin Henry 99 Dec 12, 2022
A clean and extensible PyTorch implementation of Masked Autoencoders Are Scalable Vision Learners

A clean and extensible PyTorch implementation of Masked Autoencoders Are Scalable Vision Learners A PyTorch re-implementation of Mask Autoencoder trai

Tianyu Hua 23 Dec 13, 2022
Python package to add text to images, textures and different backgrounds

nider Python package for text images generation and watermarking Free software: MIT license Documentation: https://nider.readthedocs.io. nider is an a

Vladyslav Ovchynnykov 131 Dec 30, 2022
Robotics with GPU computing

Robotics with GPU computing Cupoch is a library that implements rapid 3D data processing for robotics using CUDA. The goal of this library is to imple

Shirokuma 625 Jan 07, 2023
Pixel-Perfect Structure-from-Motion with Featuremetric Refinement (ICCV 2021, Oral)

Pixel-Perfect Structure-from-Motion (ICCV 2021 Oral) We introduce a framework that improves the accuracy of Structure-from-Motion by refining keypoint

Computer Vision and Geometry Lab 831 Dec 29, 2022
PyTorch implementation for "HyperSPNs: Compact and Expressive Probabilistic Circuits", NeurIPS 2021

HyperSPN This repository contains code for the paper: HyperSPNs: Compact and Expressive Probabilistic Circuits "HyperSPNs: Compact and Expressive Prob

8 Nov 08, 2022