PyTorch implementation of "LayoutTransformer: Layout Generation and Completion with Self-attention"

Overview

LayoutTransformer

arXiv | BibTeX | Project Page

This repo contains code for single GPU training of LayoutTransformer from LayoutTransformer: Layout Generation and Completion with Self-attention. This code was rewritten from scratch using a cleaner GPT codebase. Some of the details such as training hyperparameters might differ from the arxiv version of the paper.

teaser!

How To Use This Code

Start a new conda environment

conda env create -f environment.yml
conda activate layout

or update an existing environment

conda env update -f environment.yml --prune

Logging with wandb

In order to log experiments to wandb, we use wandb's API keys that can be found here https://wandb.ai/settings. Copy your key and store them in an environment variable using

export WANDB_API_KEY=
   

   

Alternately, you can also login using wandb login.

Datasets

COCO Bounding Boxes

See the instructions to obtain the dataset here.

PubLayNet Document Layouts

See the instructions to obtain the dataset here.

LayoutVAE

Reimplementation of LayoutVAE is here. Code contributed primarily by Justin.

cd layout_vae

# Train the CountVAE model
python train_counts.py \
    --exp count_coco_instances \
    --train_json /path/to/coco/annotations/instances_train2017.json \
    --val_json /path/to/coco/annotations/instances_val2017.json \
    --epochs 50

# Train the BoxVAE model
python train_counts.py \
    --exp box_coco_instances \
    --train_json /path/to/coco/annotations/instances_train2017.json \
    --val_json /path/to/coco/annotations/instances_val2017.json \
    --epochs 50

LayoutTransformer

Rewritten from scratch using a cleaner GPT codebase. Some of the details such as training hyperparameters might differ from the arxiv version.

# Training on MNIST layouts
python main.py \
    --data_dir /path/to/mnist \
    --threshold 1 --exp mnist_threshold_1
    
# Training on COCO bounding boxes or PubLayNet
python main.py \
    --train_json /path/to/annotations/train.json \
    --val_json /path/to/annotations/val.json \
    --exp publaynet

BibTeX

If you use this code, please cite

@inproceedings{gupta2021layouttransformer,
  title={LayoutTransformer: Layout Generation and Completion with Self-attention},
  author={Gupta, Kamal and Lazarow, Justin and Achille, Alessandro and Davis, Larry S and Mahadevan, Vijay and Shrivastava, Abhinav},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={1004--1014},
  year={2021}
}
}

Acknowledgments

We would like to thank several public repos

License

This repository is released under the Apache 2.0 license as found in the LICENSE file.

Implementation of CVPR'2022:Surface Reconstruction from Point Clouds by Learning Predictive Context Priors

Surface Reconstruction from Point Clouds by Learning Predictive Context Priors (CVPR 2022) Personal Web Pages | Paper | Project Page This repository c

136 Dec 12, 2022
render sprites into your desktop environment as shaped windows using GTK

spritegtk render static or animated sprites into your desktop environment as dynamic shaped windows using GTK requires pycairo and PYGobject: pip inst

hermit 20 Oct 27, 2022
Lipstick ain't enough: Beyond Color-Matching for In-the-Wild Makeup Transfer (CVPR 2021)

Table of Content Introduction Datasets Getting Started Requirements Usage Example Training & Evaluation CPM: Color-Pattern Makeup Transfer CPM is a ho

VinAI Research 248 Dec 13, 2022
A DeepStack custom model for detecting common objects in dark/night images and videos.

DeepStack_ExDark This repository provides a custom DeepStack model that has been trained and can be used for creating a new object detection API for d

MOSES OLAFENWA 98 Dec 24, 2022
Numerical-computing-is-fun - Learning numerical computing with notebooks for all ages.

As much as this series is to educate aspiring computer programmers and data scientists of all ages and all backgrounds, it is also a reminder to mysel

EKA foundation 758 Dec 25, 2022
10x faster matrix and vector operations

Bolt is an algorithm for compressing vectors of real-valued data and running mathematical operations directly on the compressed representations. If yo

2.3k Jan 09, 2023
Adversarially Learned Inference

Adversarially Learned Inference Code for the Adversarially Learned Inference paper. Compiling the paper locally From the repo's root directory, $ cd p

Mohamed Ishmael Belghazi 308 Sep 24, 2022
A Dataset for Direct Quotation Extraction and Attribution in News Articles.

DirectQuote - A Dataset for Direct Quotation Extraction and Attribution in News Articles DirectQuote is a corpus containing 19,760 paragraphs and 10,3

THUNLP-MT 9 Sep 23, 2022
some classic model used to segment the medical images like CT、X-ray and so on

github_project This is a project for medical image segmentation. This project includes common medical image segmentation models such as U-net, FCN, De

2 Mar 30, 2022
Official DGL implementation of "Rethinking High-order Graph Convolutional Networks"

SE Aggregation This is the implementation for Rethinking High-order Graph Convolutional Networks. Here we show the codes for citation networks as an e

Tianqi Zhang (张天启) 32 Jul 19, 2022
Galileo library for large scale graph training by JD

近年来,图计算在搜索、推荐和风控等场景中获得显著的效果,但也面临超大规模异构图训练,与现有的深度学习框架Tensorflow和PyTorch结合等难题。 Galileo(伽利略)是一个图深度学习框架,具备超大规模、易使用、易扩展、高性能、双后端等优点,旨在解决超大规模图算法在工业级场景的落地难题,提

JD Galileo Team 128 Nov 29, 2022
Monify: an Expense tracker Program implemented in a Graphical User Interface that allows users to keep track of their expenses

💳 MONIFY (EXPENSE TRACKER PRO) 💳 Description Monify is an Expense tracker Program implemented in a Graphical User Interface allows users to add inco

Moyosore Weke 1 Dec 14, 2021
NitroFE is a Python feature engineering engine which provides a variety of modules designed to internally save past dependent values for providing continuous calculation.

NitroFE is a Python feature engineering engine which provides a variety of modules designed to internally save past dependent values for providing continuous calculation.

100 Sep 28, 2022
Machine Unlearning with SISA

Machine Unlearning with SISA Lucas Bourtoule, Varun Chandrasekaran, Christopher Choquette-Choo, Hengrui Jia, Adelin Travers, Baiwu Zhang, David Lie, N

CleverHans Lab 70 Jan 01, 2023
DeLighT: Very Deep and Light-Weight Transformers

DeLighT: Very Deep and Light-weight Transformers This repository contains the source code of our work on building efficient sequence models: DeFINE (I

Sachin Mehta 440 Dec 18, 2022
GLANet - The code for Global and Local Alignment Networks for Unpaired Image-to-Image Translation arxiv

GLANet The code for Global and Local Alignment Networks for Unpaired Image-to-Image Translation arxiv Framework: visualization results: Getting Starte

stanley 29 Dec 14, 2022
[CVPR'21] Locally Aware Piecewise Transformation Fields for 3D Human Mesh Registration

Locally Aware Piecewise Transformation Fields for 3D Human Mesh Registration This repository contains the implementation of our paper Locally Aware Pi

sfwang 70 Dec 19, 2022
Vrcwatch - Supply the local time to VRChat as Avatar Parameters through OSC

English: README-EN.md VRCWatch VRCWatch は、VRChat 内のアバター向けに現在時刻を送信するためのプログラムです。 使

Kosaki Mezumona 17 Nov 30, 2022
Automatic self-diagnosis program (python required)Automatic self-diagnosis program (python required)

auto-self-checker 자동으로 자가진단 해주는 프로그램(python 필요) 중요 이 프로그램이 실행될때에는 절대로 마우스포인터를 움직이거나 키보드를 건드리면 안된다(화면인식, 마우스포인터로 직접 클릭) 사용법 프로그램을 구동할 폴더 내의 cmd창에서 pip

1 Dec 30, 2021
Pose Detection and Machine Learning for real-time body posture analysis during exercise to provide audiovisual feedback on improvement of form.

Posture: Pose Tracking and Machine Learning for prescribing corrective suggestions to improve posture and form while exercising. This repository conta

Pratham Mehta 10 Nov 11, 2022