This repository contains demos I made with the Transformers library by HuggingFace.

Overview

Transformers-Tutorials

Hi there!

This repository contains demos I made with the Transformers library by 🤗 HuggingFace. Currently, all of them are implemented in PyTorch.

NOTE: if you are not familiar with HuggingFace and/or Transformers, I highly recommend to check out our free course, which introduces you to several Transformer architectures (such as BERT, GPT-2, T5, BART, etc.), as well as an overview of the HuggingFace libraries, including Transformers, Tokenizers, Datasets, Accelerate and the hub.

Currently, it contains the following demos:

  • BERT (paper):
    • fine-tuning BertForTokenClassification on a named entity recognition (NER) dataset. Open In Colab
    • fine-tuning BertForSequenceClassification for multi-label text classification. Open In Colab
  • CANINE (paper):
    • fine-tuning CanineForSequenceClassification on IMDb Open In Colab
  • DETR (paper):
    • performing inference with DetrForObjectDetection Open In Colab
    • fine-tuning DetrForObjectDetection on a custom object detection dataset Open In Colab
    • evaluating DetrForObjectDetection on the COCO detection 2017 validation set Open In Colab
    • performing inference with DetrForSegmentation Open In Colab
    • fine-tuning DetrForSegmentation on COCO panoptic 2017 Open In Colab
  • GPT-J-6B (repository):
    • performing inference with GPTJForCausalLM to illustrate few-shot learning and code generation Open In Colab
  • ImageGPT (blog post):
    • (un)conditional image generation with ImageGPTForCausalLM Open In Colab
    • linear probing with ImageGPT Open In Colab
  • LayoutLM (paper):
    • fine-tuning LayoutLMForTokenClassification on the FUNSD dataset Open In Colab
    • fine-tuning LayoutLMForSequenceClassification on the RVL-CDIP dataset Open In Colab
    • adding image embeddings to LayoutLM during fine-tuning on the FUNSD dataset Open In Colab
  • LayoutLMv2 (paper):
    • fine-tuning LayoutLMv2ForSequenceClassification on RVL-CDIP Open In Colab
    • fine-tuning LayoutLMv2ForTokenClassification on FUNSD Open In Colab
    • fine-tuning LayoutLMv2ForTokenClassification on FUNSD using the 🤗 Trainer Open In Colab
    • performing inference with LayoutLMv2ForTokenClassification on FUNSD Open In Colab
    • true inference with LayoutLMv2ForTokenClassification (when no labels are available) + Gradio demo Open In Colab
    • fine-tuning LayoutLMv2ForTokenClassification on CORD Open In Colab
    • fine-tuning LayoutLMv2ForQuestionAnswering on DOCVQA Open In Colab
  • LUKE (paper):
    • fine-tuning LukeForEntityPairClassification on a custom relation extraction dataset using PyTorch Lightning Open In Colab
  • SegFormer (paper):
    • performing inference with SegformerForSemanticSegmentation Open In Colab
    • fine-tuning SegformerForSemanticSegmentation on custom data using native PyTorch Open In Colab
  • Perceiver IO (paper):
    • showcasing masked language modeling and image classification with the Perceiver Open In Colab
    • fine-tuning the Perceiver for image classification Open In Colab
    • fine-tuning the Perceiver for text classification Open In Colab
    • predicting optical flow between a pair of images with PerceiverForOpticalFlowOpen In Colab
    • auto-encoding a video (images, audio, labels) with PerceiverForMultimodalAutoencoding Open In Colab
  • T5 (paper):
    • fine-tuning T5ForConditionalGeneration on a Dutch summarization dataset on TPU using HuggingFace Accelerate Open In Colab
    • fine-tuning T5ForConditionalGeneration (CodeT5) for Ruby code summarization using PyTorch Lightning Open In Colab
  • TAPAS (paper):
  • TrOCR (paper):
    • performing inference with TrOCR to illustrate optical character recognition with Transformers, as well as making a Gradio demo Open In Colab
    • fine-tuning TrOCR on the IAM dataset using the Seq2SeqTrainer Open In Colab
    • fine-tuning TrOCR on the IAM dataset using native PyTorch Open In Colab
    • evaluating TrOCR on the IAM test set Open In Colab
  • Vision Transformer (paper):
    • performing inference with ViTForImageClassification Open In Colab
    • fine-tuning ViTForImageClassification on CIFAR-10 using PyTorch Lightning Open In Colab
    • fine-tuning ViTForImageClassification on CIFAR-10 using the 🤗 Trainer Open In Colab

... more to come! 🤗

If you have any questions regarding these demos, feel free to open an issue on this repository.

Btw, I was also the main contributor to add the following algorithms to the library:

  • TAbular PArSing (TAPAS) by Google AI
  • Vision Transformer (ViT) by Google AI
  • Data-efficient Image Transformers (DeiT) by Facebook AI
  • LUKE by Studio Ousia
  • DEtection TRansformers (DETR) by Facebook AI
  • CANINE by Google AI
  • BEiT by Microsoft Research
  • LayoutLMv2 (and LayoutXLM) by Microsoft Research
  • TrOCR by Microsoft Research
  • SegFormer by NVIDIA
  • ImageGPT by OpenAI
  • Perceiver by Deepmind

All of them were an incredible learning experience. I can recommend anyone to contribute an AI algorithm to the library!

Data preprocessing

Regarding preparing your data for a PyTorch model, there are a few options:

  • a native PyTorch dataset + dataloader. This is the standard way to prepare data for a PyTorch model, namely by subclassing torch.utils.data.Dataset, and then a creating corresponding DataLoader (which is a Python generator that allows to loop over the items of a dataset). When subclassing the Dataset class, one needs to implement 3 methods: __init__, __len__ (which returns the number of examples of the dataset) and __getitem__ (which returns an example of the dataset, given an integer index). Here's an example of creating a basic text classification dataset (assuming one has a CSV that contains 2 columns, namely "text" and "label"):
from torch.utils.data import Dataset

class CustomTrainDataset(Dataset):
    def __init__(self, df, tokenizer):
        self.df = df
        self.tokenizer = tokenizer

    def __len__(self):
        return len(self.df)

    def __getitem__(self, idx):
        # get item
        item = df.iloc[idx]
        text = item['text']
        label = item['label']
        # encode text
        encoding = self.tokenizer(text, padding="max_length", max_length=128, truncation=True, return_tensors="pt")
        # remove batch dimension which the tokenizer automatically adds
        encoding = {k:v.squeeze() for k,v in encoding.items()}
        # add label
        encoding["label"] = torch.tensor(label)
        
        return encoding

Instantiating the dataset then happens as follows:

from transformers import BertTokenizer
import pandas as pd

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
df = pd.read_csv("path_to_your_csv")

train_dataset = CustomTrainDataset(df=df tokenizer=tokenizer)

Accessing the first example of the dataset can then be done as follows:

encoding = train_dataset[0]

In practice, one creates a corresponding DataLoader, that allows to get batches from the dataset:

from torch.utils.data import DataLoader

train_dataloader = DataLoader(train_dataset, batch_size=4, shuffle=True)

I often check whether the data is created correctly by fetching the first batch from the data loader, and then printing out the shapes of the tensors, decoding the input_ids back to text, etc.

batch = next(iter(train_dataloader))
for k,v in batch.items():
    print(k, v.shape)
# decode the input_ids of the first example of the batch
print(tokenizer.decode(batch['input_ids'][0].tolist())
  • HuggingFace Datasets. Datasets is a library by HuggingFace that allows to easily load and process data in a very fast and memory-efficient way. It is backed by Apache Arrow, and has cool features such as memory-mapping, which allow you to only load data into RAM when it is required. It only has deep interoperability with the HuggingFace hub, allowing to easily load well-known datasets as well as share your own with the community.

Loading a custom dataset as a Dataset object can be done as follows (you can install datasets using pip install datasets):

from datasets import load_dataset

dataset = load_dataset('csv', data_files={'train': ['my_train_file_1.csv', 'my_train_file_2.csv'] 'test': 'my_test_file.csv'})

Here I'm loading local CSV files, but there are other formats supported (including JSON, Parquet, txt) as well as loading data from a local Pandas dataframe or dictionary for instance. You can check out the docs for all details.

Training frameworks

Regarding fine-tuning Transformer models (or more generally, PyTorch models), there are a few options:

  • using native PyTorch. This is the most basic way to train a model, and requires the user to manually write the training loop. The advantage is that this is very easy to debug. The disadvantage is that one needs to implement training him/herself, such as setting the model in the appropriate mode (model.train()/model.eval()), handle device placement (model.to(device)), etc. A typical training loop in PyTorch looks as follows (inspired by this great PyTorch intro tutorial):
import torch

model = ...

# I almost always use a learning rate of 5e-5 when fine-tuning Transformer based models
optimizer = torch.optim.Adam(model.parameters(), lr=5-e5)

# put model on GPU, if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

for epoch in range(epochs):
    model.train()
    train_loss = 0.0
    for batch in train_dataloader:
        # put batch on device
        batch = {k:v.to(device) for k,v in batch.items()}
        
        # forward pass
        outputs = model(**batch)
        loss = outputs.loss
        
        train_loss += loss.item()
        
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

    print("Loss after epoch {epoch}:", train_loss/len(train_dataloader))
    
    model.eval()
    val_loss = 0.0
    with torch.no_grad():
        for batch in eval_dataloader:
            # put batch on device
            batch = {k:v.to(device) for k,v in batch.items()}
            
            # forward pass
            outputs = model(**batch)
            loss = outputs.logits
            
            val_loss += loss.item()
                  
    print("Validation loss after epoch {epoch}:", val_loss/len(eval_dataloader))
  • PyTorch Lightning (PL). PyTorch Lightning is a framework that automates the training loop written above, by abstracting it away in a Trainer object. Users don't need to write the training loop themselves anymore, instead they can just do trainer = Trainer() and then trainer.fit(model). The advantage is that you can start training models very quickly (hence the name lightning), as all training-related code is handled by the Trainer object. The disadvantage is that it may be more difficult to debug your model, as the training and evaluation is now abstracted away.
  • HuggingFace Trainer. The HuggingFace Trainer API can be seen as a framework similar to PyTorch Lightning in the sense that it also abstracts the training away using a Trainer object. However, contrary to PyTorch Lightning, it is not meant not be a general framework. Rather, it is made especially for fine-tuning Transformer-based models available in the HuggingFace Transformers library. The Trainer also has an extension called Seq2SeqTrainer for encoder-decoder models, such as BART, T5 and the EncoderDecoderModel classes. Note that all PyTorch example scripts of the Transformers library make use of the Trainer.
  • HuggingFace Accelerate: Accelerate is a new project, that is made for people who still want to write their own training loop (as shown above), but would like to make it work automatically irregardless of the hardware (i.e. multiple GPUs, TPU pods, mixed precision, etc.).
Owner
ML @HuggingFace. Interested in deep learning, NLP. Contributed TAPAS, ViT, DeiT, LUKE, DETR, CANINE to HuggingFace Transformers
TransMIL: Transformer based Correlated Multiple Instance Learning for Whole Slide Image Classification

TransMIL: Transformer based Correlated Multiple Instance Learning for Whole Slide Image Classification [NeurIPS 2021] Abstract Multiple instance learn

132 Dec 30, 2022
Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.

WECHSEL Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models. arXiv: https://arx

Institute of Computational Perception 45 Dec 29, 2022
This repository is an official implementation of the paper MOTR: End-to-End Multiple-Object Tracking with TRansformer.

MOTR: End-to-End Multiple-Object Tracking with TRansformer This repository is an official implementation of the paper MOTR: End-to-End Multiple-Object

348 Jan 07, 2023
A containerized REST API around OpenAI's CLIP model.

OpenAI's CLIP — REST API This is a container wrapping OpenAI's CLIP model in a RESTful interface. Running the container locally First, build the conta

Santiago Valdarrama 48 Nov 06, 2022
Image inpainting using Gaussian Mixture Models

dmfa_inpainting Source code for: MisConv: Convolutional Neural Networks for Missing Data (to be published at WACV 2022) Estimating conditional density

Marcin Przewięźlikowski 8 Oct 09, 2022
Feedback is important: response-aware feedback mechanism for background based conversation

RFM The code for the paper: "Feedback is important: response-aware feedback mechanism for background based conversation." Requirements python 3.7 pyto

Jiatao Chen 2 Sep 29, 2022
PyTorch Implementations for DeeplabV3 and PSPNet

Pytorch-segmentation-toolbox DOC Pytorch code for semantic segmentation. This is a minimal code to run PSPnet and Deeplabv3 on Cityscape dataset. Shor

Zilong Huang 746 Dec 15, 2022
Deep Reinforcement Learning based autonomous navigation for quadcopters using PPO algorithm.

PPO-based Autonomous Navigation for Quadcopters This repository contains an implementation of Proximal Policy Optimization (PPO) for autonomous naviga

Bilal Kabas 16 Nov 11, 2022
The official implementation of Equalization Loss for Long-Tailed Object Recognition (CVPR 2020) based on Detectron2

Equalization Loss for Long-Tailed Object Recognition Jingru Tan, Changbao Wang, Buyu Li, Quanquan Li, Wanli Ouyang, Changqing Yin, Junjie Yan ⚠️ We re

Jingru Tan 197 Dec 25, 2022
Code release for NeuS

NeuS We present a novel neural surface reconstruction method, called NeuS, for reconstructing objects and scenes with high fidelity from 2D image inpu

Peng Wang 813 Jan 04, 2023
PaSST: Efficient Training of Audio Transformers with Patchout

PaSST: Efficient Training of Audio Transformers with Patchout This is the implementation for Efficient Training of Audio Transformers with Patchout Pa

165 Dec 26, 2022
Construct a neural network frame by Numpy

本项目的CSDN博客链接:https://blog.csdn.net/weixin_41578567/article/details/111482022 1. 概览 本项目主要用于神经网络的学习,通过基于numpy的实现,了解神经网络底层前向传播、反向传播以及各类优化器的原理。 该项目目前已实现的功

24 Jan 22, 2022
Pseudo-mask Matters in Weakly-supervised Semantic Segmentation

Pseudo-mask Matters in Weakly-supervised Semantic Segmentation By Yi Li, Zhanghui Kuang, Liyang Liu, Yimin Chen, Wayne Zhang SenseTime, Tsinghua Unive

33 Oct 14, 2022
This is the winning solution of the Endocv-2021 grand challange.

Endocv2021-winner [Paper] This is the winning solution of the Endocv-2021 grand challange. Dependencies pytorch # tested with 1.7 and 1.8 torchvision

Vajira Thambawita 14 Dec 03, 2022
[NeurIPS 2019] Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss

Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss Kaidi Cao, Colin Wei, Adrien Gaidon, Nikos Arechiga, Tengyu Ma This is the offi

Kaidi Cao 528 Jan 01, 2023
Learning from Guided Play: A Scheduled Hierarchical Approach for Improving Exploration in Adversarial Imitation Learning Source Code

Learning from Guided Play: A Scheduled Hierarchical Approach for Improving Exploration in Adversarial Imitation Learning Trevor Ablett*, Bryan Chan*,

STARS Laboratory 8 Sep 14, 2022
The implement of papar "Enhanced Graph Learning for Collaborative Filtering via Mutual Information Maximization"

SIGIR2021-EGLN The implement of paper "Enhanced Graph Learning for Collaborative Filtering via Mutual Information Maximization" Neural graph based Col

15 Dec 27, 2022
Code for the paper "Location-aware Single Image Reflection Removal"

Location-aware Single Image Reflection Removal The shown images are provided by the datasets from IBCLN, ERRNet, SIR2 and the Internet images. The cod

72 Dec 08, 2022
Robustness via Cross-Domain Ensembles

Robustness via Cross-Domain Ensembles [ICCV 2021, Oral] This repository contains tools for training and evaluating: Pretrained models Demo code Traini

Visual Intelligence & Learning Lab, Swiss Federal Institute of Technology (EPFL) 27 Dec 23, 2022
Neural Oblivious Decision Ensembles

Neural Oblivious Decision Ensembles A supplementary code for anonymous ICLR 2020 submission. What does it do? It learns deep ensembles of oblivious di

25 Sep 21, 2022