A pre-trained model with multi-exit transformer architecture.

Overview

ElasticBERT

This repository contains finetuning code and checkpoints for ElasticBERT.

Towards Efficient NLP: A Standard Evaluation and A Strong Baseline

Xiangyang Liu, Tianxiang Sun, Junliang He, Lingling Wu, Xinyu Zhang, Hao Jiang, Zhao Cao, Xuanjing Huang, Xipeng Qiu

Requirements

We recommend using Anaconda for setting up the environment of experiments:

conda create -n elasticbert python=3.8.8
conda activate elasticbert
conda install pytorch==1.8.1 cudatoolkit=11.1 -c pytorch -c conda-forge
pip install -r requirements.txt

Pre-trained Models

We provide the pre-trained weights of ElasticBERT-BASE and ElasticBERT-LARGE, which can be directly used in Huggingface-Transformers.

  • ElasticBERT-BASE: 12 layers, 12 Heads and 768 Hidden Size.
  • ElasticBERT-LARGE: 24 layers, 16 Heads and 1024 Hidden Size.

The pre-trained weights can be downloaded here.

Model MODEL_NAME
ElasticBERT-BASE fnlp/elasticbert-base
ElasticBERT-LARGE fnlp/elasticbert-large

Downstream task datasets

The GLUE task datasets can be downloaded from the GLUE leaderboard

The ELUE task datasets can be downloaded from the ELUE leaderboard

Finetuning in static usage

We provide the finetuning code for both GLUE tasks and ELUE tasks in static usage on ElasticBERT.

For GLUE:

cd finetune-static
bash finetune_glue.sh

For ELUE:

cd finetune-static
bash finetune_elue.sh

Finetuning in dynamic usage

We provide finetuning code to apply two kind of early exiting methods on ElasticBERT.

For early exit using entropy criterion:

cd finetune-dynamic
bash finetune_elue_entropy.sh

For early exit using patience criterion:

cd finetune-dynamic
bash finetune_elue_patience.sh

Please see our paper for more details!

Contact

If you have any problems, raise an issue or contact Xiangyang Liu

Citation

If you find this repo helpful, we'd appreciate it a lot if you can cite the corresponding paper:

@article{liu2021elasticbert,
  author    = {Xiangyang Liu and
               Tianxiang Sun and
               Junliang He and
               Lingling Wu and
               Xinyu Zhang and
               Hao Jiang and
               Zhao Cao and
               Xuanjing Huang and
               Xipeng Qiu},
  title     = {Towards Efficient {NLP:} {A} Standard Evaluation and {A} Strong Baseline},
  journal   = {CoRR},
  volume    = {abs/2110.07038},
  year      = {2021},
  url       = {https://arxiv.org/abs/2110.07038},
  eprinttype = {arXiv},
  eprint    = {2110.07038},
  timestamp = {Fri, 22 Oct 2021 13:33:09 +0200},
  biburl    = {https://dblp.org/rec/journals/corr/abs-2110-07038.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}
Owner
fastNLP
由复旦大学的自然语言处理(NLP)团队发起的国产自然语言处理开源项目
fastNLP
Database Reasoning Over Text project for ACL paper

Database Reasoning over Text This repository contains the code for the Database Reasoning Over Text paper, to appear at ACL2021. Work is performed in

Facebook Research 320 Dec 12, 2022
ACL'2021: LM-BFF: Better Few-shot Fine-tuning of Language Models

LM-BFF (Better Few-shot Fine-tuning of Language Models) This is the implementation of the paper Making Pre-trained Language Models Better Few-shot Lea

Princeton Natural Language Processing 607 Jan 07, 2023
Code for "Searching for Efficient Multi-Stage Vision Transformers"

Searching for Efficient Multi-Stage Vision Transformers This repository contains the official Pytorch implementation of "Searching for Efficient Multi

Yi-Lun Liao 62 Oct 25, 2022
EFENet: Reference-based Video Super-Resolution with Enhanced Flow Estimation

EFENet EFENet: Reference-based Video Super-Resolution with Enhanced Flow Estimation Code is a bit messy now. I woud clean up soon. For training the EF

Yaping Zhao 19 Nov 05, 2022
[CVPR'22] Official PyTorch Implementation of Collaborative Transformers for Grounded Situation Recognition

[CVPR'22] Collaborative Transformers for Grounded Situation Recognition Paper | Model Checkpoint This is the official PyTorch implementation of Collab

Junhyeong Cho 29 Dec 10, 2022
Inflated i3d network with inception backbone, weights transfered from tensorflow

I3D models transfered from Tensorflow to PyTorch This repo contains several scripts that allow to transfer the weights from the tensorflow implementat

Yana 479 Dec 08, 2022
Deep Reinforcement Learning based autonomous navigation for quadcopters using PPO algorithm.

PPO-based Autonomous Navigation for Quadcopters This repository contains an implementation of Proximal Policy Optimization (PPO) for autonomous naviga

Bilal Kabas 16 Nov 11, 2022
Trained on Simulated Data, Tested in the Real World

Trained on Simulated Data, Tested in the Real World

livox 43 Nov 18, 2022
Imaginaire - NVIDIA's Deep Imagination Team's PyTorch Library

Imaginaire Docs | License | Installation | Model Zoo Imaginaire is a pytorch library that contains optimized implementation of several image and video

NVIDIA Research Projects 3.6k Dec 29, 2022
Directed Greybox Fuzzing with AFL

AFLGo: Directed Greybox Fuzzing AFLGo is an extension of American Fuzzy Lop (AFL). Given a set of target locations (e.g., folder/file.c:582), AFLGo ge

380 Nov 24, 2022
Serverless proxy for Spark cluster

Hydrosphere Mist Hydrosphere Mist is a serverless proxy for Spark cluster. Mist provides a new functional programming framework and deployment model f

hydrosphere.io 317 Dec 01, 2022
Scripts of Machine Learning Algorithms from Scratch. Implementations of machine learning models and algorithms using nothing but NumPy with a focus on accessibility. Aims to cover everything from basic to advance.

Algo-ScriptML Python implementations of some of the fundamental Machine Learning models and algorithms from scratch. The goal of this project is not t

Algo Phantoms 81 Nov 26, 2022
[SIGGRAPH 2022 Journal Track] AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars

AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars Fangzhou Hong1*  Mingyuan Zhang1*  Liang Pan1  Zhongang Cai1,2,3  Lei Yang2 

Fangzhou Hong 749 Jan 04, 2023
Fusion-in-Decoder Distilling Knowledge from Reader to Retriever for Question Answering

This repository contains code for: Fusion-in-Decoder models Distilling Knowledge from Reader to Retriever Dependencies Python 3 PyTorch (currently tes

Meta Research 323 Dec 19, 2022
Code for Fold2Seq paper from ICML 2021

[ICML2021] Fold2Seq: A Joint Sequence(1D)-Fold(3D) Embedding-based Generative Model for Protein Design Environment file: environment.yml Data and Feat

International Business Machines 43 Dec 04, 2022
A toolkit for developing and comparing reinforcement learning algorithms.

Status: Maintenance (expect bug fixes and minor updates) OpenAI Gym OpenAI Gym is a toolkit for developing and comparing reinforcement learning algori

OpenAI 29.6k Jan 08, 2023
Automatic deep learning for image classification.

AutoDL AutoDL automates machine learning tasks enabling you to easily achieve strong predictive performance in your applications. With just a few line

wenqi 2 Oct 12, 2022
PyTorch reimplementation of hand-biomechanical-constraints (ECCV2020)

Hand Biomechanical Constraints Pytorch Unofficial PyTorch reimplementation of Hand-Biomechanical-Constraints (ECCV2020). This project reimplement foll

Hao Meng 59 Dec 20, 2022
Deep Learning Tutorial for Kaggle Ultrasound Nerve Segmentation competition, using Keras

Deep Learning Tutorial for Kaggle Ultrasound Nerve Segmentation competition, using Keras This tutorial shows how to use Keras library to build deep ne

Marko Jocić 922 Dec 19, 2022
Data Consistency for Magnetic Resonance Imaging

Data Consistency for Magnetic Resonance Imaging Data Consistency (DC) is crucial for generalization in multi-modal MRI data and robustness in detectin

Dimitris Karkalousos 19 Dec 12, 2022