Source code for "FastBERT: a Self-distilling BERT with Adaptive Inference Time".

FastBERT

Good News

2021/10/29 - Code: The code of FastPLM has been released on both PyPI and GitHub.

2021/09/08 - Paper: The journal version of FastBERT (FastPLM), "An Empirical Study on Adaptive Inference for Pretrained Language Model", has been accepted by IEEE TNNLS.

2020/07/05 - Update: The PyPI version of FastBERT has been launched. Please see fastbert-pypi.

Install fastbert with pip

$ pip install fastbert

Requirements

Python >= 3.4.0. Install all the requirements with pip:

$ pip install -r requirements.txt

Quick start on the Chinese Book review dataset

Download the pre-trained Chinese BERT parameters from here, and save them to the models directory under the name "Chinese_base_model.bin".

Run the following command to validate FastBERT with Speed=0.5 on the Book review dataset.

$ CUDA_VISIBLE_DEVICES="0" python3 -u run_fastbert.py \
        --pretrained_model_path ./models/Chinese_base_model.bin \
        --vocab_path ./models/google_zh_vocab.txt \
        --train_path ./datasets/douban_book_review/train.tsv \
        --dev_path ./datasets/douban_book_review/dev.tsv \
        --test_path ./datasets/douban_book_review/test.tsv \
        --epochs_num 3 --batch_size 32 --distill_epochs_num 5 \
        --encoder bert --fast_mode --speed 0.5 \
        --output_model_path  ./models/douban_fastbert.bin

Meaning of each option:

usage: --pretrained_model_path Path to the pre-trained parameters used to initialize the model.
       --vocab_path Path to the vocabulary file.
       --train_path Path to the training dataset.
       --dev_path Path to the validation dataset.
       --test_path Path to the test dataset.
       --epochs_num Number of fine-tuning epochs.
       --batch_size Batch size.
       --distill_epochs_num Number of self-distillation epochs.
       --encoder The type of encoder.
       --fast_mode Whether to enable the fast mode of FastBERT.
       --speed The Speed value in the paper; a higher Speed lets more samples exit early.
       --output_model_path Path where the output model parameters are saved.
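The role of --speed can be illustrated with a minimal sketch. It assumes the rule described in the paper: each layer's classifier outputs class probabilities, the uncertainty is their entropy normalized by log(N), and a sample exits at the first layer whose uncertainty falls below Speed. The function names and the per_layer_probs input are hypothetical, and this is not the code in run_fastbert.py.

```python
import math


def normalized_entropy(probs):
    """Entropy of `probs` divided by log(N), so the result lies in [0, 1]."""
    n = len(probs)
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(n)


def exit_layer(per_layer_probs, speed):
    """Return the 1-based index of the layer at which a sample would exit.

    per_layer_probs: one list of class probabilities per classifier
    (hypothetical input; ordered from the lowest layer to the last layer).
    If no classifier is confident enough, the sample runs to the last layer.
    """
    for index, probs in enumerate(per_layer_probs, start=1):
        if normalized_entropy(probs) < speed:
            return index
    return len(per_layer_probs)


# Toy example with three classifiers and two classes.
toy_outputs = [[0.5, 0.5], [0.9, 0.1], [0.99, 0.01]]
print(exit_layer(toy_outputs, speed=0.5))
print(exit_layer(toy_outputs, speed=0.1))
```

Under this rule a larger Speed accepts less confident predictions, so samples leave the network earlier and fewer FLOPs are spent per sample.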

Test results on the Book review dataset.

Test results at fine-tuning epoch 3 (Baseline): Acc.=0.8688;  FLOPs=21785247744;
Test results at self-distillation epoch 1     : Acc.=0.8698;  FLOPs=6300902177;
Test results at self-distillation epoch 2     : Acc.=0.8691;  FLOPs=5844839008;
Test results at self-distillation epoch 3     : Acc.=0.8664;  FLOPs=5170940850;
Test results at self-distillation epoch 4     : Acc.=0.8664;  FLOPs=5170940327;
Test results at self-distillation epoch 5     : Acc.=0.8664;  FLOPs=5170940327;
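To read the FLOPs column as a speedup, divide the baseline FLOPs by the FLOPs after each self-distillation epoch. The sketch below uses only the numbers reported above; the variable and function names are illustrative.

```python
# FLOPs reported above for the Book review dataset.
baseline_flops = 21785247744
distilled_flops = {
    1: 6300902177,
    2: 5844839008,
    3: 5170940850,
    4: 5170940327,
    5: 5170940327,
}


def speedup(baseline, flops):
    """Ratio of baseline FLOPs to FLOPs with early exit enabled."""
    return baseline / flops


for epoch, flops in distilled_flops.items():
    ratio = speedup(baseline_flops, flops)
    print(f"self-distillation epoch {epoch}: {ratio:.2f}x fewer FLOPs")
```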

Quick start on the English Ag.news dataset

Download the pre-trained English BERT parameters from here, and save them to the models directory under the name "English_uncased_base_model.bin".

Download ag_news.zip from here, and then unzip it into the datasets directory.

Run the following command to validate FastBERT with Speed=0.5 on the Ag.news dataset.

$ CUDA_VISIBLE_DEVICES="0" python3 -u run_fastbert.py \
        --pretrained_model_path ./models/English_uncased_base_model.bin \
        --vocab_path ./models/google_uncased_en_vocab.txt \
        --train_path ./datasets/ag_news/train.tsv \
        --dev_path ./datasets/ag_news/test.tsv \
        --test_path ./datasets/ag_news/test.tsv \
        --epochs_num 3 --batch_size 32 --distill_epochs_num 5 \
        --encoder bert --fast_mode --speed 0.5 \
        --output_model_path  ./models/ag_news_fastbert.bin

Test results on the Ag.news dataset.

Test results at fine-tuning epoch 3 (Baseline): Acc.=0.9447;  FLOPs=21785247744;
Test results at self-distillation epoch 1     : Acc.=0.9308;  FLOPs=2172009009;
Test results at self-distillation epoch 2     : Acc.=0.9311;  FLOPs=2163471246;
Test results at self-distillation epoch 3     : Acc.=0.9314;  FLOPs=2108341649;
Test results at self-distillation epoch 4     : Acc.=0.9314;  FLOPs=2108341649;
Test results at self-distillation epoch 5     : Acc.=0.9314;  FLOPs=2108341649;
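Result lines in this format can be collected into records for comparison. The following is a minimal sketch that assumes the log lines look exactly like the ones shown above; the regular expression and helper name are illustrative and not part of run_fastbert.py.

```python
import re

# Matches lines such as:
# "Test results at self-distillation epoch 1     : Acc.=0.9308;  FLOPs=2172009009;"
LINE_RE = re.compile(
    r"Test results at (?P<stage>.+?)\s*:\s*"
    r"Acc\.=(?P<acc>[\d.]+);\s*FLOPs=(?P<flops>\d+);"
)


def parse_results(text):
    """Return one dict per result line with stage, accuracy and FLOPs."""
    records = []
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            records.append({
                "stage": match["stage"].strip(),
                "acc": float(match["acc"]),
                "flops": int(match["flops"]),
            })
    return records


log = """\
Test results at fine-tuning epoch 3 (Baseline): Acc.=0.9447;  FLOPs=21785247744;
Test results at self-distillation epoch 1     : Acc.=0.9308;  FLOPs=2172009009;
Test results at self-distillation epoch 3     : Acc.=0.9314;  FLOPs=2108341649;
"""

records = parse_results(log)
baseline = records[0]
for record in records[1:]:
    change = record["acc"] - baseline["acc"]
    print(f"{record['stage']}: accuracy change {change:+.4f}")
```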

Datasets

More datasets can be downloaded from here.

Other implementations

There are some other excellent implementations of FastBERT.

Acknowledgement

This work is funded by the 2019 Tencent Rhino-Bird Elite Training Program. Work done while this author was an intern at Tencent.

If you use this code, please cite this paper:

@inproceedings{weijie2020fastbert,
  title={{FastBERT}: a Self-distilling BERT with Adaptive Inference Time},
  author={Weijie Liu and Peng Zhou and Zhe Zhao and Zhiruo Wang and Haotang Deng and Qi Ju},
  booktitle={Proceedings of ACL 2020},
  year={2020}
}