Code for layerwise detection of linguistic anomaly paper (ACL 2021)

Overview

Layerwise Anomaly

This repository contains the source code and data for our ACL 2021 paper: "How is BERT surprised? Layerwise detection of linguistic anomalies" by Bai Li, Zining Zhu, Guillaume Thomas, Yang Xu, and Frank Rudzicz.

Citation

If you use our work in your research, please cite:

Li, B., Zhu, Z., Thomas, G., Xu, Y., and Rudzicz, F. (2021) How is BERT surprised? Layerwise detection of linguistic anomalies. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics (ACL).

@inproceedings{li2021layerwise,
  author = "Li, Bai and Zhu, Zining and Thomas, Guillaume and Xu, Yang and Rudzicz, Frank",
  title = "How is BERT surprised? Layerwise detection of linguistic anomalies",
  booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics (ACL)",
  publisher = "Association for Computational Linguistics",
  year = "2021",
}

Dependencies

The project was developed with the following library versions. Running with other versions may crash or produce incorrect results.

  • Python 3.7.5
  • CUDA Version: 11.0
  • torch==1.7.1
  • transformers==4.5.1
  • numpy==1.19.0
  • pandas==0.25.3
  • scikit-learn==0.22

Setup Instructions

  1. Clone this repo: git clone https://github.com/SPOClab-ca/layerwise-anomaly
  2. Download BNC Baby (4m word sample) from this link and extract into data/bnc/
  3. Run BNC preprocessing script: python scripts/process_bnc.py --bnc_dir=data/bnc/download/Texts --to=data/bnc.pkl
  4. Clone BLiMP repo: cd data && git clone https://github.com/alexwarstadt/blimp

GMM experiments on BLiMP (Figure 2 and Appendix A)

PYTHONPATH=. time python scripts/blimp_anomaly.py \
  --bnc_path=data/bnc.pkl \
  --blimp_path=data/blimp/data/ \
  --out=blimp_result

Frequency correlation (Figure 3 and Appendix B)

Run the notebooks/FreqSurprisal.ipynb notebook.

Surprisal gap experiments (Figure 4)

PYTHONPATH=. time python scripts/run_surprisal_gaps.py \
  --bnc_path=data/bnc.pkl \
  --out=surprisal_gaps

Accuracy scores (Table 2)

PYTHONPATH=. time python scripts/run_accuracy.py \
  --model_name=roberta-base \
  --anomaly_model=gmm

Run unit tests

PYTHONPATH=. pytest tests
pytorch implementation of GPV-Pose

GPV-Pose Pytorch implementation of GPV-Pose: Category-level Object Pose Estimation via Geometry-guided Point-wise Voting. (link) UPDATE A new version

40 Dec 01, 2022
Just playing with getting VQGAN+CLIP running locally, rather than having to use colab.

Just playing with getting VQGAN+CLIP running locally, rather than having to use colab.

Nerdy Rodent 2.3k Jan 04, 2023
Code for the paper "Unsupervised Contrastive Learning of Sound Event Representations", ICASSP 2021.

Unsupervised Contrastive Learning of Sound Event Representations This repository contains the code for the following paper. If you use this code or pa

Eduardo Fonseca 81 Dec 22, 2022
Pytorch implementation of MLP-Mixer with loading pre-trained models.

MLP-Mixer-Pytorch PyTorch implementation of MLP-Mixer: An all-MLP Architecture for Vision with the function of loading official ImageNet pre-trained p

Qiushi Yang 2 Sep 29, 2022
Distributionally robust neural networks for group shifts

Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization This code implements the g

151 Dec 25, 2022
This is the code for CVPR 2021 oral paper: Jigsaw Clustering for Unsupervised Visual Representation Learning

JigsawClustering Jigsaw Clustering for Unsupervised Visual Representation Learning Pengguang Chen, Shu Liu, Jiaya Jia Introduction This project provid

DV Lab 73 Sep 18, 2022
This is the repository for the AAAI 21 paper [Contrastive and Generative Graph Convolutional Networks for Graph-based Semi-Supervised Learning].

CG3 This is the repository for the AAAI 21 paper [Contrastive and Generative Graph Convolutional Networks for Graph-based Semi-Supervised Learning]. R

12 Oct 28, 2022
MM1 and MMC Queue Simulation using python - Results and parameters in excel and csv files

implementation of MM1 and MMC Queue on randomly generated data and evaluate simulation results then compare with analytical results and draw a plot curve for them, simulate some integrals and compare

Mohamadreza Rezaei 1 Jan 19, 2022
Scaling Vision with Sparse Mixture of Experts

Scaling Vision with Sparse Mixture of Experts This repository contains the code for training and fine-tuning Sparse MoE models for vision (V-MoE) on I

Google Research 290 Dec 25, 2022
Official repo for AutoInt: Automatic Integration for Fast Neural Volume Rendering in CVPR 2021

AutoInt: Automatic Integration for Fast Neural Volume Rendering CVPR 2021 Project Page | Video | Paper PyTorch implementation of automatic integration

Stanford Computational Imaging Lab 149 Dec 22, 2022
Experimenting with computer vision techniques to generate annotated image datasets from gameplay recordings automatically.

Experimenting with computer vision techniques to generate annotated image datasets from gameplay recordings automatically. The collected data will then be used to train a deep neural network that can

Martin Valchev 3 Apr 24, 2022
Second-Order Neural ODE Optimizer, NeurIPS 2021 spotlight

Second-order Neural ODE Optimizer (NeurIPS 2021 Spotlight) [arXiv] ✔️ faster convergence in wall-clock time | ✔️ O(1) memory cost | ✔️ better test-tim

Guan-Horng Liu 39 Oct 22, 2022
Codes for NeurIPS 2021 paper "Adversarial Neuron Pruning Purifies Backdoored Deep Models"

Adversarial Neuron Pruning Purifies Backdoored Deep Models Code for NeurIPS 2021 "Adversarial Neuron Pruning Purifies Backdoored Deep Models" by Dongx

Dongxian Wu 31 Dec 11, 2022
RNG-KBQA: Generation Augmented Iterative Ranking for Knowledge Base Question Answering

RNG-KBQA: Generation Augmented Iterative Ranking for Knowledge Base Question Answering Authors: Xi Ye, Semih Yavuz, Kazuma Hashimoto, Yingbo Zhou and

Salesforce 72 Dec 05, 2022
A repo to show how to use custom dataset to train s2anet, and change backbone to resnext101

A repo to show how to use custom dataset to train s2anet, and change backbone to resnext101

jedibobo 3 Dec 28, 2022
Wider or Deeper: Revisiting the ResNet Model for Visual Recognition

ademxapp Visual applications by the University of Adelaide In designing our Model A, we did not over-optimize its structure for efficiency unless it w

Zifeng Wu 338 Dec 12, 2022
Testbed of AI Systems Quality Management

qunomon Description A testbed for testing and managing AI system qualities. Demo Sorry. Not deployment public server at alpha version. Requirement Ins

AIST AIRC 15 Nov 27, 2021
Some useful blender add-ons for SMPL skeleton's poses and global translation.

Blender add-ons for SMPL skeleton's poses and trans There are two blender add-ons for SMPL skeleton's poses and trans.The first is for making an offli

犹在镜中 154 Jan 04, 2023
The undersampled DWI image using Slice-Interleaved Diffusion Encoding (SIDE) method can be reconstructed by the UNet network.

UNet-SIDE The undersampled DWI image using Slice-Interleaved Diffusion Encoding (SIDE) method can be reconstructed by the UNet network. For Super Reso

TIANTIAN XU 1 Jan 13, 2022
Code for "Learning to Segment Rigid Motions from Two Frames".

rigidmask Code for "Learning to Segment Rigid Motions from Two Frames". ** This is a partial release with inference and evaluation code.

Gengshan Yang 157 Nov 21, 2022