Code for SentiBERT: A Transferable Transformer-Based Architecture for Compositional Sentiment Semantics (ACL'2020).

Overview

SentiBERT

Code for SentiBERT: A Transferable Transformer-Based Architecture for Compositional Sentiment Semantics (ACL'2020). https://arxiv.org/abs/2005.04114

Model Architecture

Requirements

Environment

* Python == 3.6.10
* Pytorch == 1.1.0
* CUDA == 9.0.176
* NVIDIA GeForce GTX 1080 Ti
* HuggingFaces Pytorch (also known as pytorch-pretrained-bert & transformers)
* Stanford CoreNLP (stanford-corenlp-full-2018-10-05)
* Numpy, Pickle, Tqdm, Scipy, etc. (See requirements.txt)

Datasets

Datasets include:

* SST-phrase
* SST-5 (almost the same with SST-phrase)
* SST-3 (almost the same with SST-phrase)
* SST-2
* Twitter Sentiment Analysis (SemEval 2017 Task 4)
* EmoContext (SemEval 2019 Task 3)
* EmoInt (Joy, Fear, Sad, Anger) (SemEval 2018 Task 1c)

Note that there are no individual datasets for SST-5. When evaluating SST-phrase, the results for SST-5 should also appear.

File Architecture (Selected important files)

-- /examples/run_classifier_new.py                                  ---> start to train
-- /examples/run_classifier_dataset_utils_new.py                    ---> input preprocessed files to SentiBERT
-- /pytorch-pretrained-bert/modeling_new.py                         ---> detailed model architecture
-- /examples/lm_finetuning/pregenerate_training_data_sstphrase.py   ---> generate pretrained epochs
-- /examples/lm_finetuning/finetune_on_pregenerated_sstphrase.py    ---> pretrain on generated epochs
-- /preprocessing/xxx_st.py                                         ---> preprocess raw text and constituency tree
-- /datasets                                                        ---> datasets
-- /transformers (under construction)                               ---> RoBERTa part

Get Started

Preparing Environment

conda create -n sentibert python=3.6.10
conda activate sentibert

conda install pytorch==1.1.0 torchvision==0.3.0 cudatoolkit=9.0 -c pytorch

cd SentiBERT/

wget http://nlp.stanford.edu/software/stanford-corenlp-full-2018-10-05.zip
unzip stanford-corenlp-full-2018-10-05.zip

export PYTHONPATH=$PYTHONPATH:XX/SentiBERT/pytorch_pretrained_bert
export PYTHONPATH=$PYTHONPATH:XX/SentiBERT/
export PYTHONPATH=$PYTHONPATH:XX/

Preprocessing

  1. Split the raw text and golden labels of sentiment/emotion datasets into xxx_train\dev\test.txt and xxx_train\dev\test_label.npy, assuming that xxx represents task name.
  2. Obtain tree information. There are totally three situtations.
  • For tasks except SST-phrase, SST-2,3,5, put the files into xxx_train\test.txt files into /stanford-corenlp-full-2018-10-05/. To get binary sentiment constituency trees, please run
cd /stanford-corenlp-full-2018-10-05
java -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,parse,sentiment -file xxx_train\test.txt -outputFormat json -ssplit.eolonly true -tokenize.whitespace true

The tree information will be stored in /stanford-corenlp-full-2018-10-05/xxx_train\test.txt.json.

  • For SST-2, please use
cd /stanford-corenlp-full-2018-10-05
java -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,parse,sentiment -file sst2_train\dev_text.txt -outputFormat json -ssplit.eolonly true

The tree information will be stored in /stanford-corenlp-full-2018-10-05/sst2_train\dev_text.txt.json.

  • For SST-phrase and SST-3,5, the tree information was already stored in sstphrase_train\test.txt.
  1. Run /datasets/xxx/xxx_st.py to clean, and store the text and label information in xxx_train\dev\test_text_new.txt and xxx_label_train\dev\test.npy. It also transforms the tree structure into matrices /datasets/xxx/xxx_train\dev\test_span.npy and /datasets/xxx/xxx_train\dev\test_span_3.npy. The first matrix is used as the range of constituencies in the first layer of our attention mechanism. The second matrix is used as the indices of each constituency's children nodes or subwords and itself in the second layer. Specifically, for tasks other than EmoInt, SST-phrase, SST-5 and SST-3, the command is like below:
cd /preprocessing

python xxx_st.py \
        --data_dir /datasets/xxx/ \                         ---> the location where you want to store preprocessed text, label and tree information 
        --tree_dir /stanford-corenlp-full-2018-10-05/ \     ---> the location of unpreprocessed tree information (usually in Stanford CoreNLP repo)
        --stage train \                                     ---> "train", "test" or "dev"

For EmoInt, the command is shown below:

cd /preprocessing

python xxx_st.py \
        --data_dir /datasets/xxx/ \                         ---> the location where you want to store preprocessed text, label and tree information 
        --tree_dir /stanford-corenlp-full-2018-10-05/ \     ---> the location of unpreprocessed tree information (usually in Stanford CoreNLP repo)
        --stage train \                                     ---> "train" or "test"
        --domain joy                                        ---> "joy", "sad", "fear" or "anger". Used in EmoInt task

For SST-phrase, SST-5 and SST-3, since they already have tree information in sstphrase_train\test.txt. In this case, tree_dir should be /datasets/sstphrase/ or /datasets/sst-3/. The command is shown below:

cd /preprocessing

python xxx_st.py \
        --data_dir /datasets/xxx/ \                         ---> the location where you want to store preprocessed text, label and tree information 
        --tree_dir /datasets/xxx/ \                         ---> the location of unpreprocessed tree information    
        --stage train \                                     ---> "train" or "test"

Pretraining

  1. Generate epochs for preparation
cd /examples/lm_finetuning

python3 pregenerate_training_data_sstphrase.py \
        --train_corpus /datasets/sstphrase/sstphrase_train_text_new.txt \
        --data_dir /datasets/sstphrase/ \
        --bert_model bert-base-uncased \
        --do_lower_case \
        --output_dir /training_sstphrase \
        --epochs_to_generate 3 \
        --max_seq_len 128 \
  1. Pretrain the generated epochs
CUDA_VISIBLE_DEVICES=7 python3 finetune_on_pregenerated_sstphrase.py \
        --pregenerated_data /training_sstphrase \
        --bert_model bert-base-uncased \
        --do_lower_case \
        --output_dir /results/sstphrase_pretrain \
        --epochs 3

The pre-trained parameters were released here. [Google Drive]

Fine-tuning

Run run_classifier_new.py directly as follows:

cd /examples

CUDA_VISIBLE_DEVICES=7 python run_classifier_new.py \
  --task_name xxx \                              ---> task name
  --do_train \
  --do_eval \
  --do_lower_case \
  --data_dir /datasets/xxx \                     ---> the same name as task_name
  --pretrain_dir /results/sstphrase_pretrain \   ---> the location of pre-trained parameters
  --bert_model bert-base-uncased \
  --max_seq_length 128 \
  --train_batch_size xxx \
  --learning_rate xxx \
  --num_train_epochs xxx \                                                          
  --domain xxx \                                 ---> "joy", "sad", "fear" or "anger". Used in EmoInt task
  --output_dir /results/xxx \                    ---> the same name as task_name
  --seed xxx \
  --para xxx                                     ---> "sentibert" or "bert": pretrained SentiBERT or BERT

Checkpoints

For reproducity and usability, we provide checkpoints and the original training settings to help you reproduce: Link of overall result folder: [Google Drive]

The implementation details and results are shown below:

Note: 1) BERT denotes BERT w/ Mean pooling. 2) The results of subtasks in EmoInt is (Joy: 68.90, 65.18, 4 epochs), (Anger: 68.17, 66.73, 4 epochs), (Sad: 66.25, 63.08, 5 epochs), (Fear: 65.49, 64.79, 5 epochs), respectively.

Models Batch Size Learning Rate Epochs Seed Results
SST-phrase
SentiBERT 32 2e-5 5 30 **68.98**
BERT* 32 2e-5 5 30 65.22
SST-5
SentiBERT 32 2e-5 5 30 **56.04**
BERT* 32 2e-5 5 30 50.23
SST-2
SentiBERT 32 2e-5 1 30 **93.25**
BERT 32 2e-5 1 30 92.08
SST-3
SentiBERT 32 2e-5 5 77 **77.34**
BERT* 32 2e-5 5 77 73.35
EmoContext
SentiBERT 32 2e-5 1 0 **74.47**
BERT 32 2e-5 1 0 73.64
EmoInt
SentiBERT 16 2e-5 4 or 5 77 **67.20**
BERT 16 2e-5 4 or 5 77 64.95
Twitter
SentiBERT 32 6e-5 1 45 **70.2**
BERT 32 6e-5 1 45 69.7

Analysis

Here we provide analysis implementation in our paper. We will focus on the evaluation of

  • local difficulty
  • global difficulty
  • negation
  • contrastive relation

In preprocessing part, we provide implementation to extract related information in the test set of SST-phrase and store them in

-- /datasets/sstphrase/swap_test_new.npy                   ---> global difficulty
-- /datasets/sstphrase/edge_swap_test_new.npy              ---> local difficulty
-- /datasets/sstphrase/neg_new.npy                         ---> negation
-- /datasets/sstphrase/but_new.npy                         ---> contrastive relation

In simple_accuracy_phrase(), we will provide statistical details and evaluate for each metric.

Some of the analysis results based on our provided checkpoints are selected and shown below:

Models Results
Local Difficulty
SentiBERT **[85.39, 60.80, 49.40]**
BERT* [83.00, 55.54, 31.97]
Negation
SentiBERT **[78.45, 76.25, 70.56]**
BERT* [75.04, 71.40, 68.77]
Contrastive Relation
SentiBERT **39.87**
BERT* 28.48

Acknowledgement

Here we would like to thank for BERT/RoBERTa implementation of HuggingFace and sentiment tree parser of Stanford CoreNLP. Also, thanks for the dataset release of SemEval. To confirm the privacy rule of SemEval task organizer, we only choose the publicable datasets of each task.

Citation

Please cite our ACL paper if this repository inspired your work.

@inproceedings{yin2020sentibert,
  author    = {Yin, Da and Meng, Tao and Chang, Kai-Wei},
  title     = {{SentiBERT}: A Transferable Transformer-Based Architecture for Compositional Sentiment Semantics},
  booktitle = {Proceedings of the 58th Conference of the Association for Computational Linguistics, {ACL} 2020, Seattle, USA},
  year      = {2020},
}

Contact

  • Due to the difference of environment, the results will be a bit different. If you have any questions regarding the code, please create an issue or contact the owner of this repository.
Owner
Da Yin
Da Yin
Probabilistic-Monocular-3D-Human-Pose-Estimation-with-Normalizing-Flows

Probabilistic-Monocular-3D-Human-Pose-Estimation-with-Normalizing-Flows This is the official implementation of the ICCV 2021 Paper "Probabilistic Mono

62 Nov 23, 2022
Project looking into use of autoencoder for semi-supervised learning and comparing data requirements compared to supervised learning.

Project looking into use of autoencoder for semi-supervised learning and comparing data requirements compared to supervised learning.

Tom-R.T.Kvalvaag 2 Dec 17, 2021
This is code of book "Learn Deep Learning with PyTorch"

深度学习入门之PyTorch Learn Deep Learning with PyTorch 非常感谢您能够购买此书,这个github repository包含有深度学习入门之PyTorch的实例代码。由于本人水平有限,在写此书的时候参考了一些网上的资料,在这里对他们表示敬意。由于深度学习的技术在

Xingyu Liao 2.5k Jan 04, 2023
A simple API wrapper for Discord interactions.

Your ultimate Discord interactions library for discord.py. About | Installation | Examples | Discord | PyPI About What is discord-py-interactions? dis

james 641 Jan 03, 2023
Repo for paper "Dynamic Placement of Rapidly Deployable Mobile Sensor Robots Using Machine Learning and Expected Value of Information"

Repo for paper "Dynamic Placement of Rapidly Deployable Mobile Sensor Robots Using Machine Learning and Expected Value of Information" Notes I probabl

Berkeley Expert System Technologies Lab 0 Jul 01, 2021
Racing line optimization algorithm in python that uses Particle Swarm Optimization.

Racing Line Optimization with PSO This repository contains a racing line optimization algorithm in python that uses Particle Swarm Optimization. Requi

Parsa Dahesh 6 Dec 14, 2022
Codes for TIM2021 paper "Anchor-Based Spatio-Temporal Attention 3-D Convolutional Networks for Dynamic 3-D Point Cloud Sequences"

Codes for TIM2021 paper "Anchor-Based Spatio-Temporal Attention 3-D Convolutional Networks for Dynamic 3-D Point Cloud Sequences"

Intelligent Robotics and Machine Vision Lab 4 Jul 19, 2022
Annotated notes and summaries of the TensorFlow white paper, along with SVG figures and links to documentation

TensorFlow White Paper Notes Features Notes broken down section by section, as well as subsection by subsection Relevant links to documentation, resou

Sam Abrahams 437 Oct 09, 2022
Everything's Talkin': Pareidolia Face Reenactment (CVPR2021)

Everything's Talkin': Pareidolia Face Reenactment (CVPR2021) Linsen Song, Wayne Wu, Chaoyou Fu, Chen Qian, Chen Change Loy, and Ran He [Paper], [Video

71 Dec 21, 2022
Code for SyncTwin: Treatment Effect Estimation with Longitudinal Outcomes (NeurIPS 2021)

SyncTwin: Treatment Effect Estimation with Longitudinal Outcomes (NeurIPS 2021) SyncTwin is a treatment effect estimation method tailored for observat

Zhaozhi Qian 3 Nov 03, 2022
U^2-Net - Portrait matting This repository explores possibilities of using the original u^2-net model for portrait matting.

U^2-Net - Portrait matting This repository explores possibilities of using the original u^2-net model for portrait matting.

Dennis Bappert 104 Nov 25, 2022
[ICLR2021oral] Rethinking Architecture Selection in Differentiable NAS

DARTS-PT Code accompanying the paper ICLR'2021: Rethinking Architecture Selection in Differentiable NAS Ruochen Wang, Minhao Cheng, Xiangning Chen, Xi

Ruochen Wang 86 Dec 27, 2022
Simulation code and tutorial for BBHnet training data

Simulation Dataset for BBHnet NOTE: OLD README, UPDATE IN PROGRESS We generate simulation dataset to train BBHnet, our deep learning framework for det

0 May 31, 2022
State-to-Distribution (STD) Model

State-to-Distribution (STD) Model In this repository we provide exemplary code on how to construct and evaluate a state-to-distribution (STD) model fo

<a href=[email protected]"> 2 Apr 07, 2022
learning and feeling SLAM together with hands-on-experiments

modern-slam-tutorial-python Learning and feeling SLAM together with hands-on-experiments 😀 😃 😆 Dependencies Most of the examples are based on GTSAM

Giseop Kim 59 Dec 22, 2022
The author's officially unofficial PyTorch BigGAN implementation.

BigGAN-PyTorch The author's officially unofficial PyTorch BigGAN implementation. This repo contains code for 4-8 GPU training of BigGANs from Large Sc

Andy Brock 2.6k Jan 02, 2023
A real-time motion capture system that estimates poses and global translations using only 6 inertial measurement units

TransPose Code for our SIGGRAPH 2021 paper "TransPose: Real-time 3D Human Translation and Pose Estimation with Six Inertial Sensors". This repository

Xinyu Yi 261 Dec 31, 2022
EfficientMPC - Efficient Model Predictive Control Implementation

efficientMPC Efficient Model Predictive Control Implementation The original algo

Vin 8 Dec 04, 2022
Neural Ensemble Search for Performant and Calibrated Predictions

Neural Ensemble Search Introduction This repo contains the code accompanying the paper: Neural Ensemble Search for Performant and Calibrated Predictio

AutoML-Freiburg-Hannover 26 Dec 12, 2022
render sprites into your desktop environment as shaped windows using GTK

spritegtk render static or animated sprites into your desktop environment as dynamic shaped windows using GTK requires pycairo and PYGobject: pip inst

hermit 20 Oct 27, 2022