p-tuning for few-shot NLU task

Overview

p-tuning_NLU

Overview

这个小项目是受乐于分享的苏剑林大佬这篇p-tuning 文章启发,也实现了个使用P-tuning进行NLU分类的任务, 思路是一样的,prompt实现方式有不同,这里是将[unused*]的embeddings参数抽取出用于初始化prompt_embed后,再接一个lstm和mlp用于关联各prompt, 与最初p-tuning提出《GPT Understands, Too》 的实现一样,结果显示在few-shot上p-tuning非常接近finetune效果。

Dataset

数据是情感分类,下载地址百度网盘 提取码:osja

Evaluation

1. finetune

python few_shot_finetune.py

测试集效果:

epoch: 0 - acc: 0.897679 - best_test_acc: 0.8976788252013264
epoch: 1 - acc: 0.876362 - best_test_acc: 0.8976788252013264
epoch: 2 - acc: 0.884889 - best_test_acc: 0.8976788252013264
epoch: 3 - acc: 0.884415 - best_test_acc: 0.8976788252013264
epoch: 4 - acc: 0.884415 - best_test_acc: 0.8976788252013264

全量参数对小样本进行finetune,仅1个epoch就收敛了

2. p-tuning

python few_shot_ptuning.py

测试集效果:

epoch: 0 - acc: 0.546660 - best_test_acc: 0.5466603505447655
epoch: 1 - acc: 0.687826 - best_test_acc: 0.6878256750355282
epoch: 2 - acc: 0.737091 - best_test_acc: 0.7370914258645191
epoch: 3 - acc: 0.722406 - best_test_acc: 0.7370914258645191
epoch: 4 - acc: 0.776883 - best_test_acc: 0.7768829938417812
epoch: 5 - acc: 0.805306 - best_test_acc: 0.8053055423969683
epoch: 6 - acc: 0.833254 - best_test_acc: 0.8332543818095689
epoch: 7 - acc: 0.837991 - best_test_acc: 0.8379914732354334
epoch: 8 - acc: 0.854571 - best_test_acc: 0.8545712932259593
epoch: 9 - acc: 0.858361 - best_test_acc: 0.8583609663666508
epoch: 10 - acc: 0.856466 - best_test_acc: 0.8583609663666508
epoch: 11 - acc: 0.853150 - best_test_acc: 0.8583609663666508
epoch: 12 - acc: 0.868783 - best_test_acc: 0.8687825675035529
epoch: 13 - acc: 0.877309 - best_test_acc: 0.877309332070109
epoch: 14 - acc: 0.873993 - best_test_acc: 0.877309332070109
epoch: 15 - acc: 0.877783 - best_test_acc: 0.8777830412126955
epoch: 16 - acc: 0.882994 - best_test_acc: 0.8829938417811464
epoch: 17 - acc: 0.881573 - best_test_acc: 0.8829938417811464
epoch: 18 - acc: 0.889626 - best_test_acc: 0.8896257697773567
epoch: 19 - acc: 0.877783 - best_test_acc: 0.8896257697773567

仅prompt_embed和lstm及mlp去做p-tuning,20个epoch后接近收敛,acc=0.8896,略小于finetun的acc 0.8977

附上苏神结果对比:

img

Sequence model architectures from scratch in PyTorch

This repository implements a variety of sequence model architectures from scratch in PyTorch. Effort has been put to make the code well structured so that it can serve as learning material. The train

Brando Koch 11 Mar 28, 2022
Mesh TensorFlow: Model Parallelism Made Easier

Mesh TensorFlow - Model Parallelism Made Easier Introduction Mesh TensorFlow (mtf) is a language for distributed deep learning, capable of specifying

1.3k Dec 26, 2022
A flask application to predict the speech emotion of any .wav file.

This is a speech emotion recognition app. It will allow you to train a modular MLP model with the RAVDESS dataset, and then use that model with a flask application to predict the speech emotion of an

Aryan Vijaywargia 2 Dec 15, 2021
A PyTorch implementation of paper "Learning Shared Semantic Space for Speech-to-Text Translation", ACL (Findings) 2021

Chimera: Learning Shared Semantic Space for Speech-to-Text Translation This is a Pytorch implementation for the "Chimera" paper Learning Shared Semant

Chi Han 43 Dec 28, 2022
This repository contains Python scripts for extracting linguistic features from Filipino texts.

Filipino Text Linguistic Feature Extractors This repository contains scripts for extracting linguistic features from Filipino texts. The scripts were

Joseph Imperial 1 Oct 05, 2021
Community and sentiment analysis based on tweets

The project has set itself the goal of analyzing the thoughts and interaction of Italian users through the social posts expressed through the Twitter platform on the day of the entry into force of th

3 Nov 17, 2022
Stanford CoreNLP provides a set of natural language analysis tools written in Java

Stanford CoreNLP Stanford CoreNLP provides a set of natural language analysis tools written in Java. It can take raw human language text input and giv

Stanford NLP 8.8k Jan 07, 2023
LegalNLP - Natural Language Processing Methods for the Brazilian Legal Language

LegalNLP - Natural Language Processing Methods for the Brazilian Legal Language ⚖️ The library of Natural Language Processing for Brazilian legal lang

Felipe Maia Polo 125 Dec 20, 2022
State of the Art Natural Language Processing

Spark NLP: State of the Art Natural Language Processing Spark NLP is a Natural Language Processing library built on top of Apache Spark ML. It provide

John Snow Labs 3k Jan 05, 2023
A fast, efficient universal vector embedding utility package.

Magnitude: a fast, simple vector embedding utility library A feature-packed Python package and vector storage file format for utilizing vector embeddi

Plasticity 1.5k Jan 02, 2023
Fast, DB Backed pretrained word embeddings for natural language processing.

Embeddings Embeddings is a python package that provides pretrained word embeddings for natural language processing and machine learning. Instead of lo

Victor Zhong 212 Nov 21, 2022
Rootski - Full codebase for rootski.io (without the data)

📣 Welcome to the Rootski codebase! This is the codebase for the application run

Eric 20 Nov 18, 2022
Chinese Named Entity Recognization (BiLSTM with PyTorch)

BiLSTM-CRF for Name Entity Recognition PyTorch version A PyTorch implemention of Bi-LSTM-CRF model for Chinese Named Entity Recognition. 使用 PyTorch 实现

5 Jun 01, 2022
AMUSE - financial summarization

AMUSE AMUSE - financial summarization Unzip data.zip Train new model: python FinAnalyze.py --task train --start 0 --count how many files,-1 for all

1 Jan 11, 2022
Grover is a model for Neural Fake News -- both generation and detectio

Grover is a model for Neural Fake News -- both generation and detection. However, it probably can also be used for other generation tasks.

Rowan Zellers 856 Dec 24, 2022
Python powered crossword generator with database with 20k+ polish words

crossword_generator Generate simple crossword puzzle from words and definitions fetched from krzyżowki.edu.pl endpoints -/ string:word - returns js

0 Jan 04, 2022
[EMNLP 2021] Mirror-BERT: Converting Pretrained Language Models to universal text encoders without labels.

[EMNLP 2021] Mirror-BERT: Converting Pretrained Language Models to universal text encoders without labels.

Cambridge Language Technology Lab 61 Dec 10, 2022
This repo contains simple to use, pretrained/training-less models for speaker diarization.

PyDiar This repo contains simple to use, pretrained/training-less models for speaker diarization. Supported Models Binary Key Speaker Modeling Based o

12 Jan 20, 2022
Diaformer: Automatic Diagnosis via Symptoms Sequence Generation

Diaformer Diaformer: Automatic Diagnosis via Symptoms Sequence Generation (AAAI 2022) Diaformer is an efficient model for automatic diagnosis via symp

Junying Chen 20 Dec 13, 2022
Arabic speech recognition, classification and text-to-speech.

klaam Arabic speech recognition, classification and text-to-speech using many advanced models like wave2vec and fastspeech2. This repository allows tr

ARBML 177 Dec 27, 2022