Label Semantic Aware Pre-training for Few-shot Text Classification
2022-07-03 10:11:00 【InfoQ】

brief introduction
Pictures and articles
The whole structure


pipeline
- Gold data: an unpublished benchmark dataset plus a public dataset. Gold data is annotated manually, so its label quality is higher.
- Silver data: also a public dataset, but labeled heuristically.
- Bronze data: labeled data is expensive and scarce, so the paper also derives pre-training data from a large amount of unlabeled data. Figure 2 shows the framework used.
- Dialogue intent filter: a RoBERTa-based classifier that splits utterances into positive and negative examples, because not every utterance carries an intent. For example, "It's a beautiful day." has no intent, whereas "It's a beautiful day, so I want to go to the park." has a clear one. Attaching an intent label to an utterance that has no intent would hurt pre-training and the downstream tasks, so utterances without intent are removed.
- Intent generator: unlabeled data has no intent labels, so a T5-based intent generator produces an intent for each remaining utterance.
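The bronze-data pipeline above (filter first, then generate) can be sketched as follows. This is a minimal illustration, not the paper's implementation: `build_bronze_pairs`, `toy_filter`, and `toy_generator` are hypothetical names, and the two toy functions are simple stand-ins for the RoBERTa-based filter and the T5-based generator.

```python
from typing import Callable, Iterable, List, Tuple


def build_bronze_pairs(
    utterances: Iterable[str],
    is_intentful: Callable[[str], bool],
    generate_intent: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Drop utterances without intent, then pair the rest with a generated label."""
    pairs = []
    for utterance in utterances:
        if not is_intentful(utterance):
            continue  # non-intentful utterances never receive a label
        pairs.append((utterance, generate_intent(utterance)))
    return pairs


# Hypothetical stand-ins for the two models (assumptions, for illustration only).
def toy_filter(utterance: str) -> bool:
    return "want to" in utterance.lower()


def toy_generator(utterance: str) -> str:
    return "go to park" if "park" in utterance.lower() else "unknown intent"


bronze = build_bronze_pairs(
    [
        "It's a beautiful day.",
        "It's a beautiful day, so I want to go to the park.",
    ],
    toy_filter,
    toy_generator,
)
print(bronze)
```

Passing the two models in as callables keeps the ordering explicit: the generator is only ever called on utterances the filter has accepted.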
Pre-training formats


- The first is random span denoising: parts of the sentence are masked at random, and T5 generates the masked content.
- The second is intent classification, which mirrors the downstream task: a sentence is the input, and the model outputs its intent as a natural-language label.
- The last is label denoising: the input sequence consists of a sentence and its label, but the label is masked out, and the model must predict the masked content.
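The three formats above can be illustrated as text-to-text (input, target) pairs. This is a sketch under stated assumptions: the function names are hypothetical, the `<extra_id_N>` sentinel follows the T5 tokenizer convention, and the exact templates used in the paper may differ.

```python
import random
from typing import Tuple

SENTINEL = "<extra_id_{}>"  # T5-style sentinel token format


def random_span_denoising(
    utterance: str, rng: random.Random, span_len: int = 2
) -> Tuple[str, str]:
    """Mask one random span of whitespace tokens; the target restores it."""
    tokens = utterance.split()
    if len(tokens) <= span_len:
        raise ValueError("utterance too short to mask a span")
    start = rng.randrange(0, len(tokens) - span_len + 1)
    masked = tokens[:start] + [SENTINEL.format(0)] + tokens[start + span_len:]
    target = [SENTINEL.format(0)] + tokens[start:start + span_len] + [SENTINEL.format(1)]
    return " ".join(masked), " ".join(target)


def intent_classification(utterance: str, intent: str) -> Tuple[str, str]:
    """Input is the utterance alone; the target is the natural-language label."""
    return utterance, intent


def label_denoising(utterance: str, intent: str) -> Tuple[str, str]:
    """Input is the utterance followed by a masked label; the target is the label."""
    return f"{utterance} {SENTINEL.format(0)}", intent


example = ("I want to go to the park", "visit park")
print(random_span_denoising(example[0], random.Random(0)))
print(intent_classification(*example))
print(label_denoising(*example))
```

In all three cases the label text appears only on the target side or behind a mask, so the model has to generate it rather than copy it.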
Experimental design

Fine-tuning
baselines:
- XLNet
- LM-BFF
- seq2seq-PTR
- T5
- T5(adapt)
Add

summary
Introduction
motivation
- Pre-trained models are typically used to encode the input efficiently, but little work gives the model access to informative representations of the labels.
- Other work uses label semantics only in the fine-tuning and prediction stages.
- "Gold" and "silver" data are scarce.
contribution
- Incorporate label semantics into a generative model during pre-training.
- Create utterance-intent pairs from noisy unlabeled data for label semantic aware pre-training. (This handles the "bronze" data: intent labels are generated for unlabeled text.)
- Achieve state-of-the-art results on intent and topic classification datasets.
Approach
data:
- Gold data: unpublished datasets + PolyAI Banking. This is data with labels.
- Silver data: the heuristically labeled dataset WikiHow. This is data with heuristically derived labels.
- Bronze data: pseudo-labeled data, that is, utterance-intent pairs created from unlabeled data.
Processing of unlabeled data:
- Dialogue filter:
- Not all utterances are intentful (goal, intent).
- To keep non-intentful utterances from being given an intent label, which would create harmful data that affects downstream tasks, utterances are first classified into two classes ("non-intentful/negative" and "intentful/positive").
- A RoBERTa-based dialogue classifier is fine-tuned on Multi-Domain Goal-Oriented Dialogue (MultiDoGO) and Schema-guided Dialogue (SGD).
- Intent generator:
- T5 is fine-tuned on the gold and silver data, and the filtered utterances are then fed to it to generate intent labels. 37% of the labels it generated do not appear in the training set.
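A figure such as the 37% above can be computed by comparing the generated labels against the labels seen in training. The sketch below assumes the share is measured over distinct labels after lowercasing and trimming; the notes do not say how the paper counts it, and `novel_label_fraction` is a hypothetical helper with made-up example labels.

```python
from typing import Iterable


def novel_label_fraction(generated: Iterable[str], training: Iterable[str]) -> float:
    """Share of distinct generated labels that never occur among training labels."""
    def normalize(label: str) -> str:
        return label.strip().lower()

    seen = {normalize(label) for label in training}
    produced = {normalize(label) for label in generated}
    if not produced:
        return 0.0
    return len(produced - seen) / len(produced)


# Made-up labels, for illustration only.
train_labels = ["check balance", "transfer money"]
generated_labels = ["check balance", "visit park", "Transfer Money", "book flight"]

frac = novel_label_fraction(generated_labels, train_labels)
print(f"{frac:.0%}")  # 50%
```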
Pre-training: label denoising
Experimental setup
Fine-tuning:
baselines:
- XLNet
- LM-BFF
- seq2seq-PTR
- T5
- T5(adapt)