CMU proposes a new NLP paradigm, reStructured Pre-training, and scores a high 134 on the Gaokao English exam
2022-06-28 02:56:00 【Zhiyuan community】
This article proposes reStructured Pre-training (RST), which not only performs brilliantly on a variety of NLP tasks but also delivers a satisfying result on the English exam of the Gaokao (China's college entrance examination).
The way we store data is changing, from biological neural networks to artificial neural networks. The most familiar case is using the brain to store data. As the amount of available data keeps growing, people have turned to external devices such as hard disk drives or cloud storage. With the rise of deep learning, another promising storage technology has emerged: using artificial neural networks to store the information contained in data.
The researchers believe that the ultimate goal of data storage is to better serve human life, and that the way data is accessed is as important as the way it is stored. However, there is a gap between how data is stored and how it is accessed. Throughout history, people have tried to bridge this gap in order to make better use of the information that exists in the world, as shown in Figure 3:

For biological neural networks (such as the human brain), people receive education (that is, knowledge) from a very young age, so that they can extract specific data to cope with a complex and changing life.
For external device storage, people usually structure the data according to a certain schema (for example, tables) and then use a dedicated language (for example, SQL) to efficiently retrieve the required information from the database.
For storage based on artificial neural networks, researchers use self-supervised learning to store data from large corpora (pre-training), and the network is then used for various downstream tasks (for example, sentiment classification).
Researchers from CMU have proposed a new way of accessing data that contains various types of information, which can serve as pre-training signals that guide the model in optimizing its parameters. The study presents data structurally, in units of signals. This resembles using a database to store data: the data is first structured into tables or JSON format, so that a dedicated language (such as SQL) can retrieve exactly the information needed.
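The database analogy can be made concrete with a small sketch. The following is only an illustration of storing data "in units of signals" as JSON records and retrieving them with SQL; the record fields (`source`, `type`, `text`, `value`) and the helper names `build_store` and `fetch_signals` are assumptions made for this example and do not come from the paper.

```python
import json
import sqlite3

# Hypothetical signal records; the field names are illustrative, not from the paper.
SIGNALS = [
    {"source": "wikipedia", "type": "named_entity",
     "text": "CMU is located in Pittsburgh.", "value": "Pittsburgh"},
    {"source": "wikipedia", "type": "named_entity",
     "text": "The Gaokao is held every June.", "value": "Gaokao"},
    {"source": "reviews", "type": "sentiment",
     "text": "A wonderful film.", "value": "positive"},
]


def build_store(records):
    """Store each signal as one row: two indexable columns plus a JSON payload."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE signals (source TEXT, type TEXT, payload TEXT)")
    conn.executemany(
        "INSERT INTO signals VALUES (?, ?, ?)",
        [(r["source"], r["type"], json.dumps(r)) for r in records],
    )
    return conn


def fetch_signals(conn, signal_type):
    """Retrieve exactly the signals of one type, in insertion order."""
    rows = conn.execute(
        "SELECT payload FROM signals WHERE type = ? ORDER BY rowid",
        (signal_type,),
    )
    return [json.loads(payload) for (payload,) in rows]


conn = build_store(SIGNALS)
entity_values = [s["value"] for s in fetch_signals(conn, "named_entity")]
print(entity_values)
```

Because every record carries its signal type, a single query selects only the signals a given task needs, which is the ease of access the analogy is pointing at.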
In addition, the study argues that valuable signals are abundant in all kinds of data in the world, rather than existing only in manually curated supervised datasets. What researchers need to do is (a) identify the data, (b) restructure the data in a unified language, and (c) integrate and store it in a pre-trained language model. The study calls this learning paradigm reStructured Pre-training (RST). The researchers liken the process to a "treasure hunt in a mine": data sources such as Wikipedia are like mines rich in gems. They contain a wealth of information, for example named entities from hyperlinks, that can provide signals for model pre-training. A good pre-trained language model (PLM) should clearly understand the composition of the various signals in the data, so that it can provide accurate information according to the different needs of downstream tasks.
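Steps (a) and (b) can be sketched for the hyperlink example mentioned above. This is a minimal illustration, assuming wiki-style `[[Target|anchor]]` markup as input; the function names `extract_entity_signals` and `restructure` and the prompt template are hypothetical and are not the paper's actual pipeline. Step (c), pre-training a model on the resulting pairs, is not shown.

```python
import re

# Matches wiki-style links: [[Target]] or [[Target|anchor text]].
LINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]+))?\]\]")


def extract_entity_signals(wikitext):
    """Step (a): identify signals. Returns (plain_text, entities)."""
    entities = []

    def _replace(match):
        target, anchor = match.group(1), match.group(2)
        surface = anchor or target  # text the reader actually sees
        entities.append({"mention": surface, "entity": target})
        return surface

    plain = LINK.sub(_replace, wikitext)
    return plain, entities


def restructure(plain, entities):
    """Step (b): express every signal in one unified text-to-text form.

    The prompt wording is a hypothetical template chosen for this sketch.
    """
    examples = []
    for e in entities:
        prompt = 'Text: {}\nQuestion: Which entity does "{}" refer to?'.format(
            plain, e["mention"]
        )
        examples.append({"prompt": prompt, "answer": e["entity"]})
    return examples


sample = "[[Carnegie Mellon University|CMU]] is located in [[Pittsburgh]]."
plain, entities = extract_entity_signals(sample)
examples = restructure(plain, entities)
print(plain)
print(examples[0]["answer"])
```

Rendering every signal as a prompt and answer pair is one way to read "a unified language": signals of different types end up in the same text format, so a single model can be trained on all of them.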

Paper title:
reStructured Pre-training
Paper link:
https://arxiv.org/pdf/2206.11147.pdf

▲ Pre-trained language model treasure hunt
This study proposes a new paradigm for learning natural language processing tasks, namely RST. The paradigm re-emphasizes the role of data and treats model pre-training and fine-tuning on downstream tasks as a process of storing and accessing data. On this basis, the study follows a simple principle: a good storage mechanism should not only be able to cache a large amount of data, but should also make that data easy to access.
After overcoming some engineering challenges, the study achieves this by pre-training on restructured data (composed of all kinds of valuable information rather than raw data). Experiments show that RST models significantly outperform the best existing systems (for example, T0) on 52 of 55 popular datasets from various NLP tasks (such as classification, information extraction, fact retrieval, and text generation), without fine-tuning on downstream tasks. They also achieve excellent results on the English exam of the Gaokao, China's most authoritative college entrance examination, which millions of students take every year.
Specifically, the Gaokao AI (Qin) scores 40 points higher than the students' average and 15 points higher than GPT3 while using 1/16 of the parameters. In particular, Qin scored a high 138.5 (out of 150) on the 2018 English exam.
In addition, the study released the Gaokao Benchmark online submission platform, which so far contains 10 annotated English exam papers from 2018 to 2021 (and will be expanded every year), allowing more AI models to take the Gaokao. The study also established a relatively fair testbed for competition between humans and AI, helping us better understand where we stand. Moreover, on the 2022 Gaokao English exam held a few days ago (2022.06.08), the AI system scored a strong 134, while GPT3 scored only 108.

The main contributions of this study include:
1. It proposes an evolution hypothesis for NLP methods. The study attempts to explore the intrinsic connections among modern NLP techniques and, from a global perspective, puts forward the "NLP technology evolution hypothesis". In short, the core idea of this hypothesis is that technology always iterates in one direction: developers need to do less to build better, more general-purpose systems.

reStructured pre-training


Restructuring engineering


Experiments on 55 commonly used NLP datasets












