The worse the AI performs, the higher the prize? An NYU PhD offers a bounty for tasks that make large models perform poorly
2022-07-06 23:16:00 【QbitAl】
By Yi Ge, reporting from Aofeisi
Qubits | Official account QbitAI
The bigger the model and the worse it performs, the bigger the prize?
With a total prize pool of $250,000 (about 1.67 million RMB)?
Something this "outlandish" really has happened: a competition called the Inverse Scaling Prize has sparked heated discussion on Twitter.
The competition is jointly organized by 7 researchers from New York University.
Its initiator, Ethan Perez, says the main goal is to find out which tasks cause large models to show inverse scaling, and thereby to uncover problems in how large models are currently pretrained.
The competition is now accepting submissions, and the first round of submissions closes on August 27, 2022.
Competition motivation
People seem to take it for granted that the larger a language model gets, the better it will perform.
However, large language models are not without flaws: they show racial, gender, and religious bias, for example, and they produce vague misinformation.
Scaling laws show that language models get better (in terms of test loss and downstream performance) as the number of parameters, the amount of compute used, and the size of the dataset increase.
We hypothesize that some tasks show the opposite trend: as the language model's test loss improves, task performance gets monotonically worse. We call this phenomenon inverse scaling, in contrast to the usual scaling effect.
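As a minimal sketch of the definition above, the following Python snippet checks whether a task's accuracy falls strictly as models get larger. It uses parameter count as a stand-in for improving test loss, which is an assumption made here for simplicity; the function name and all numbers are hypothetical and do not come from the competition.

```python
from typing import Sequence


def shows_inverse_scaling(param_counts: Sequence[float],
                          accuracies: Sequence[float]) -> bool:
    """Return True if accuracy strictly decreases as parameter count grows."""
    if len(param_counts) != len(accuracies) or len(param_counts) < 2:
        raise ValueError("need at least two (size, accuracy) pairs of equal length")
    # Sort by model size so the comparison runs from smallest to largest model.
    ordered = [acc for _, acc in sorted(zip(param_counts, accuracies))]
    # Every larger model must score lower than the one before it.
    return all(later < earlier for earlier, later in zip(ordered, ordered[1:]))


# Hypothetical numbers, made up purely for illustration.
sizes = [1.25e8, 1.3e9, 6.7e9, 1.75e11]
inverse_task = [0.71, 0.64, 0.58, 0.49]   # gets worse with scale
normal_task = [0.52, 0.60, 0.69, 0.81]    # gets better with scale

print(shows_inverse_scaling(sizes, inverse_task))  # True
print(shows_inverse_scaling(sizes, normal_task))   # False
```

A strict monotonic check is the simplest reading of "monotonically worse"; a real evaluation would likely also need to account for noise between runs.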
The competition aims to find more inverse scaling tasks and to analyze which types of tasks are prone to inverse scaling, especially tasks where safety matters most.
Inverse scaling tasks will also help in studying potential problems with the current pretraining and scaling paradigm for language models.
As language models are increasingly deployed in real-world applications, the practical significance of this research is growing too.
A collection of inverse scaling tasks will help reduce the risk of adverse consequences from large language models and prevent harm to real users.
Netizen debate
Some netizens, however, took a different view of the competition:
I think this is misleading, because it assumes that the model is static and stops after pretraining.
This is more a problem of pretraining on standard corpora with more parameters than a problem of model size.
A software engineer named James agreed:
Yes, this whole thing is a hoax. Anything a small model can learn, a large model can learn too.
A small model has higher bias, so it may get "hot dog or not hot dog" 100% right; once a large model realizes that a cake can be made to look like a hot dog, its accuracy drops to 98%.
James went further and floated a "conspiracy theory":
Maybe the whole thing is a hoax: get people to work hard and supply training data for the tasks where models struggle, large models absorb that experience, and they end up better.
So they don't need to pay out any prizes and still get a better large model.
In response, initiator Ethan Perez wrote in the comments:
To clarify, the focus of this prize is on finding inverse scaling that arises from language model pretraining, on categories of tasks that are never or rarely seen.
This is just one way of using large models. There are many other settings that can lead to inverse scaling, and they are not covered by our prize.
Competition rules
Based on the tasks contestants submit, the team will build a dataset of at least 300 examples and test it on GPT-3/OPT.
Entries will be judged by an anonymous panel.
The judges will review each submission against 6 criteria (strength of the inverse scaling effect, generality, novelty, reproducibility, coverage, and task importance) and then award the first, second, and third prizes.
The prizes are as follows:
First prize: at most 1 winner, $100,000;
Second prize: at most 5 winners, $20,000 each;
Third prize: at most 10 winners, $5,000 each.
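The three tiers add up to the headline total when every slot is filled. A short Python check of that arithmetic, using only the figures given in the article (the variable and function names are illustrative):

```python
# (tier, maximum number of winners, prize per winner in USD), as stated in the article
PRIZE_TIERS = [
    ("first", 1, 100_000),
    ("second", 5, 20_000),
    ("third", 10, 5_000),
]


def max_total_payout(tiers):
    """Sum winners * prize over all tiers, assuming every slot is awarded."""
    return sum(count * amount for _, count, amount in tiers)


total = max_total_payout(PRIZE_TIERS)
print(total)  # 250000
```

That is 100,000 + 100,000 + 50,000 = 250,000 dollars, matching the total prize pool quoted at the top of the article.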
The competition opened on June 27. The first round of evaluation takes place on August 27, and the second round of evaluation begins on October 27.
Initiator Ethan Perez
Ethan Perez is a research scientist who has long worked on large language models.
Perez received his PhD in natural language processing from New York University and previously worked at DeepMind, Facebook AI Research, Mila (Montreal Institute for Learning Algorithms), and Google.
Reference links:
1. https://github.com/inverse-scaling/prize
2. https://twitter.com/EthanJPerez/status/1541454949397041154
3. https://alignmentfund.org/author/ethan-perez/