Surpassing PaLM! A Peking University master's student proposes DiVeRSe, topping NLP reasoning leaderboards across the board
2022-07-05 14:48:00 [智源社区 (BAAI Community)]
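Recently, researchers from Peking University and Microsoft proposed DiVeRSe, a new method built on self-consistency. It introduces three main innovations that further improve a model's reasoning ability.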

Paper: https://arxiv.org/abs/2206.02336
Code: https://github.com/microsoft/DiVeRSe
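First, self-consistency rests on the idea of "different thoughts, same answer": it samples different reasoning paths from the language model. Inspired by this, DiVeRSe pushes diversity one step further. Following the idea that "all roads lead to Rome", it uses multiple prompts to generate answers, which yields answers that are more complete and complementary.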
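One way to picture "multiple prompts" is to draw a different subset of worked examples for each prompt. The sketch below is only an illustration under that assumption; `build_prompts`, `with_question`, the `Q:`/`A:` layout, and the exemplar pool are hypothetical names and choices, not taken from the DiVeRSe code.

```python
import random


def build_prompts(exemplars, num_prompts, shots_per_prompt, seed=0):
    """Build several few-shot prompts, each from a different random
    subset of (question, worked_answer) exemplars.

    Assumption: prompt diversity comes from varying the exemplar set.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    prompts = []
    for _ in range(num_prompts):
        chosen = rng.sample(exemplars, shots_per_prompt)
        prompts.append("\n\n".join(f"Q: {q}\nA: {a}" for q, a in chosen))
    return prompts


def with_question(prompt, question):
    """Append the target question to one few-shot prompt."""
    return f"{prompt}\n\nQ: {question}\nA:"


# Hypothetical exemplar pool, made up for illustration.
pool = [
    ("2 + 3 = ?", "2 plus 3 is 5. The answer is 5."),
    ("4 * 6 = ?", "4 times 6 is 24. The answer is 24."),
    ("10 - 7 = ?", "10 minus 7 is 3. The answer is 3."),
    ("9 / 3 = ?", "9 divided by 3 is 3. The answer is 3."),
]
prompts = build_prompts(pool, num_prompts=3, shots_per_prompt=2)
queries = [with_question(p, "8 + 5 = ?") for p in prompts]
```

Each entry in `queries` would then be sent to the language model, and several reasoning paths sampled per prompt, so that diversity comes from both the prompts and the sampling.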
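Second, while a reasoning path is being generated, the language model has no mechanism for correcting errors made in earlier steps, which can derail the final prediction. DiVeRSe borrows the idea of a verifier: it checks the correctness of each reasoning path and uses the result to guide the voting mechanism. In other words, not all reasoning paths are equally important or equally good.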
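The difference between plain voting and verifier-guided voting can be shown with a small sketch. It assumes each sampled path has already been reduced to a final answer plus a correctness score in [0, 1]; the scores below are made up, and `weighted_vote` and `majority_vote` are hypothetical helper names. The trained verifier itself is not shown.

```python
from collections import defaultdict


def weighted_vote(paths):
    """Pick the answer whose paths have the largest total score.

    paths: list of (answer, verifier_score) pairs.
    """
    totals = defaultdict(float)
    for answer, score in paths:
        totals[answer] += score
    return max(totals, key=totals.get)


def majority_vote(paths):
    """Plain self-consistency voting: every path counts equally."""
    return weighted_vote([(answer, 1.0) for answer, _ in paths])


# Made-up scores: three low-confidence paths agree on "20",
# two high-confidence paths agree on "18".
paths = [("18", 0.9), ("18", 0.8), ("20", 0.3), ("20", 0.2), ("20", 0.1)]
plain = majority_vote(paths)
guided = weighted_vote(paths)
```

Here plain voting selects "20" because it appears three times, while the weighted vote selects "18" because its two paths carry a total score of 1.7 against 0.6.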
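Third, because an answer is produced by multi-step reasoning, when a path reaches a correct answer, all of its steps can be considered to have contributed to that correctness. When a path reaches a wrong answer, however, that does not mean every step was wrong or contributed to the error.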