HKU and NVIDIA | Factuality Enhanced Language Models for Open-Ended Text Generation
2022-06-10 15:21:00 【Zhiyuan community】
Authors: Nayeon Lee, Wei Ping, Peng Xu, Mostofa Patwary, et al.
Summary: This paper measures and improves the factual accuracy of large-scale pretrained language models (LMs) in open-ended text generation. The authors design the FactualityPrompts test set and accompanying metrics to measure the factuality of LM generations. Using these, they study the factual accuracy of LMs with parameter sizes ranging from 126M to 530B. Interestingly, they find that larger LMs are more factual than smaller ones, although a previous study suggested that larger LMs can be less truthful when it comes to misconceptions. In addition, popular sampling algorithms for open-ended generation (for example, top-p) can harm factuality because of the "uniform randomness" they introduce at every sampling step. The authors propose the factual-nucleus sampling algorithm, which dynamically adapts the randomness to improve the factuality of generation while maintaining quality. They also analyze why the standard training method is inefficient at learning correct associations between entities from a factual text corpus (for example, Wikipedia). They propose a factuality-enhanced training method that uses TopicPrefix for better awareness of facts and sentence completion as the training objective, which can greatly reduce factual errors.
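To make the idea of dynamically adapting randomness concrete, here is a minimal sketch of top-p sampling in which the nucleus mass shrinks as a sentence progresses. The decay schedule, the function names (`decayed_p`, `nucleus`, `sample_token`), and the default values for `p`, `decay`, and `floor` are illustrative assumptions, not the authors' implementation; the exact schedule and hyperparameters should be taken from the paper.

```python
import random

def decayed_p(step, p=0.9, decay=0.9, floor=0.3):
    """Nucleus mass for the token at position `step` of the current sentence.

    Assumed schedule: start at `p`, multiply by `decay` per token,
    and never drop below `floor`. `step` starts at 0.
    """
    return max(floor, p * decay ** step)

def nucleus(probs, p):
    """Return the smallest set of token ids whose total probability reaches p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= p:
            break
    return kept

def sample_token(probs, step, rng=random, **schedule):
    """Sample one token id from the nucleus, using the decayed mass for `step`."""
    kept = nucleus(probs, decayed_p(step, **schedule))
    weights = [probs[i] for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]
```

In this sketch the caller is expected to reset `step` to 0 at each new sentence, so that the start of a sentence is sampled with the full nucleus and later tokens are sampled more conservatively. With a fixed `p` at every step (that is, `decay=1.0`), it reduces to ordinary top-p sampling.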


Paper download: https://arxiv.org/pdf/2206.04624.pdf