ICML 2022 | Utility theory of sequential decision making
2022-06-30 21:02:00 【Zhiyuan community】
Paper link: https://arxiv.org/pdf/2206.12562.pdf
Large Transformer-based models have shown superior performance in various natural language processing and computer vision tasks. However, these models contain a large number of parameters, which limits their deployment in real-world applications. To reduce model size, researchers prune these models according to importance scores of the weights. These scores, however, are usually estimated on mini-batches during training, which introduces considerable variability/uncertainty because of mini-batch sampling and complex training dynamics. As a result of this uncertainty, common pruning methods may prune some crucial weights, which makes training unstable and hurts generalization. To address this problem, we propose PLATON, an algorithm that captures the uncertainty of importance scores through an upper confidence bound (UCB) on the importance estimate. In particular, for weights with low importance scores but high uncertainty, PLATON tends to retain them and explore their capacity. We conduct extensive experiments with several Transformer-based models on natural language understanding, question answering, and image classification to validate the effectiveness of PLATON. The results show that PLATON achieves notable improvements at different sparsity levels.
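The idea of scoring weights by both importance and uncertainty can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the summary above says no more than that an upper confidence bound on the importance estimate is used. The sensitivity-style importance `|w * g|`, the exponential moving averages, the product used to combine the two quantities, and all names and default values (`UncertaintyAwareScorer`, `prune_mask`, `beta_importance`, `beta_uncertainty`) are assumptions made for the example.

```python
from typing import List


def ema(previous: float, new: float, beta: float) -> float:
    """Exponential moving average update."""
    return beta * previous + (1.0 - beta) * new


class UncertaintyAwareScorer:
    """Tracks smoothed importance and its variability for each weight.

    Assumption: importance on a mini-batch is |weight * gradient|, and
    uncertainty is the smoothed absolute deviation of that importance
    from its running average.
    """

    def __init__(self, n_weights: int,
                 beta_importance: float = 0.85,
                 beta_uncertainty: float = 0.95) -> None:
        self.beta_importance = beta_importance
        self.beta_uncertainty = beta_uncertainty
        self.avg_importance = [0.0] * n_weights
        self.avg_uncertainty = [0.0] * n_weights

    def update(self, weights: List[float], grads: List[float]) -> None:
        """Consume the weights and gradients of one mini-batch."""
        for i, (w, g) in enumerate(zip(weights, grads)):
            importance = abs(w * g)
            self.avg_importance[i] = ema(
                self.avg_importance[i], importance, self.beta_importance)
            deviation = abs(importance - self.avg_importance[i])
            self.avg_uncertainty[i] = ema(
                self.avg_uncertainty[i], deviation, self.beta_uncertainty)

    def scores(self) -> List[float]:
        """Combine importance and uncertainty (assumed here: their product)."""
        return [imp * unc for imp, unc
                in zip(self.avg_importance, self.avg_uncertainty)]


def prune_mask(scores: List[float], sparsity: float) -> List[bool]:
    """Return True for weights to keep; prunes the lowest-scoring fraction."""
    n_prune = round(sparsity * len(scores))
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    pruned = set(order[:n_prune])
    return [i not in pruned for i in range(len(scores))]


# Toy usage: weight 0 has steady gradients, weight 1 has a lower average
# gradient that fluctuates strongly, weight 2 never receives a gradient.
scorer = UncertaintyAwareScorer(n_weights=3)
for step in range(20):
    noisy_grad = 0.8 if step % 2 == 0 else 0.0
    scorer.update([1.0, 1.0, 1.0], [0.5, noisy_grad, 0.0])
mask = prune_mask(scorer.scores(), sparsity=1 / 3)
```

In this toy run the fluctuating weight ends up with a higher score than the steady one even though its average importance is lower, which mirrors the behavior described above: weights with low importance but high uncertainty tend to be kept. With a product combination, a perfectly steady weight sees its score decay toward zero, so a real implementation would have to choose the combination rule and smoothing constants with care.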