NAACL 2022 | TAMT: Searching for Transferable BERT Subnetworks via Task-Agnostic Mask Training
2022-06-27 13:50:00 【Zhiyuan community】
Paper title: Learning to Win Lottery Tickets in BERT Transfer via Task-agnostic Mask Training

Figure 2: TAMT learns the subnetwork structure on a pre-training task (MLM or knowledge distillation), then transfers it to different downstream tasks for fine-tuning.
Based on the above motivation, we propose Task-Agnostic Mask Training (TAMT). As shown in Figure 2, TAMT optimizes the structure of the BERT subnetwork on the pre-training task without changing the pre-trained parameter values, so that the subnetwork performs better on the pre-training task. The searched subnetwork is then transferred to a variety of downstream tasks for fine-tuning.
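
The idea of training only the mask while the pre-trained weights stay frozen can be sketched on a toy linear model. This is a minimal illustration, not the paper's implementation: the top-k binarization, the straight-through gradient, the magnitude-based score initialization, the toy regression objective standing in for MLM or distillation, and all names (`top_k_mask`, `train_mask`, `PRETRAINED`, `DATA`) are assumptions made here for illustration.

```python
from itertools import product

def top_k_mask(scores, keep):
    """Binary mask that keeps the `keep` highest-scoring weights."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = set(ranked[:keep])
    return [1.0 if i in kept else 0.0 for i in range(len(scores))]

def predict(weights, mask, x):
    """Output of the masked (sub)network: pruned weights contribute nothing."""
    return sum(w * m * xi for w, m, xi in zip(weights, mask, x))

def loss(weights, mask, data):
    """Mean squared error of the masked model on the toy objective."""
    return sum((predict(weights, mask, x) - y) ** 2 for x, y in data) / len(data)

def train_mask(weights, data, keep, lr=0.125, epochs=10):
    """Learn mask scores only; `weights` is never modified."""
    scores = [abs(w) for w in weights]  # assumption: start from magnitude pruning
    for _ in range(epochs):
        mask = top_k_mask(scores, keep)
        grads = [0.0] * len(scores)
        for x, y in data:
            err = predict(weights, mask, x) - y
            for i, (w, xi) in enumerate(zip(weights, x)):
                # Straight-through estimator (assumed): the gradient of
                # 0.5 * err**2 w.r.t. the binary mask is passed to the score.
                grads[i] += err * w * xi / len(data)
        scores = [s - lr * g for s, g in zip(scores, grads)]
    return top_k_mask(scores, keep)

# Hypothetical frozen "pre-trained" weights and a toy objective whose
# solution needs weights 1 and 2, not the two largest-magnitude ones.
PRETRAINED = (3.0, -2.0, 1.0, 0.5)
DATA = [(x, -2.0 * x[1] + 1.0 * x[2]) for x in product((-1.0, 1.0), repeat=4)]

magnitude_mask = top_k_mask([abs(w) for w in PRETRAINED], keep=2)
learned_mask = train_mask(PRETRAINED, DATA, keep=2)
print(magnitude_mask, loss(PRETRAINED, magnitude_mask, DATA))
print(learned_mask, loss(PRETRAINED, learned_mask, DATA))
```

In this toy setting the learned mask selects a different subnetwork than magnitude pruning while the weight values themselves are untouched. In the transfer step described above, such a mask would then be held fixed and only the kept weights fine-tuned on each downstream task.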