How much compute has the transformer wasted?
2022-07-04 05:13:00 【东方金木】
https://jishuin.proginn.com/p/763bfbd4ca4f
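I ran into this problem in my own recent experiments, and a quick search showed that others have raised it too (see the link above).
The paper indirectly suggests that the attention sitting inside the residual branch may not be necessary.
So I tried two variants:
First, a transformer in which that part is replaced by a linear layer, keeping only the mutual-decoding part.
Second, a model that uses only mutual decoding: everything else is decoded directly by a linear layer, with no residual connection afterwards.
The result: the second variant performs as well as or better than the first, but on the same task neither is as efficient as a model built from MLPs alone.
In other words, the residual connection does not contribute much either.
The key component is still the MLP.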
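The two variants above can be sketched roughly as follows. This is only an illustration and not the author's code: it assumes the linear layer that replaces attention mixes information across positions with a learned `(seq_len, seq_len)` matrix, and the names `linear_block`, `w_mix`, `w_in` and `w_out` are hypothetical. The weights are random, so the snippet shows the data flow, not any accuracy result.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def relu(m):
    """Element-wise ReLU."""
    return [[max(0.0, v) for v in row] for row in m]

def add(a, b):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def linear_block(x, w_mix, w_in, w_out, residual):
    """One block with attention replaced by a linear layer (assumed design).

    x:     (seq_len, d_model) input
    w_mix: (seq_len, seq_len) linear mixing across positions, instead of attention
    w_in:  (d_model, d_hidden) and w_out: (d_hidden, d_model) form the MLP
    """
    mixed = matmul(w_mix, x)
    if residual:
        mixed = add(x, mixed)          # residual around the mixing layer
    out = matmul(relu(matmul(mixed, w_in)), w_out)
    if residual:
        out = add(mixed, out)          # residual around the MLP
    return out

rng = random.Random(0)
seq_len, d_model, d_hidden = 4, 8, 16
x = random_matrix(seq_len, d_model, rng)
w_mix = random_matrix(seq_len, seq_len, rng)
w_in = random_matrix(d_model, d_hidden, rng)
w_out = random_matrix(d_hidden, d_model, rng)

with_res = linear_block(x, w_mix, w_in, w_out, residual=True)
no_res = linear_block(x, w_mix, w_in, w_out, residual=False)
print(len(no_res), len(no_res[0]))
```

Both variants keep the output shape equal to the input shape, so either one can be stacked or swapped into the same training loop for a like-for-like comparison.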
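In addition, two outputs work better than a single output.
Softmax does not help either. Self-attention is essentially a relation dictionary, much like the Xinhua Dictionary (新华字典).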
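The dictionary view can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the author's implementation: the keys are chosen to be one-hot, and the helper `attention` with its `use_softmax` flag is hypothetical. With such keys, attention without softmax returns the stored value exactly, while softmax blends it with the other entries. It illustrates the lookup behavior only and does not by itself show that softmax is unnecessary in a trained model.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def softmax_rows(m):
    """Numerically stable softmax applied to each row."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def attention(q, k, v, use_softmax):
    """Dot-product attention; with use_softmax=False the raw scores weight the values."""
    scores = matmul(q, transpose(k))
    if use_softmax:
        scale = math.sqrt(len(k[0]))
        scores = softmax_rows([[s / scale for s in row] for row in scores])
    return matmul(scores, v)

# Three dictionary entries: one-hot keys and the values stored under them.
keys = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 20.0], [30.0, 30.0]]
query = [[0.0, 1.0, 0.0]]  # asks for the second entry

lookup = attention(query, keys, values, use_softmax=False)
soft = attention(query, keys, values, use_softmax=True)
print(lookup)  # [[0.0, 20.0]]
print(soft)    # a blend of all three values, not an exact lookup
```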
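The code is available for reference here (it is a bit messy):
https://blog.csdn.net/weixin_32759777/category_11446474.html
At inference time, mask one side; only then can the model be used for translation in both directions.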