Position Encoding in the Transformer: A Practical Look
2022-07-04 14:54:00 [初学者chris]
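In recent years the Transformer has excelled in both NLP and CV, because it can be computed in parallel and handles long-range dependencies in sequences.
[Figure: Transformer architecture diagram]
Here we focus on one small piece of it, the position encoding. The Transformer drops recurrence, so it has no built-in notion of order; to carry position information, each element of the sequence is given a positional encoding.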
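For reference, the code in this post appears to follow the sinusoidal encoding from the original Transformer paper. Written out, with pos as the position in the sequence and i indexing the sin/cos pairs (these symbol names are the usual convention, not taken from this post):

```latex
\begin{aligned}
PE_{(pos,\,2i)}   &= \sin\!\left(\frac{pos}{10000^{2i/d_{\text{model}}}}\right) \\
PE_{(pos,\,2i+1)} &= \cos\!\left(\frac{pos}{10000^{2i/d_{\text{model}}}}\right) \\
\frac{1}{10000^{2i/d_{\text{model}}}} &= \exp\!\left(2i \cdot \frac{-\ln 10000}{d_{\text{model}}}\right)
\end{aligned}
```

The last line is the identity behind div_term in the code: torch.arange(0, d_model, 2) supplies the values 2i, and the exponential form is used instead of computing the power directly.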
The code is shown below. For why it is written this way, see the original paper or the following article: https://zhuanlan.zhihu.com/p/338592312
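import math
import torch

class PositionalEncoding(torch.nn.Module):
    def __init__(self, d_model, max_len=5000):
        super(PositionalEncoding, self).__init__()
        # Precompute the table once; this assumes d_model is even.
        pe = torch.zeros(max_len, d_model)
        position = torch.arange(0, max_len, dtype=torch.float).unsqueeze(1)  # (max_len, 1)
        # 10000^(-2i/d_model), one value per sin/cos pair
        div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
        pe[:, 0::2] = torch.sin(position * div_term)  # even columns
        pe[:, 1::2] = torch.cos(position * div_term)  # odd columns
        pe = pe.unsqueeze(0).transpose(0, 1)  # (max_len, 1, d_model)
        # A buffer is saved and moved with the module but is not trained.
        self.register_buffer('pe', pe)

    def forward(self, x):
        # x: (batch, seq_len, d_model). Slice the table to seq_len and drop
        # the singleton dim so it broadcasts over the batch.
        x = x + self.pe[:x.size(1), :].squeeze(1)
        # For sequence-first input (seq_len, batch, d_model), use instead:
        # x = x + self.pe[:x.size(0), :]
        return x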
To test it, we define two input tensors, one all zeros and one all ones.
d_model = 4
a = torch.zeros(2, 3, 4)  # (batch, seq_len, d_model)
pos = PositionalEncoding(d_model)
b = pos(a)
c = torch.ones(2, 3, 4)
b1 = pos(c)
The inputs are the all-zeros tensor a and the all-ones tensor c, and the outputs are b and b1, respectively.
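In both cases a fixed set of values is added to the input. Those values are the position encoding: they do not depend on the input, only on the position in the sequence and on d_model, which can be thought of as the word embedding size.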
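To make those fixed values concrete, here is a minimal sketch in plain Python (no torch) that rebuilds the table for the test above, with 3 positions and d_model = 4. The helper name sinusoidal_table and its base parameter are hypothetical, introduced only for this illustration, and the sketch assumes d_model is even.

```python
import math

def sinusoidal_table(seq_len, d_model, base=10000.0):
    """Return a seq_len x d_model nested list of sinusoidal encodings.

    Hypothetical helper for illustration only; assumes d_model is even.
    """
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(0, d_model, 2):
            # Same frequency term as div_term in the PyTorch code
            angle = pos * math.exp(i * (-math.log(base) / d_model))
            row.append(math.sin(angle))  # even column
            row.append(math.cos(angle))  # odd column
        table.append(row)
    return table

pe = sinusoidal_table(seq_len=3, d_model=4)

# One sequence of the all-zeros and all-ones inputs, after adding the table
zeros_out = [[0.0 + v for v in row] for row in pe]
ones_out = [[1.0 + v for v in row] for row in pe]

for row in zeros_out:
    print([round(v, 4) for v in row])
```

With an all-zeros input the output is the table itself, and position 0 is always [0, 1, 0, 1] because sin(0) = 0 and cos(0) = 1. The all-ones output is the same table shifted up by 1, which is what "independent of the input" means here.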