Position encoding in the Transformer in practice
2022-07-04 14:54:00 【初学者chris】
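In recent years the Transformer has excelled in both NLP and CV, because it can be computed in parallel and can handle dependencies across long sequences.
The architecture is shown in the diagram below:
Here we focus on one small part, position encoding. Because the Transformer drops recurrence, it has no built-in notion of order, so each element is given a position encoding to convey where it sits in the sequence.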
For why the code below is written the way it is, see the original paper or this article: https://zhuanlan.zhihu.com/p/338592312
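For reference, the sinusoidal encoding from the original paper (Attention Is All You Need) can be written as below. Here pos is the position in the sequence, i indexes a pair of dimensions, and d_model is the embedding size. The notation follows the paper rather than this post.

```latex
\[
PE_{(pos,\,2i)} = \sin\!\left(\frac{pos}{10000^{2i/d_{model}}}\right),
\qquad
PE_{(pos,\,2i+1)} = \cos\!\left(\frac{pos}{10000^{2i/d_{model}}}\right)
\]
```

Even dimensions take the sine and odd dimensions the cosine of the same angle, which is what the slices 0::2 and 1::2 in the code express.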
The code is as follows:
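import math
import torch

class PositionalEncoding(torch.nn.Module):
    def __init__(self, d_model, max_len=5000):
        super().__init__()
        # pe[pos, k] holds the encoding of position pos in dimension k.
        pe = torch.zeros(max_len, d_model)
        position = torch.arange(0, max_len, dtype=torch.float).unsqueeze(1)  # (max_len, 1)
        # 1 / 10000^(2i / d_model), computed via exp/log.
        div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
        pe[:, 0::2] = torch.sin(position * div_term)  # even dimensions
        pe[:, 1::2] = torch.cos(position * div_term)  # odd dimensions
        pe = pe.unsqueeze(0).transpose(0, 1)  # (max_len, 1, d_model)
        # A buffer follows the module across devices but is not a trainable parameter.
        self.register_buffer('pe', pe)

    def forward(self, x):
        # x is batch-first: (batch, seq_len, d_model).
        # Drop the middle axis so (seq_len, d_model) broadcasts over the batch.
        x = x + self.pe[:x.size(1), :].squeeze(1)
        # For sequence-first input (seq_len, batch, d_model), use instead:
        # x = x + self.pe[:x.size(0), :]
        return x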
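The div_term line can look opaque at first. The sketch below, written in plain Python without torch, suggests that the exp/log expression is just another way of computing the closed form 1 / 10000^(2i / d_model) from the original paper. The helper names div_term_exp and div_term_pow are made up for this illustration and are not part of the post's code.

```python
import math

def div_term_exp(d_model):
    # Same expression as in the class above, evaluated for k = 0, 2, 4, ...
    return [math.exp(k * (-math.log(10000.0) / d_model)) for k in range(0, d_model, 2)]

def div_term_pow(d_model):
    # Closed form 1 / 10000^(2i / d_model), with k = 2i.
    return [1.0 / (10000.0 ** (k / d_model)) for k in range(0, d_model, 2)]

d_model = 4
for e, p in zip(div_term_exp(d_model), div_term_pow(d_model)):
    # The two columns should agree up to floating-point rounding.
    print(e, p)
```

For d_model = 4 there are two frequencies, about 1 and about 0.01, so the first sin/cos pair changes quickly with position and the second pair changes slowly.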
For testing, we define two input tensors, one all zeros and one all ones.
d_model = 4
a = torch.zeros(2, 3, 4)  # (batch=2, seq_len=3, d_model=4), all zeros
pos = PositionalEncoding(d_model)
b = pos(a)
c = torch.ones(2, 3, 4)  # same shape, all ones
b1 = pos(c)
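The inputs are the all-zeros tensor a and the all-ones tensor c, each of shape (2, 3, 4); the outputs are b and b1. (The original post showed both as screenshots.)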
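In both cases the output is the input plus a set of fixed values. Those values are the position encoding: they do not depend on the input, only on the position in the sequence and on d_model, which can be understood as the word embedding size.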
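The same experiment can be reproduced without torch. The following is a minimal sketch using nested lists in place of tensors; sinusoidal_table and add_positions are hypothetical helpers introduced only for this illustration. It builds the table for 3 positions and d_model = 4, adds it to all-zeros and all-ones batches, and prints the table.

```python
import math

def sinusoidal_table(seq_len, d_model):
    """Return a seq_len x d_model list of sinusoidal position codes."""
    table = []
    for pos in range(seq_len):
        row = [0.0] * d_model
        for k in range(0, d_model, 2):
            angle = pos * math.exp(k * (-math.log(10000.0) / d_model))
            row[k] = math.sin(angle)          # even dimension
            if k + 1 < d_model:               # guard for an odd d_model
                row[k + 1] = math.cos(angle)  # odd dimension
        table.append(row)
    return table

def add_positions(x, table):
    """Add the table to every sequence of a (batch, seq_len, d_model) nested list."""
    return [[[v + p for v, p in zip(tok, table[i])] for i, tok in enumerate(seq)]
            for seq in x]

table = sinusoidal_table(3, 4)
zeros = [[[0.0] * 4 for _ in range(3)] for _ in range(2)]
ones = [[[1.0] * 4 for _ in range(3)] for _ in range(2)]
b = add_positions(zeros, table)
b1 = add_positions(ones, table)

for row in table:
    print(row)
```

Position 0 always gets [0, 1, 0, 1], since sin(0) = 0 and cos(0) = 1. Both sequences in the batch receive the same offsets, and b1 differs from b by exactly 1 in every entry, which is the sense in which the added values are independent of the input.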