Positional Encoding in the Transformer: A Practical Look
2022-07-04 14:54:00 【初学者chris】
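In recent years the Transformer has excelled in both NLP and CV, because it can be computed in parallel and can handle long-range dependencies in sequences.
The architecture diagram is shown below:
Here we focus on one small part, the position encoding. Because the Transformer drops recurrence, it has no built-in notion of order, so each element is given a positional encoding to convey its position.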
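For reference, the sinusoidal scheme from the original Transformer paper, which the code further down implements, can be written as follows. Here pos is the position in the sequence and i indexes pairs of dimensions, so 2i and 2i+1 are the even and odd columns of the encoding:

```latex
\begin{aligned}
PE_{(pos,\,2i)}   &= \sin\!\left(\frac{pos}{10000^{2i/d_{\text{model}}}}\right) \\
PE_{(pos,\,2i+1)} &= \cos\!\left(\frac{pos}{10000^{2i/d_{\text{model}}}}\right)
\end{aligned}
```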
For why the code is written the way it is, see the original paper or this article: https://zhuanlan.zhihu.com/p/338592312
The code is as follows:
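import math
import torch

class PositionalEncoding(torch.nn.Module):
    def __init__(self, d_model, max_len=5000):
        super(PositionalEncoding, self).__init__()
        pe = torch.zeros(max_len, d_model)
        position = torch.arange(0, max_len, dtype=torch.float).unsqueeze(1)  # (max_len, 1)
        # 1 / 10000^(2i / d_model), computed via exp and log
        div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
        pe[:, 0::2] = torch.sin(position * div_term)  # even dimensions
        pe[:, 1::2] = torch.cos(position * div_term)  # odd dimensions
        pe = pe.unsqueeze(0).transpose(0, 1)  # (max_len, 1, d_model)
        self.register_buffer('pe', pe)  # saved with the module, but not a trainable parameter

    def forward(self, x):
        # x is batch-first: (batch, seq_len, d_model)
        # The slice is (seq_len, 1, d_model); squeeze(1) makes it (seq_len, d_model),
        # which broadcasts over the batch dimension.
        x = x + self.pe[:x.size(1), :].squeeze(1)
        # x = x + self.pe[:x.size(1), :]
        return x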
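The `div_term` line is the least obvious part of the class: it computes 1 / 10000^(dim / d_model) for each even dimension index by going through `exp` and `log`. The sketch below checks that identity using only the standard library rather than PyTorch. The helper names `div_term_exp_log` and `div_term_direct` are made up for this illustration and are not part of the original code.

```python
import math

def div_term_exp_log(d_model):
    # Same arithmetic as the torch expression, one even dimension index at a time.
    return [math.exp(dim * (-math.log(10000.0) / d_model))
            for dim in range(0, d_model, 2)]

def div_term_direct(d_model):
    # The textbook form: 1 / 10000^(dim / d_model) for dim = 0, 2, 4, ...
    return [1.0 / (10000.0 ** (dim / d_model))
            for dim in range(0, d_model, 2)]

d_model = 4  # the toy size used in the test in this post
for via_exp, direct in zip(div_term_exp_log(d_model), div_term_direct(d_model)):
    # The two columns should agree up to floating-point rounding.
    print(via_exp, direct)
```

For d_model = 4 the two frequencies are 1 and about 0.01, so the first sin/cos pair varies quickly with position and the second pair varies slowly.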
To test it, we define two input tensors: one all zeros and one all ones.
d_model = 4
a = torch.zeros(2, 3, 4)  # (batch, seq_len, d_model)
pos = PositionalEncoding(d_model)
b = pos(a)
c = torch.ones(2, 3, 4)
b1 = pos(c)
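Clearly, the inputs are the all-zeros tensor a and the all-ones tensor c, each of shape (2, 3, 4).

The outputs are b and b1, respectively.

In both cases the output is the input plus a set of fixed values. Those fixed values are the positional encoding: they do not depend on the input, only on the position and on d_model, which can be understood as the word embedding size.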
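To make the "fixed values" concrete, here is a minimal sketch that recomputes the encoding for the test case above (3 positions, d_model = 4) with plain Python lists instead of PyTorch. The function name `sinusoidal_pe` is hypothetical and exists only for this illustration. It follows the same sin/cos layout as the class above and assumes an even d_model.

```python
import math

def sinusoidal_pe(num_positions, d_model):
    """Return a num_positions x d_model table of sinusoidal encodings as nested lists."""
    table = []
    for pos in range(num_positions):
        row = [0.0] * d_model
        for dim in range(0, d_model, 2):          # even dimension indices
            angle = pos / (10000.0 ** (dim / d_model))
            row[dim] = math.sin(angle)            # even column: sine
            row[dim + 1] = math.cos(angle)        # odd column: cosine (needs even d_model)
        table.append(row)
    return table

pe = sinusoidal_pe(3, 4)
for row in pe:
    print(row)
```

Position 0 gives exactly (0, 1, 0, 1), and position 1 gives roughly (0.8415, 0.5403, 0.0100, 0.99995). Since a is all zeros, each batch element of b should equal this table; since c is all ones, each batch element of b1 should be the same table with 1 added to every entry.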