Position encoding practice in transformer
2022-07-04 16:39:00 【Beginner Chris】
In recent years, the transformer has excelled in both NLP and CV, because it supports parallel computation and handles the long-range dependency problem of long sequences.
The schematic diagram is as follows:
[Figure: schematic diagram of the transformer architecture]
Here we focus on one small part, the position encoding. Because the transformer removes recurrence, it has no built-in notion of order, so the position of each element is encoded explicitly to reflect its place in the sequence.
For why the encoding is written this way, refer to the original paper or to this article: https://zhuanlan.zhihu.com/p/338592312
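For reference, the sinusoidal encoding is usually written as below. This is the standard form from the original transformer paper, given here as background rather than quoted from the linked article; pos is the position in the sequence and i indexes pairs of embedding dimensions. The last line is the identity that the div_term expression in the code relies on.

```latex
\begin{aligned}
PE_{(pos,\,2i)}   &= \sin\!\left(\frac{pos}{10000^{2i/d_{\text{model}}}}\right), \\
PE_{(pos,\,2i+1)} &= \cos\!\left(\frac{pos}{10000^{2i/d_{\text{model}}}}\right), \\
\frac{1}{10000^{2i/d_{\text{model}}}} &= \exp\!\left(2i \cdot \frac{-\ln 10000}{d_{\text{model}}}\right).
\end{aligned}
```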
The code is as follows:
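import math
import torch

class PositionalEncoding(torch.nn.Module):
    def __init__(self, d_model, max_len=5000):
        super(PositionalEncoding, self).__init__()
        pe = torch.zeros(max_len, d_model)
        position = torch.arange(0, max_len, dtype=torch.float).unsqueeze(1)  # (max_len, 1)
        # One frequency per pair of dimensions: 1 / 10000^(2i / d_model)
        div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
        pe[:, 0::2] = torch.sin(position * div_term)  # even dimensions
        pe[:, 1::2] = torch.cos(position * div_term)  # odd dimensions
        pe = pe.unsqueeze(0).transpose(0, 1)  # (max_len, 1, d_model)
        # A buffer is saved with the module but is not a trainable parameter
        self.register_buffer('pe', pe)

    def forward(self, x):
        # Batch-first input (batch, seq_len, d_model): drop the middle axis so pe broadcasts over the batch
        x = x + self.pe[:x.size(1), :].squeeze(1)
        # x = x + self.pe[:x.size(1), :]
        return x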
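The div_term line in the class above is the least obvious step, so here is a minimal sketch that checks it against the direct power form. It uses only the Python standard library instead of torch, and the helper names div_term_exp and div_term_pow are hypothetical, introduced just for this illustration.

```python
import math

def div_term_exp(d_model):
    # Same expression as the div_term line, written with math instead of torch.
    # The loop variable steps over the even dimension indices 0, 2, 4, ...
    return [math.exp(i * (-math.log(10000.0) / d_model))
            for i in range(0, d_model, 2)]

def div_term_pow(d_model):
    # Direct form: 1 / 10000^(i / d_model) for the same even indices i.
    return [1.0 / (10000.0 ** (i / d_model))
            for i in range(0, d_model, 2)]

d_model = 4
print(div_term_exp(d_model))
print(div_term_pow(d_model))
```

Both lists should agree up to floating-point rounding. For d_model = 4 the two frequencies are approximately 1 and 0.01, so the first pair of dimensions varies quickly with position and the second pair varies slowly.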
To test it, we define two input tensors, one all zeros and one all ones.
d_model = 4
a = torch.zeros(2, 3, 4)  # all-zero input: (batch=2, seq_len=3, d_model=4)
pos = PositionalEncoding(d_model)
b = pos(a)
c = torch.ones(2, 3, 4)   # all-one input of the same shape
b1 = pos(c)
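The inputs a and c are the all-zero and all-one tensors of shape (2, 3, 4).
[Figure: printed values of the inputs a and c]
The outputs are b and b1:
[Figure: printed values of the outputs b and b1]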
As can be seen, both outputs are simply the input plus a fixed value. Those fixed values are the position encoding: they do not depend on the input, only on the position and on d_model, which can be understood as the word embedding size.
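This observation can be reproduced without PyTorch. The following is a minimal sketch under stated assumptions: it rebuilds the encoding table for d_model = 4 and 3 positions with plain Python lists and adds it to all-zero and all-one inputs. The helpers sinusoidal_table and add_encoding are hypothetical names for this illustration, not part of the class above, and d_model is assumed to be even.

```python
import math

def sinusoidal_table(seq_len, d_model):
    """Pure-Python reference for the first seq_len rows of pe (d_model assumed even)."""
    table = []
    for pos in range(seq_len):
        row = [0.0] * d_model
        for i in range(0, d_model, 2):
            angle = pos / (10000.0 ** (i / d_model))
            row[i] = math.sin(angle)      # even dimension
            row[i + 1] = math.cos(angle)  # odd dimension
        table.append(row)
    return table

def add_encoding(batch, table):
    # batch is batch-first: a list of sequences, each a list of d_model-sized rows.
    # The same table row is added at a given position for every item in the batch.
    return [[[x + p for x, p in zip(row, table[pos])]
             for pos, row in enumerate(seq)]
            for seq in batch]

d_model, seq_len, batch_size = 4, 3, 2
pe = sinusoidal_table(seq_len, d_model)
zeros = [[[0.0] * d_model for _ in range(seq_len)] for _ in range(batch_size)]
ones = [[[1.0] * d_model for _ in range(seq_len)] for _ in range(batch_size)]
b = add_encoding(zeros, pe)   # equals the encoding itself
b1 = add_encoding(ones, pe)   # the encoding shifted by 1

for row in b[0]:
    print([round(v, 4) for v in row])
```

With the all-zero input the output is the encoding table itself, and with the all-one input every entry is that same value plus 1. The row for position 0 is [0, 1, 0, 1], since sin(0) = 0 and cos(0) = 1, and both batch items receive identical rows.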