Position encoding in the Transformer: a hands-on look
2022-07-04 14:54:00 【初学者chris】
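In recent years the Transformer has excelled in both NLP and CV, because it can be computed in parallel and can handle long-range dependencies in sequences.
(Figure: Transformer architecture diagram.)
Here we focus on one small part of it, the position encoding. Because the Transformer drops recurrence, it has no built-in notion of order, so a position encoding is added to each element to represent its position in the sequence.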
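For reference, the sinusoidal encoding that the code in this post implements is usually written as follows, where pos is the position in the sequence, i indexes a pair of embedding dimensions, and d_model is the embedding size:

$$
PE_{(pos,\,2i)} = \sin\!\left(\frac{pos}{10000^{2i/d_{model}}}\right), \qquad
PE_{(pos,\,2i+1)} = \cos\!\left(\frac{pos}{10000^{2i/d_{model}}}\right)
$$

Even-numbered dimensions use the sine and odd-numbered dimensions use the cosine of the same angle.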
For why the code is written this way, see the original paper or this article: https://zhuanlan.zhihu.com/p/338592312
The code is as follows:
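import math
import torch


class PositionalEncoding(torch.nn.Module):
    def __init__(self, d_model, max_len=5000):
        super().__init__()
        pe = torch.zeros(max_len, d_model)
        # Column vector of positions 0..max_len-1, shape (max_len, 1)
        position = torch.arange(0, max_len, dtype=torch.float).unsqueeze(1)
        # 10000^(-2i/d_model) for each even index 2i; d_model is assumed to be even
        div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
        pe[:, 0::2] = torch.sin(position * div_term)  # even columns
        pe[:, 1::2] = torch.cos(position * div_term)  # odd columns
        pe = pe.unsqueeze(0).transpose(0, 1)  # (max_len, 1, d_model)
        # A buffer is saved and moved with the module but is not a trainable parameter
        self.register_buffer('pe', pe)

    def forward(self, x):
        # x is batch-first: (batch, seq_len, d_model)
        x = x + self.pe[:x.size(1), :].squeeze(1)
        # For sequence-first input (seq_len, batch, d_model), use instead:
        # x = x + self.pe[:x.size(0), :]
        return x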
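The `div_term` line in the class is the least obvious step: it computes the factor 10000^(-2i/d_model) through `exp` and `log` rather than with a power. The following plain-Python sketch (no torch) checks that the two forms agree. The value `d_model = 8` and the names `via_exp`, `via_pow`, and `pairs` are chosen here only for illustration.

```python
import math

d_model = 8  # hypothetical embedding size, chosen only for illustration

pairs = []
for two_i in range(0, d_model, 2):  # the even indices 2i = 0, 2, 4, 6
    # Form used in the class: exp(2i * (-ln(10000) / d_model))
    via_exp = math.exp(two_i * (-math.log(10000.0) / d_model))
    # Direct form: 10000^(-2i / d_model)
    via_pow = 10000.0 ** (-two_i / d_model)
    pairs.append((via_exp, via_pow))
    print(two_i, via_exp, via_pow)

# Both forms give the same scaling factor up to floating-point rounding.
print(all(math.isclose(e, p, rel_tol=1e-9) for e, p in pairs))
```

Multiplying `position` by this factor is the same as dividing pos by 10000^(2i/d_model), so each pair of columns gets a progressively lower frequency.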
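To test it, we define two input tensors, one all zeros and one all ones.
d_model = 4
a = torch.zeros(2, 3, 4)  # (batch=2, seq_len=3, d_model=4), all zeros
pos = PositionalEncoding(d_model)
b = pos(a)
c = torch.ones(2, 3, 4)   # same shape, all ones
b1 = pos(c)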
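The inputs are a and c, the all-zero and all-one tensors of shape (2, 3, 4).
The corresponding outputs are b and b1.
In both cases a fixed value is added on top of the input. These fixed values are the position encoding: they do not depend on the input, only on the position and on d_model, which can be understood as the embedding size of each word.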
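To see those fixed values concretely, here is a minimal plain-Python sketch (no torch, so the numbers are double precision rather than float32) that rebuilds the same table for d_model = 4 and three positions and adds it to an all-zero and an all-one batch. The helper names `sinusoidal_table` and `add_encoding` are hypothetical and are not part of the class above.

```python
import math


def sinusoidal_table(max_len, d_model):
    """Return a max_len x d_model nested list of sinusoidal position encodings."""
    table = []
    for pos in range(max_len):
        row = []
        for k in range(d_model):
            # Columns 2i and 2i+1 share the frequency 10000^(-2i / d_model).
            freq = math.exp((k - k % 2) * (-math.log(10000.0) / d_model))
            angle = pos * freq
            row.append(math.sin(angle) if k % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


def add_encoding(batch, table):
    """Add the table row for position t to the element at position t of every sequence."""
    return [[[x + p for x, p in zip(vec, table[t])]
             for t, vec in enumerate(seq)]
            for seq in batch]


batch_size, seq_len, d_model = 2, 3, 4
table = sinusoidal_table(seq_len, d_model)
zeros = [[[0.0] * d_model for _ in range(seq_len)] for _ in range(batch_size)]
ones = [[[1.0] * d_model for _ in range(seq_len)] for _ in range(batch_size)]

b = add_encoding(zeros, table)   # equals the table itself, repeated per batch item
b1 = add_encoding(ones, table)   # the same table shifted by 1 everywhere

for t, row in enumerate(table):
    print(t, row)
```

For d_model = 4 the two frequencies are 1 and 0.01, so the row for position pos is [sin(pos), cos(pos), sin(0.01·pos), cos(0.01·pos)], and position 0 is exactly [0, 1, 0, 1]. Every batch item receives the same offsets, and b1 differs from b by exactly the difference between the inputs.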