Understanding tf.keras.layers.Attention: a summary
2022-06-30 09:46:00 【A grain of sand in the vast sea of people】
Official documentation: https://tensorflow.google.cn/versions/r2.1/api_docs/python/tf/keras/layers/Attention
tf.keras.layers.Attention(
use_scale=False, **kwargs
)
Inputs are a query tensor of shape [batch_size, Tq, dim], a value tensor of shape [batch_size, Tv, dim], and a key tensor of shape [batch_size, Tv, dim]. The calculation follows these steps:
- Calculate scores with shape [batch_size, Tq, Tv] as a query-key dot product: scores = tf.matmul(query, key, transpose_b=True).
- Use scores to calculate a distribution with shape [batch_size, Tq, Tv]: distribution = tf.nn.softmax(scores).
- Use distribution to create a linear combination of value with shape [batch_size, Tq, dim]: return tf.matmul(distribution, value).
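To make the shapes concrete, the three steps can be sketched in NumPy with Tq different from Tv and a separate key tensor. This is a stand-in for illustration, not the Keras implementation: the helper names softmax and dot_product_attention and the random inputs are assumptions. Per the official documentation, key falls back to value when it is not given, and the sketch mirrors that.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum first for numerical stability.
    shifted = x - x.max(axis=axis, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=axis, keepdims=True)


def dot_product_attention(query, value, key=None):
    """Hypothetical NumPy helper following the three documented steps."""
    if key is None:
        key = value  # key defaults to value when it is not supplied
    scores = query @ np.swapaxes(key, -1, -2)  # [batch_size, Tq, Tv]
    distribution = softmax(scores)             # [batch_size, Tq, Tv]
    output = distribution @ value              # [batch_size, Tq, dim]
    return output, distribution


# Random inputs, chosen only to show the shapes (assumption, not from the post).
rng = np.random.default_rng(0)
batch_size, Tq, Tv, dim = 2, 5, 3, 4
query = rng.normal(size=(batch_size, Tq, dim))
value = rng.normal(size=(batch_size, Tv, dim))
key = rng.normal(size=(batch_size, Tv, dim))

output, distribution = dot_product_attention(query, value, key)
print('distribution shape:', distribution.shape)
print('output shape:', output.shape)
```

Each row of distribution sums to 1, so every output vector is a weighted average of the Tv value vectors.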
Example 1
import tensorflow as tf
import numpy as np

# One query vector: shape (1, 1, 4) = [batch_size, Tq, dim]
query = tf.convert_to_tensor(np.asarray([[[1., 1., 1., 3.]]]))
# Two batches of three vectors each: shape (2, 3, 4) = [batch_size, Tv, dim]
key_list = tf.convert_to_tensor(np.asarray([[[1., 1., 2., 4.], [4., 1., 1., 3.], [1., 1., 2., 1.]],
                                            [[1., 0., 2., 1.], [1., 2., 1., 2.], [1., 0., 2., 1.]]]))
# Only [query, value] is passed, so the layer uses key_list as both value and key.
query_value_attention_seq = tf.keras.layers.Attention()([query, key_list])
print('query shape:', query.shape)
print('key shape:', key_list.shape)
print('result 1:', query_value_attention_seq)
Result:
query shape: (1, 1, 4)
key shape: (2, 3, 4)
result 1: tf.Tensor(
[[[1.8067516 1. 1.7310829 3.730812 ]]
[[0.99999994 1.9293262 1.0353367 1.9646629 ]]], shape=(2, 1, 4), dtype=float32)
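The first row of result 1 can be checked by hand. The sketch below redoes the arithmetic for the first batch in plain Python, with no TensorFlow; it is only a verification aid, and the variable names are assumptions.

```python
import math

# The query and the first batch of key_list from Example 1.
query = [1.0, 1.0, 1.0, 3.0]
keys = [[1.0, 1.0, 2.0, 4.0],
        [4.0, 1.0, 1.0, 3.0],
        [1.0, 1.0, 2.0, 1.0]]  # also used as the values

# Step 1: query-key dot products, which give 16, 15 and 7.
scores = [sum(q * k for q, k in zip(query, key)) for key in keys]

# Step 2: softmax over the scores (shifted by the maximum for stability).
max_score = max(scores)
exps = [math.exp(s - max_score) for s in scores]
total = sum(exps)
weights = [e / total for e in exps]

# Step 3: weighted sum of the value vectors, one component at a time.
output = [sum(w * key[i] for w, key in zip(weights, keys))
          for i in range(len(query))]

print('scores:', scores)
print('weights:', [round(w, 5) for w in weights])
print('output:', [round(x, 5) for x in output])
```

Almost all of the weight goes to the first two keys, whose scores are 16 and 15, so the output is essentially a blend of those two rows. The printed output agrees with the first row of result 1 to within rounding.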
Manual implementation, following the steps in the documentation
# No separate key was passed to the layer, so key_list serves as both key and value.
scores = tf.matmul(query, key_list, transpose_b=True)  # shape (2, 1, 3); the single query batch is broadcast
distribution = tf.nn.softmax(scores)
result = tf.matmul(distribution, key_list)  # shape (2, 1, 4)
print('result 2:', result)
The printed values match result 1 up to floating-point rounding, which confirms our reading of the documentation. Only the dtype differs: the manual computation stays in float64 (the dtype of the NumPy inputs), while the Keras layer returned float32.
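The signature also exposes use_scale, which the example does not exercise. According to the official documentation, use_scale=True creates a scalar variable that scales the attention scores. The sketch below is plain Python, not the Keras layer, and the scale values are hypothetical. It shows how multiplying the scores by a scalar changes the softmax weights, using the scores 16, 15 and 7 that the query-key dot products give for the first batch of Example 1.

```python
import math


def softmax(scores):
    # Shift by the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Query-key dot products for the first batch of Example 1.
scores = [16.0, 15.0, 7.0]

# Hypothetical scale values; in the layer the scalar would be learned.
for scale in (1.0, 0.5, 0.1):
    weights = softmax([scale * s for s in scores])
    print('scale', scale, '->', [round(w, 4) for w in weights])
```

A scale of 1.0 leaves the distribution unchanged, while smaller scales flatten it and spread the attention more evenly across the keys.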