[5-Minute Paper] Pointer Network
2022-07-26 15:21:00 【Little Mr He】
What problem does it solve?
The paper proposes a network architecture that learns the positional relationships within an input sequence.
Background
Learning the positional relationships of an input sequence can be framed as a seq2seq problem in which the length of the output sequence matches the length of the input sequence, and that length is variable. The approach can be applied to variable-length sorting and to combinatorial optimization problems.
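To make the framing concrete, here is a minimal sketch of what the targets look like for the sorting case: each output token is a position in the input, so the output vocabulary grows with the input length. The helper name `pointer_targets_for_sorting` and the sample values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def pointer_targets_for_sorting(values):
    """Return the input positions in ascending order of value.

    Each target is an index into the input sequence, so the number of
    possible output symbols equals the input length.
    """
    return np.argsort(values, kind="stable")

# Two inputs of different length (hypothetical data).
short = np.array([0.7, 0.1, 0.4])
long = np.array([0.9, 0.2, 0.5, 0.3, 0.8])

print(pointer_targets_for_sorting(short))  # [1 2 0]
print(pointer_targets_for_sorting(long))   # [1 3 2 4 0]
```

A fixed-size softmax output layer cannot cover both cases with one model, which is the gap the pointer mechanism is meant to close.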
The paper was published in 2015 (NIPS). Before it, recurrent-network approaches relied on an output dictionary whose size is fixed a priori, so they could not handle problems in which the size of the output dictionary depends on the length of the input sequence.
What method is used?
Earlier attention mechanisms used attention to blend the encoder's hidden states into a context vector. The authors' Ptr-Net instead uses attention to select an element of the input.

The core of the network in this paper is the decoder. There are two main points:
- Unlike earlier `seq2seq` models, the `decoder` input at each step is the input information of the node selected at the previous step: the raw coordinate information, embedded in the same way as the input fed to the `encoder`.
- When computing the `decoder` output, `attention` is computed as follows. The inputs are a `context` of shape `[bs, hidden_dim, seq_len]` and an `input` of shape `[bs, hidden_dim]`. Here `input` is the output obtained by passing the `decoder` input and hidden state through the `LSTM`.
`context` passes through a one-dimensional convolution, which is equivalent to multiplying by a weight matrix $W_{1}$, and yields a matrix with the same shape, `[bs, hidden_dim, seq_len]`. `input` passes through a weight matrix $W_{2}$ and is expanded to `[bs, hidden_dim, seq_len]`. The two matrices are added, passed through tanh, and multiplied by a learnable variable $V$ of shape `[bs, 1, hidden_dim]`, giving an attention score matrix of shape `[bs, seq_len]`. This corresponds to the formula in the paper:
$$u_{j}^{i} = v^{T} \tanh(W_{1} e_{j} + W_{2} d_{i}), \quad j \in (1, \cdots, n)$$
A softmax layer then turns the scores into the attention matrix:
$$a_{j}^{i} = \mathrm{softmax}(u_{j}^{i}), \quad j \in (1, \cdots, n)$$
This `[bs, seq_len]` attention matrix is multiplied with the `[bs, hidden_dim, seq_len]` `context` matrix to give a hidden-layer output of shape `[bs, hidden_dim]`, which serves as the LSTM hidden state for the next step:
$$d_{i}^{\prime} = \sum_{j=1}^{n} a_{j}^{i} e_{j}$$
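The three formulas can be traced with shapes in a short NumPy sketch. This is only an illustration under stated assumptions: the function name `pointer_attention`, the random weights, and the use of a single shared vector `v` of shape `[hidden_dim]` (rather than a tensor expanded over the batch) are choices made here, not the author's implementation. The greedy `argmax` at the end is one possible way to pick the pointed-to input position.

```python
import numpy as np

def pointer_attention(context, dec_out, W1, W2, v):
    """Additive attention used as a pointer (illustrative sketch).

    context: [bs, hidden_dim, seq_len]  encoder states e_j
    dec_out: [bs, hidden_dim]           decoder LSTM output d_i
    W1, W2:  [hidden_dim, hidden_dim]   learnable projections
    v:       [hidden_dim]               learnable scoring vector
    """
    # W1 e_j at every position j; same effect as a kernel-size-1 convolution.
    ctx_proj = np.einsum("hk,bks->bhs", W1, context)   # [bs, hidden_dim, seq_len]
    # W2 d_i, with a trailing axis so it broadcasts across seq_len.
    dec_proj = (dec_out @ W2.T)[:, :, None]            # [bs, hidden_dim, 1]
    # u_j^i = v^T tanh(W1 e_j + W2 d_i)
    u = np.einsum("h,bhs->bs", v, np.tanh(ctx_proj + dec_proj))  # [bs, seq_len]
    # a_j^i = softmax(u_j^i), computed in a numerically stable way.
    u = u - u.max(axis=1, keepdims=True)
    a = np.exp(u)
    a = a / a.sum(axis=1, keepdims=True)               # [bs, seq_len]
    # d'_i = sum_j a_j^i e_j
    d_new = np.einsum("bs,bhs->bh", a, context)        # [bs, hidden_dim]
    return a, d_new

# Hypothetical sizes and random parameters, for shape checking only.
rng = np.random.default_rng(0)
bs, hidden_dim, seq_len = 2, 8, 5
context = rng.normal(size=(bs, hidden_dim, seq_len))
dec_out = rng.normal(size=(bs, hidden_dim))
W1 = rng.normal(size=(hidden_dim, hidden_dim))
W2 = rng.normal(size=(hidden_dim, hidden_dim))
v = rng.normal(size=hidden_dim)

a, d_new = pointer_attention(context, dec_out, W1, W2, v)
pointer = a.argmax(axis=1)  # greedy choice of the input position to point at
print(a.shape, d_new.shape, pointer.shape)  # (2, 5) (2, 8) (2,)
```

Each row of `a` is a probability distribution over the `seq_len` input positions, so the same parameters work for any input length.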
What are the results?
Publication information? Author information?
Reference links