Paper reading (62): Pointer networks
2022-07-29 07:18:00 【Inji】
1 Introduction
1.1 Title
2015: Pointer networks
1.2 Code
GitHub: https://github.com/shirgur/PointerNet
1.3 Abstract
The paper introduces a new neural architecture that learns the conditional probability of an output sequence whose elements are discrete tokens corresponding to positions in an input sequence. Existing approaches such as sequence-to-sequence models and Neural Turing Machines cannot handle this setting directly, because the number of target classes at each output step depends on the length of the input, which is variable. Sorting variable-length sequences and various combinatorial optimization problems belong to this class.
Specifically, the variable-size output dictionary is handled with an attention mechanism. Unlike earlier attention models, which use attention to blend the encoder's hidden units into a context vector at each decoder step, this model uses attention as a pointer that selects a member of the input sequence as the output. The resulting architecture is called a Pointer Network (Ptr-Net).
To demonstrate the effectiveness of Ptr-Net, it is applied to three well-known geometric problems: finding planar convex hulls, computing Delaunay triangulations, and the planar traveling salesman problem. Ptr-Net not only improves on sequence-to-sequence with input attention, but also generalizes to variable-size output dictionaries. The results show that the learned models generalize beyond the maximum lengths they were trained on. The authors hope these results will encourage broader exploration of neural learning for discrete problems.
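The pointer idea described in the abstract can be sketched as a single decoding step. This is a minimal illustration in plain Python, not the code from the repository linked above. It uses the additive attention score from the paper, u_j = v^T tanh(W1 e_j + W2 d), and applies a softmax over the input positions, so the output distribution has exactly as many entries as the input has elements. The function name `pointer_step`, the helper names, the toy hidden size, and the random weights are all assumptions made for illustration.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pointer_step(encoder_states, decoder_state, W1, W2, v):
    """Return a distribution over input positions for one decoder step.

    u_j = v . tanh(W1 e_j + W2 d). The softmax of u is used directly as the
    output distribution instead of being used to build a context vector.
    """
    Wd = matvec(W2, decoder_state)
    scores = []
    for e in encoder_states:
        We = matvec(W1, e)
        hidden = [math.tanh(a + b) for a, b in zip(We, Wd)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    return softmax(scores)


def random_matrix(rows, cols, rng):
    """Hypothetical helper: a rows x cols matrix of uniform random weights."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
hidden_size = 4  # toy size, chosen only for illustration
W1 = random_matrix(hidden_size, hidden_size, rng)
W2 = random_matrix(hidden_size, hidden_size, rng)
v = [rng.uniform(-1, 1) for _ in range(hidden_size)]
decoder_state = [rng.uniform(-1, 1) for _ in range(hidden_size)]

# The same parameters handle inputs of different lengths.
for n in (3, 7):
    encoder_states = random_matrix(n, hidden_size, rng)
    probs = pointer_step(encoder_states, decoder_state, W1, W2, v)
    print(n, len(probs))
```

Because the softmax runs over the encoder states rather than over a fixed vocabulary, the parameters W1, W2, and v do not depend on the input length, which is what lets one model serve inputs of any size.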
1.4 Bib
@article{Vinyals:2015:pointer,
  author  = {Oriol Vinyals and Meire Fortunato and Navdeep Jaitly},
  title   = {Pointer networks},
  journal = {Advances in neural information processing systems},
  volume  = {28},
  year    = {2015}
}
2 Model
Two pieces of background are introduced first: the sequence-to-sequence model and the input-attention model. These then lead to the Ptr-Net architecture.
2.1 Sequence to sequence model
Given a training pair $(\mathcal{P},\mathcal{C^P})$, the sequence-to-sequence model uses an RNN with parameters $\theta$ to compute the conditional probability via the chain rule:
$$p(\mathcal{C^P}|\mathcal{P};\theta)=\prod_{i=1}^{m(\mathcal{P})}p(C_i|C_1,\dots,C_{i-1},\mathcal{P};\theta) \tag{1}$$
Here $\mathcal{P}=\{P_1,\dots,P_n\}$ is a sequence of $n$ input vectors and $\mathcal{C^P}=\{C_1,\dots,C_{m(\mathcal{P})}\}$ is a sequence of $m(\mathcal{P})$ indices, each between 1 and $n$. A schematic is shown in Figure 1.
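The factorization in Equation (1) can be read as a sum of per-step log-probabilities. The sketch below illustrates only that bookkeeping: `step_distribution` is a hypothetical stand-in for the RNN (a toy model that is uniform over the input indices not yet emitted), and indices are 0-based here for convenience, whereas the paper counts from 1.

```python
import math


def step_distribution(prefix, P):
    """Toy stand-in for the RNN: uniform over input indices not yet emitted.

    Hypothetical; a real model would compute p(C_i | C_1..C_{i-1}, P; theta)
    from learned parameters.
    """
    remaining = [j for j in range(len(P)) if j not in prefix]
    return {j: 1.0 / len(remaining) for j in remaining}


def sequence_log_prob(C, P):
    """log p(C | P) = sum_i log p(C_i | C_1, ..., C_{i-1}, P)."""
    log_p = 0.0
    for i, token in enumerate(C):
        dist = step_distribution(C[:i], P)  # condition on the outputs so far
        log_p += math.log(dist[token])
    return log_p


P = [(0.1, 0.9), (0.5, 0.2), (0.8, 0.7)]  # three planar points
C = [2, 0, 1]                             # an output sequence of input indices

# Under the toy model the factors are 1/3, 1/2, and 1.
print(math.exp(sequence_log_prob(C, P)))
```

Note that each step's distribution is defined over the indices of `P`, so its size changes with the input length. A model with a fixed-size output softmax cannot represent this directly, which is the limitation the abstract describes.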