Reading notes on the paper "Attentional Encoder Network for Targeted Sentiment Classification"
2022-07-28 04:27:00 【jst100】
Paper link: https://arxiv.org/pdf/1902.09314.pdf
Paper content
Earlier work on aspect-level (targeted) sentiment classification mostly combined RNNs with attention mechanisms. However, RNNs are hard to parallelize, and long sentences make it difficult for them to retain long-range information. The paper therefore proposes the Attentional Encoder Network (AEN), an attention-based encoder that models the context and the target entity. The paper also raises the problem of unreliable labels and introduces label smoothing regularization to deal with it.
Method

Embedding layer
The authors implement this layer in two ways. One uses static, pretrained GloVe word embeddings; the other uses a BERT model. With BERT, two inputs are built: "[CLS] + context + [SEP]" for the context and "[CLS] + target + [SEP]" for the target entity. The authors also build a comparison model that uses only BERT, called BERT-SPC, whose input is "[CLS] + context + [SEP] + target + [SEP]".
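The three input layouts described above can be written out as plain strings. This is only a minimal sketch: the function name `build_bert_inputs` and the example sentence are made up for illustration, and a real BERT tokenizer would normally add the special tokens itself rather than take them as text.

```python
def build_bert_inputs(context: str, target: str) -> dict:
    """Return the three input layouts as plain strings (illustrative helper)."""
    return {
        # AEN-BERT encodes context and target separately
        "aen_context": f"[CLS] {context} [SEP]",
        "aen_target": f"[CLS] {target} [SEP]",
        # BERT-SPC feeds both as one sentence pair
        "bert_spc": f"[CLS] {context} [SEP] {target} [SEP]",
    }


inputs = build_bert_inputs("the food was great but the service was slow", "service")
for name, text in inputs.items():
    print(name, "->", text)
```

The difference to notice is that AEN keeps the context and the target as two sequences so the attention layers can relate them later, while BERT-SPC leaves that to BERT's own sentence-pair encoding.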
Attention layer
Here the authors borrow the multi-head attention (MHA) mechanism and use it in two ways:
Intra-MHA: the query and the key (which also serves as the value) all come from the context, so this is self-attention inside the context.
Inter-MHA: an interactive attention mechanism in which the target entity supplies the query and the context supplies the key, so each target word gets a representation composed from the context words. In other words, it learns the interaction between the target entity and the context.
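The two uses of attention can be sketched with NumPy as below. Several things here are assumptions rather than content of the post: the score function tanh([k_i; q_j] · w_att) follows my reading of the paper, the role assignment for Inter-MHA (context as keys, target as queries) follows the paper's description that each target word is composed from context embeddings, and the helper names, dimensions and random weights are purely illustrative.

```python
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def init_mha(d_model, n_heads, d_head, rng):
    """Random parameters for one MHA block (hypothetical helper)."""
    heads = [
        (
            rng.normal(scale=0.1, size=(d_model, d_head)),  # key projection
            rng.normal(scale=0.1, size=(d_model, d_head)),  # query projection
            rng.normal(scale=0.1, size=2 * d_head),         # scoring vector w_att
        )
        for _ in range(n_heads)
    ]
    w_out = rng.normal(scale=0.1, size=(n_heads * d_head, d_model))
    return {"heads": heads, "w_out": w_out}


def mha(keys, queries, params):
    """Each query position attends over all key positions.

    keys: (n_k, d_model), queries: (n_q, d_model) -> (n_q, d_model)
    """
    outputs = []
    for w_k, w_q, w_att in params["heads"]:
        k = keys @ w_k      # (n_k, d_head)
        q = queries @ w_q   # (n_q, d_head)
        n_q, n_k = q.shape[0], k.shape[0]
        # Pair every query j with every key i as [k_i; q_j]
        k_rep = np.broadcast_to(k[None, :, :], (n_q, n_k, k.shape[1]))
        q_rep = np.broadcast_to(q[:, None, :], (n_q, n_k, q.shape[1]))
        scores = np.tanh(np.concatenate([k_rep, q_rep], axis=-1) @ w_att)  # (n_q, n_k)
        weights = softmax(scores, axis=-1)  # each row sums to 1 over key positions
        outputs.append(weights @ k)         # (n_q, d_head); keys double as values
    return np.concatenate(outputs, axis=-1) @ params["w_out"]


rng = np.random.default_rng(0)
context = rng.normal(size=(7, 16))  # 7 context tokens
target = rng.normal(size=(2, 16))   # 2 target tokens

c_intra = mha(context, context, init_mha(16, 4, 8, rng))  # Intra-MHA
t_inter = mha(context, target, init_mha(16, 4, 8, rng))   # Inter-MHA
print(c_intra.shape, t_inter.shape)  # (7, 16) (2, 16)
```

The output always has one row per query position, which is why Intra-MHA returns a context-length sequence and Inter-MHA returns a target-length one in this sketch.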
Point-wise Convolution Transformation
The output of each MHA is followed by a point-wise convolution transformation (PCT), which further processes the information gathered by the attention mechanism. "Point-wise" means a convolution with kernel size 1, so the same transformation is applied to every token position independently.
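A kernel-size-1 convolution over a sequence is the same as applying one dense layer to every position, so a PCT can be sketched as two such layers with an activation in between. The ELU activation, the hidden size and the random weights are assumptions for illustration; the post itself does not state them.

```python
import numpy as np


def elu(x):
    # ELU: identity for positive inputs, exp(x) - 1 otherwise
    return np.where(x > 0, x, np.exp(np.minimum(x, 0)) - 1)


def pct(h, w1, b1, w2, b2):
    """Two kernel-size-1 convolutions, i.e. the same two dense layers at every position.

    h: (seq_len, d_model) -> (seq_len, d_model)
    """
    return elu(h @ w1 + b1) @ w2 + b2


rng = np.random.default_rng(0)
d_model, d_hidden = 16, 32
h = rng.normal(size=(5, d_model))  # stand-in for an MHA output with 5 positions
w1, b1 = rng.normal(size=(d_model, d_hidden)), np.zeros(d_hidden)
w2, b2 = rng.normal(size=(d_hidden, d_model)), np.zeros(d_model)

out = pct(h, w1, b1, w2, b2)
print(out.shape)  # (5, 16)
```

Because no position looks at its neighbours, transforming a single row on its own gives the same result as that row of the full output.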
Target-specific attention layer
After the intra- and inter-attention outputs have each passed through a PCT, the authors apply one more MHA to obtain a target-specific context representation (to be honest, I don't quite understand the intuition behind it).
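In symbols, the step described above can be written as follows. The notation is recalled from the paper rather than taken from the post, so treat the symbol names as an assumption; MHA takes its arguments in (key, query) order, and this third MHA has its own parameters.

```latex
\begin{aligned}
h^{c}   &= \mathrm{PCT}\left(c^{\mathrm{intra}}\right) \\
h^{t}   &= \mathrm{PCT}\left(t^{\mathrm{inter}}\right) \\
h^{tsc} &= \mathrm{MHA}\left(h^{c},\, h^{t}\right)
\end{aligned}
```

One way to read it: every target position queries the already-encoded context, so h^{tsc} holds one context summary per target word.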
Output layer
For the final output, each of the three representations is average-pooled over its positions and the pooled vectors are concatenated.
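A minimal sketch of this pooling and concatenation step is below. The final linear projection and softmax over three sentiment classes are an assumption about how the concatenated vector is turned into a prediction; the names, shapes and random inputs are illustrative only.

```python
import numpy as np


def output_layer(h_c, h_tsc, h_t, w_o, b_o):
    """Average-pool each sequence over positions, concatenate, then classify."""
    pooled = np.concatenate([
        h_c.mean(axis=0),    # context representation, (d_model,)
        h_tsc.mean(axis=0),  # target-specific context representation
        h_t.mean(axis=0),    # target representation
    ])                       # (3 * d_model,)
    logits = pooled @ w_o + b_o
    e = np.exp(logits - logits.max())  # stable softmax
    return e / e.sum()


rng = np.random.default_rng(0)
d_model, n_classes = 16, 3
h_c = rng.normal(size=(7, d_model))    # 7 context positions
h_tsc = rng.normal(size=(2, d_model))  # 2 target positions
h_t = rng.normal(size=(2, d_model))
w_o = rng.normal(scale=0.1, size=(3 * d_model, n_classes))
b_o = np.zeros(n_classes)

probs = output_layer(h_c, h_tsc, h_t, w_o, b_o)
print(probs.shape)  # (3,)
```

Pooling before concatenating matters here because the three sequences can have different lengths, so they can only be joined along the feature axis once each has been reduced to a single vector.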
Label smoothing
The idea of label smoothing is to replace hard 0/1 targets with soft targets such as 0.1 and 0.9, which acts as a regularizer.
Here the authors set the smoothing prior to the reciprocal of the number of classes, i.e. a uniform distribution over the labels, and this smoothing term enters the final loss.
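The effect of smoothing with a uniform prior can be sketched as below. This uses the common formulation, soft target = (1 - eps) * one-hot + eps / C, with cross-entropy against the soft target; it is not a reproduction of the paper's exact loss, and the value eps = 0.2 and the function names are chosen only for illustration.

```python
import numpy as np


def smooth_labels(label, n_classes, eps):
    """Mix the one-hot target with a uniform prior 1 / n_classes."""
    one_hot = np.eye(n_classes)[label]
    return (1.0 - eps) * one_hot + eps / n_classes


def cross_entropy(probs, target_dist):
    """Cross-entropy between a predicted distribution and a (soft) target."""
    return float(-np.sum(target_dist * np.log(probs)))


soft = smooth_labels(label=2, n_classes=3, eps=0.2)
hard = smooth_labels(label=2, n_classes=3, eps=0.0)
print(soft)  # roughly [0.0667 0.0667 0.8667]

confident = np.array([0.001, 0.001, 0.998])  # a very confident prediction
print(cross_entropy(confident, hard), cross_entropy(confident, soft))
```

With the soft target, the very confident prediction is penalized more than with the hard target, which is the regularizing effect: the model is discouraged from pushing probabilities all the way to 0 or 1.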