Weeks 19-22 of Scientific Research Training: On TNet and MemNet
2022-06-26 01:44:00 【Republic cake】
tags:
Paper reading
NLP
1. Paper reading
《Transformation Networks for Target-Oriented Sentiment Classification》
- [code]
《Aspect Level Sentiment Classification with Deep Memory Network》
- [paper]
- [code]
1.1 Introduction
1.1.1 TNet
Proposes Transformation Networks (TNet) to address the defects of the attention mechanism and of CNNs themselves in the ABSC task (see Section 1.2 for the specific defects).
Main contributions:
- A CNN-based TNet is adopted to solve the ABSC problem, achieving state-of-the-art performance on the benchmark datasets.
- A novel target-specific transformation component is proposed to better integrate target information into word representations.
- A context-preserving mechanism is designed that forwards context information into the deep transformation architecture, so the model can learn more abstract contextualized word features from deeper layers.
1.1.2 MemNet
Proposes the MemNet model, which uses a multi-layer attention mechanism to memorize contextual text information; it runs much faster than an LSTM (on CPU).
Main contributions:
- Context information is captured purely with attention, avoiding the time cost of LSTMs.
- Both efficiency and accuracy are greatly improved.
1.2 Motivation
1.2.1 TNet
- Defects of the attention mechanism itself:
  - When computing the attention scores for a target, noise is introduced.
  - Example: in "This dish is my favorite and I always get it and never get tired of it.", computing attention for the target "dish" pulls in irrelevant words such as "never" and "tired".
- Defects of CNNs themselves in text processing:
  - They cannot fully exploit context information the way RNNs do.
  - They have difficulty handling sentences with multiple target words (even when the context is captured, it is hard to make the right trade-off for each target).
  - Example: in the sentence "great food but the service was dreadful", for the target word "food" a CNN has difficulty choosing reasonably between "great" and "dreadful".
1.2.2 MemNet
LSTMs are time-consuming and poorly parallelizable, and they cannot accurately capture the context relevant to a given aspect.
1.3 Model
1.3.1 TNet

Following the model figure, the architecture is explained from the bottom up:
$x_i$ is the original input. The red module is a bidirectional LSTM (previous studies have shown that context-dependent word representations are effective inputs to convolutional architectures), and its output $h_i$ is the word representation produced by this layer: $h_i = [LSTM_{l \rightarrow r}(x_i); LSTM_{r \rightarrow l}(x_i)], \; i \in [1, n]$
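Below is a minimal sketch of this bottom Bi-LSTM layer (my own illustrative PyTorch code, not the authors' release; the dimensions are placeholders):

```python
import torch
import torch.nn as nn

class BiLSTMEncoder(nn.Module):
    """Bottom Bi-LSTM layer: h_i = [LSTM_l->r(x_i); LSTM_r->l(x_i)]."""
    def __init__(self, emb_dim=300, hidden_dim=50):
        super().__init__()
        # bidirectional=True concatenates the forward and backward states
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                            bidirectional=True)

    def forward(self, x):          # x: (batch, n, emb_dim) word embeddings
        h, _ = self.lstm(x)        # h: (batch, n, 2 * hidden_dim)
        return h

# usage
enc = BiLSTMEncoder()
x = torch.randn(2, 10, 300)        # 2 sentences, 10 words each
h = enc(x)                         # (2, 10, 100)
```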
The grey module is the CPT (Context-Preserving Transformation) layer. Its main role is to introduce the target into the word representations; instead of the traditional attention weighting, the authors use their own TST module. Note that the CPT layers are stacked (multi-layer).
The TST component, from bottom to top (a minimal code sketch follows these steps):
- A bidirectional LSTM produces the representation of the target words, $h^{\tau}$.
- Next, $h^{\tau}$ is dynamically associated with each word $w_i$ in the sentence, producing a tailored target representation $r_i^{\tau}$ at each time step.
- A fully connected layer then combines $h_i$ with $r_i^{\tau}$ to obtain the target-specific representation of the $i$-th word.
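A minimal sketch of the TST step as described above (my own illustrative code; the dot-product attention scoring here is a simplification and may differ from the paper's exact formulation):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TST(nn.Module):
    """Target-Specific Transformation: tailor the target to each word."""
    def __init__(self, dim):
        super().__init__()
        self.fc = nn.Linear(2 * dim, dim)

    def forward(self, h, h_tau):
        # h:     (batch, n, dim) context word representations
        # h_tau: (batch, m, dim) target word representations
        scores = torch.bmm(h, h_tau.transpose(1, 2))   # (batch, n, m)
        attn = F.softmax(scores, dim=-1)
        r_tau = torch.bmm(attn, h_tau)                 # tailored target r_i^tau
        # fully connected layer fusing h_i with its tailored target
        return torch.tanh(self.fc(torch.cat([h, r_tau], dim=-1)))
```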
LF/AS: after TST, the context information obtained from the Bi-LSTM is lost. To make use of contextual information, two strategies are proposed, LF and AS (see the sketch after this list):
LF: Lossless Forwarding. The representation produced by the initial (red) Bi-LSTM and the representation produced by TST are added together directly.
AS: Adaptive Scaling. Parameters $W$ and $b$ are introduced so that the model learns how much of the initial Bi-LSTM representation and how much of the TST representation to keep; this amounts to a gating mechanism.
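A hedged sketch of the two context-preserving strategies (illustrative code under my own naming; `h` is the layer's Bi-LSTM-side input and `h_t` the TST output):

```python
import torch
import torch.nn as nn

def lossless_forwarding(h, h_t):
    # LF: add the preserved context representation back in directly
    return h + h_t

class AdaptiveScaling(nn.Module):
    # AS: a learned gate decides how much of h vs. h_t to keep
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(dim, dim)   # the W and b mentioned above

    def forward(self, h, h_t):
        t = torch.sigmoid(self.gate(h))
        return t * h_t + (1.0 - t) * h
```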

- CNN with positional encoding: when multiple opinion words appear around a target, the one closest to the target is assumed to contribute the most.
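A hedged sketch of this proximity weighting (the exact formula and the cutoff constant `c` are my assumptions, not taken from the paper): each word's representation is scaled by a weight that decays with its distance to the target span before the convolution layer.

```python
import torch

def position_weights(n, target_start, target_len, c=40.0):
    """Weight each position by 1 - distance_to_target / c, floored at 0."""
    w = torch.zeros(n)
    for i in range(n):
        if i < target_start:
            dist = target_start - i
        elif i >= target_start + target_len:
            dist = i - (target_start + target_len - 1)
        else:
            dist = 0                      # inside the target span
        w[i] = max(0.0, 1.0 - dist / c)
    return w

# applied per word before the CNN, e.g.:
# scaled = h * position_weights(n, ts, tl).unsqueeze(-1)
```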
1.3.2 MemNet

MemNet has better parallelism than LSTM-based models. It is essentially a stack of attention layers (hops): multiple attention layers over the context memory capture the relation between the aspect and its context, replacing sequential models such as LSTMs.
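A minimal sketch of this multi-hop attention, assuming the context word embeddings serve as the memory and the averaged aspect embedding as the initial query (illustrative code, not the released implementation):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MemNet(nn.Module):
    def __init__(self, emb_dim=300, hops=3, num_classes=3):
        super().__init__()
        self.hops = hops
        self.attn = nn.Linear(2 * emb_dim, 1)      # scores [memory_i ; query]
        self.linear = nn.Linear(emb_dim, emb_dim)  # linear transform of the query
        self.out = nn.Linear(emb_dim, num_classes)

    def forward(self, memory, aspect):
        # memory: (batch, n, emb_dim) context word embeddings
        # aspect: (batch, emb_dim)    averaged aspect embedding
        q = aspect
        for _ in range(self.hops):
            q_exp = q.unsqueeze(1).expand_as(memory)
            scores = self.attn(torch.cat([memory, q_exp], dim=-1))  # (batch, n, 1)
            alpha = F.softmax(scores, dim=1)
            attended = (alpha * memory).sum(dim=1)                   # (batch, emb_dim)
            q = attended + self.linear(q)   # attended content + transformed query
        return self.out(q)                  # sentiment logits
```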
1.4 Experiments
For TNet, only the LF variant has been tried so far; the AS variant still has debugging problems, and its accuracy has not yet reached a reasonable level.
1.5 Results
Because I could not find the XML data for the Twitter dataset, only results on the Laptop and Restaurant datasets are given:
2. The story of coding
bugs lovely
3. Reference material
twitter :
Long live the return to work ~~!!!
School is starting, time to prepare for the clash of the immortals ~ Go go go!