Li Hongyi 2020 Deep Learning for Human Language Processing (DLHLP): Coreference Resolution, P21
2022-07-27 06:52:00 【Haulyn5】
Preface
These are notes on the Coreference Resolution lecture from Li Hongyi's 2020 course Deep Learning for Human Language Processing (DLHLP).
Course website: https://speech.ee.ntu.edu.tw/~hylee/dlhlp/2020-spring.html
Video: Li Hongyi's 2020 course Deep Learning for Human Language Processing, P21, as reposted on bilibili.
These notes are still rough and need to be organized and improved.
Text

Picture source: Li Hongyi, Deep Learning for Human Language Processing
Unlike the classification tasks introduced earlier in the course, coreference resolution is a rather special task.
Reference: Speech and Language Processing (3rd ed. draft), https://web.stanford.edu/~jurafsky/slp3/
Example of coreference resolution: Zhang San stared at Li Si, raised his fist, and said: "If it doesn't knock down the high wall today, he won't put it down." What do "he" and "it" refer to?
In this example, "Zhang San", "Li Si", and "he" are all mentions. "Zhang San" and "he" refer to the same entity, so they are said to corefer. "Li Si" is referred to only once, so it is a singleton. "Zhang San" appears first and is called the antecedent; "he" appears later and is called the anaphor.
A more detailed task description:
1. All mentions are labeled (singletons are sometimes ignored).
2. All mentions are grouped into clusters.
For example: Cluster 1: {Zhang San, he}; Cluster 2: {his fist, it, it}
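The two-step task description above can be sketched as a small data structure. The cluster ids and the list of labeled mentions below are hypothetical annotations for the example sentence, not data from the course.

```python
from collections import defaultdict

# Hypothetical annotation of the example sentence: (mention text, cluster id).
# The ids are arbitrary labels chosen for illustration.
labeled_mentions = [
    ("Zhang San", 1),
    ("Li Si", 2),
    ("his fist", 3),
    ("it", 3),
    ("he", 1),
    ("it", 3),
]

def group_into_clusters(mentions):
    """Group labeled mentions by cluster id, keeping their order of appearance."""
    clusters = defaultdict(list)
    for text, cluster_id in mentions:
        clusters[cluster_id].append(text)
    return dict(clusters)

def drop_singletons(clusters):
    """Remove clusters with a single mention (singletons are sometimes ignored)."""
    return {cid: texts for cid, texts in clusters.items() if len(texts) > 1}

clusters = group_into_clusters(labeled_mentions)
non_singleton_clusters = drop_singletons(clusters)
```

Here "Li Si" forms a one-element cluster, which is exactly what gets dropped when singletons are ignored.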
Model framework:
Step 1, mention detection: feed a span into the model and run binary classification to decide whether it is a mention. A sequence of N tokens has N(N-1)/2 candidate spans (N(N+1)/2 if single-token spans are counted), so O(N^2) either way.
Step 2, mention pair detection: feed the detected mentions into a model two at a time and classify whether each pair corefers.
End-to-end methods: a single model takes two spans as input and decides jointly whether both are mentions and whether they form a coreferent pair. Since there are O(N^2) spans, scoring every pair of spans costs O(N^4).
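The span count and the cost of scoring all span pairs can be checked with a minimal sketch. The function names and the toy token list are assumptions made for illustration; this is not code from the course.

```python
from itertools import combinations

def enumerate_spans(tokens, min_width=1):
    """Return every (start, end) span, end inclusive, with at least min_width tokens."""
    n = len(tokens)
    return [
        (start, end)
        for start in range(n)
        for end in range(start + min_width - 1, n)
    ]

def count_span_pairs(spans):
    """Number of unordered span pairs an end-to-end model would have to score."""
    return sum(1 for _ in combinations(spans, 2))

# Toy input with a hypothetical tokenization (N = 6).
tokens = ["Zhang", "San", "raised", "his", "fist", "."]
spans_all = enumerate_spans(tokens)                  # single-token spans included
spans_multi = enumerate_spans(tokens, min_width=2)   # at least two tokens
num_pairs = count_span_pairs(spans_all)
```

With N = 6 this gives 21 spans (15 if single-token spans are excluded) and 210 span pairs. With S = N(N+1)/2 spans there are S(S-1)/2 pairs, which grows roughly as N^4 / 8.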
To go from pairwise decisions to actually solving coreference resolution, a mention ranking model is needed; see https://web.stanford.edu/~jurafsky/slp3/22.pdf

Picture source: Li Hongyi, Deep Learning for Human Language Processing
Embeddings are first extracted with a model such as BERT. Features are then extracted for two spans to score whether each span is a mention and whether the two form a pair. The three scores are combined (usually by adding them; they could also be fed into another neural network, but this has not been done), and any result above a threshold is taken as a mention pair. There is no need to worry that a single span cannot see the whole context, because the embeddings BERT extracts already take the whole input into account.
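The score combination described above can be sketched as follows. All numeric scores, the threshold of 0.0, and the function names are made-up values for illustration; in a real model the scores would come from networks applied to span representations.

```python
def combined_score(mention_score_a, mention_score_b, pair_score):
    """Add the two mention scores and the pair score (the usual combination)."""
    return mention_score_a + mention_score_b + pair_score

def select_mention_pairs(mention_scores, pair_scores, threshold=0.0):
    """Keep the span pairs whose combined score exceeds the threshold."""
    selected = []
    for (a, b), pair_score in pair_scores.items():
        total = combined_score(mention_scores[a], mention_scores[b], pair_score)
        if total > threshold:
            selected.append((a, b))
    return selected

# Hypothetical scores for three candidate spans.
mention_scores = {"Zhang San": 2.0, "Li Si": 1.8, "he": 1.5}
pair_scores = {
    ("Zhang San", "Li Si"): -4.0,
    ("Zhang San", "he"): 1.2,
    ("Li Si", "he"): -3.9,
}

mention_pairs = select_mention_pairs(mention_scores, pair_scores)
```

Only ("Zhang San", "he") ends up above the threshold here, because the strongly negative pair scores outweigh the positive mention scores for the other two pairs.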
How to extract the span representation: feed the embedding of each token in the span into an attention network to get one weight per token, then take the weighted sum of the embeddings to obtain a single embedding. This embedding is concatenated with the embeddings of the span's first and last tokens, and the resulting vector built from these 3 embeddings is the span representation.
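A minimal pure-Python sketch of this span representation is given below. As a simplifying assumption, the "attention network" is replaced by a dot product with a single vector (`attention_vector`, a hypothetical stand-in for learned parameters), and the concatenation order (first, last, weighted sum) is also an arbitrary choice.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    largest = max(values)
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def span_representation(token_embeddings, start, end, attention_vector):
    """Concatenate first-token, last-token, and attention-pooled embeddings.

    token_embeddings: list of equal-length float lists, one per token.
    start, end: inclusive token indices of the span.
    attention_vector: hypothetical stand-in for the attention network.
    """
    span = token_embeddings[start:end + 1]
    # One attention logit per token in the span.
    logits = [sum(a * x for a, x in zip(attention_vector, emb)) for emb in span]
    weights = softmax(logits)
    dim = len(span[0])
    # Weighted sum of the token embeddings.
    pooled = [sum(w * emb[d] for w, emb in zip(weights, span)) for d in range(dim)]
    return span[0] + span[-1] + pooled

# Toy 2-dimensional embeddings for a 3-token span.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
representation = span_representation(embeddings, 0, 2, attention_vector=[0.0, 0.0])
```

With an all-zero attention vector every token gets the same weight, so the pooled part reduces to the mean of the token embeddings; the output has 3 times the embedding dimension.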
In practice, training is end-to-end, but at inference time mention extraction may be done first, and pair scores are computed only for spans whose mention score is high enough. Other techniques, such as limiting the mention length, also reduce the amount of computation.
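The two pruning ideas above (a maximum mention length and a mention-score cutoff before pair scoring) can be sketched like this. The span scores, `max_width`, and `min_score` are hypothetical values chosen only to illustrate the filtering.

```python
def prune_spans(span_scores, max_width, min_score):
    """Keep spans that are short enough and whose mention score is high enough.

    span_scores: dict mapping (start, end) with inclusive end to a mention score.
    Returns the surviving spans in sorted order.
    """
    kept = []
    for (start, end), score in span_scores.items():
        width = end - start + 1
        if width <= max_width and score >= min_score:
            kept.append((start, end))
    return sorted(kept)

# Hypothetical mention scores for a handful of candidate spans.
span_scores = {
    (0, 0): 2.1,
    (0, 1): -1.0,
    (1, 4): 3.0,   # high score, but wider than max_width below
    (2, 2): 0.3,
    (3, 3): 1.7,
}

candidates = prune_spans(span_scores, max_width=2, min_score=1.0)
num_pairs_after = len(candidates) * (len(candidates) - 1) // 2
num_pairs_before = len(span_scores) * (len(span_scores) - 1) // 2
```

In this toy case pruning cuts the number of span pairs to score from 10 down to 1.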
Seq2Seq: an approach used for chatbots. For example: Input: "How tall is Messi?" Output: "Officially, he is 5 feet 7 inches tall." Input: "Who has scored more goals, he or C?" At this point the machine needs to know who "he" is. The Seq2Seq solution is to take the three sentences as input and output: "Who has scored more goals, Messi or C?" This is a new way of approaching the problem.
Advanced topics:
Global information: the machine may pair "Mr. Lee" with "Lee" and "Lee" with "she", yet background knowledge tells us that "Mr. Lee" and "she" cannot be in the same cluster.
Unsupervised: mask a pronoun in the sentence and let BERT guess which word fills the masked position. This raises many problems; for example, the masked "he" is a single token, so what happens when the resolved antecedent spans 2 tokens?
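The global-information problem under "Advanced topics" can be made concrete with a small sketch. It assumes clusters are built by merging pairwise links transitively with a union-find structure, which is one simple way to do it and not necessarily what the lecture uses; the mentions and links are the ones from the "Mr. Lee" example.

```python
def cluster_from_pairs(mentions, linked_pairs):
    """Merge mentions into clusters by following pairwise links transitively."""
    parent = {m: m for m in mentions}

    def find(m):
        # Walk up to the root, shortening the path along the way.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in linked_pairs:
        parent[find(a)] = find(b)

    groups = {}
    for m in mentions:
        groups.setdefault(find(m), []).append(m)
    return list(groups.values())

mentions = ["Mr. Lee", "Lee", "she"]
# Each link looks plausible locally when judged on its own.
linked_pairs = [("Mr. Lee", "Lee"), ("Lee", "she")]

merged_clusters = cluster_from_pairs(mentions, linked_pairs)
```

Both links look reasonable in isolation, but merging them puts "Mr. Lee" and "she" into one cluster, which is the inconsistency that purely pairwise decisions cannot see.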