30. Few-shot Named Entity Recognition with Self-describing Networks: reading notes
2022-07-07 12:07:00 【Smoked Luoting purple Pavilion】
Author Information:
Institutions Information:
1. Chinese Information Processing Laboratory
2. State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, China
3.University of Chinese Academy of Sciences, Beijing, China
4. Beijing Academy of Artificial Intelligence, Beijing, China
ACL 2022
Model abbreviation: SDNet
Catalog
3. Self-describing Networks for FS-NER
3.2 Entity Recognition via Entity Generation
3.3 Type Description Construction via Mention Describing
Pre-training via Mention Describing and Entity Generation
4.2 Entity Recognition Fine-tuning
Abstract
Few-shot named entity recognition (few-shot NER) needs to capture information accurately from limited examples and to transfer useful knowledge from external resources. In this paper, we propose a self-describing mechanism for few-shot NER, which can exploit illustrative examples effectively and transfer knowledge from external resources accurately by describing both entity types and entity mentions with a universal set of concepts. Specifically, we design Self-describing Networks (SDNet), a Seq2Seq generation model that can universally describe mentions with concepts, automatically map novel entity types onto concepts, and adaptively recognize entities on demand.
We pre-train SDNet on a large-scale corpus and conduct experiments on 8 benchmark datasets from different domains. The results show that SDNet achieves competitive performance on all 8 benchmarks and state-of-the-art (SOTA) performance on 6 of them, which demonstrates the effectiveness and robustness of the method.
Note: a mention is usually defined as a reference to an entity in natural-language text; the entity can be a named entity, a nominal entity, or a pronominal entity.
1. Introduction
Few-shot NER (FS-NER) aims to identify the entity mentions corresponding to novel entity types from only a few illustrative examples. FS-NER is a promising technique for open-domain NER, which involves various unforeseen types and very limited examples, and it has therefore attracted wide attention in recent years.
The main challenge of FS-NER is how to accurately model the semantics of unseen entity types from only a few examples. To achieve this, FS-NER needs to effectively capture information from a handful of samples and, at the same time, exploit and transfer useful knowledge from external resources.
Challenge 1: the limited information challenge
The information contained in the illustrative examples is very limited.
Challenge 2: the knowledge mismatch challenge
External knowledge usually does not directly match the new task, since it may contain irrelevant, heterogeneous, or even conflicting knowledge.
To this end, this paper proposes a self-describing mechanism for FS-NER. The key idea is that all entity types can be described with the same set of concepts, and the mapping between types and concepts can be modeled and learned universally. In this way, different entity types can be described uniformly with the same concept set, which resolves the knowledge mismatch problem.
For example, in Figure 1, different entity types are matched to the same set of concepts (e.g., park, garden, country, ...), so knowledge from different sources can be universally described and transferred. Moreover, because the concept mapping is universal, the few illustrative examples are only used to construct the mapping between novel types and concepts, which effectively alleviates the limited information problem.
Based on the above idea, we propose the Self-describing Network (SDNet), a Seq2Seq generation network that can universally describe mentions with concepts, automatically map novel entity types to concepts, and adaptively recognize entities on demand. Specifically: 1) to capture the semantics of a mention, SDNet generates a set of universal concepts as its description; 2) to map entity types to concepts, SDNet generates and fuses the concept descriptions of mentions sharing the same entity type; 3) to recognize entities, SDNet directly generates all entities in a sentence via a concept-enriched prefix prompt, which contains the target entity types and their concept descriptions.
Because the concept set is universal, we pre-train SDNet on large-scale, easily accessible web resources. Specifically, we collect a pre-training dataset containing 56M sentences and more than 31K concepts.
By projecting mentions and entity types into a universal concept space, SDNet can effectively enrich entity types to address the limited information problem, uniformly describe knowledge in different schemas to address the knowledge mismatch problem, and conduct unified pre-training effectively. Moreover, the above tasks are modeled in a single generation model via the prefix prompt mechanism, which distinguishes the different tasks and makes the model controllable, generalizable, and continually trainable.
We conduct experiments on 8 FS-NER benchmarks from different domains. The experiments show that SDNet achieves very strong performance and reaches SOTA performance on 6 datasets.
The main contribution of this paper is :
- We propose a self-describing mechanism for FS-NER, which describes entity types and mentions with a universal concept set and can effectively address the limited information and knowledge mismatch challenges.
- We propose the Self-describing Network (SDNet), a Seq2Seq generation network that can universally describe mentions with concepts, automatically map novel entity types to concepts, and adaptively recognize entities on demand.
- We pre-train SDNet on large-scale open datasets, which provides universal knowledge for FS-NER and can benefit much future NER research.
2. Related Work
3. Self-describing Networks for FS-NER
In this section, we describe how to build a few-shot entity recognizer with the self-describing network and recognize entities. Figure 3(b) shows the whole process, which consists of two parts:
1) Mention describing
Generate the concept descriptions of mentions.
2) Entity generation
Adaptively generate the entity mentions corresponding to the desired novel types.
With SDNet, NER can be performed directly by feeding the type descriptions during entity generation. Given a novel type, its type description is built by describing the mentions in its illustrative examples. In the following, we first introduce SDNet, then describe how to construct type descriptions and build the few-shot entity recognizer.
3.1 Self-describing Networks
SDNet performs two main generation tasks: mention describing and entity generation. Mention describing generates the concept descriptions of mentions, while entity generation adaptively generates entity mentions. To control the two processes, SDNet uses different prompts P to generate different outputs Y, as shown in Figure 2:
For mention describing, the prompt contains the task descriptor [MD] and the target entity mentions. For entity generation, the prompt contains the task descriptor [EG] and a series of novel entity types with their corresponding descriptions.
Input: a prompt P and a sentence S, concatenated as X = P ⊕ S.
Output: SDNet generates a sequence Y, which contains either the mention describing or the entity generation results.
It can be seen that few-shot entity recognition can be performed effectively with the above two generation processes. For entity recognition, we put the descriptions of the target entity types into the prompt, and then generate the entities adaptively via the entity generation process. To build an entity recognizer for a novel type, we only need its type description, which can be built effectively by summarizing the concept descriptions of its illustrative examples.
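As a toy sketch, the two prompt formats described above can be assembled as plain strings. The [MD] and [EG] task descriptors follow the paper; the separator token, helper names, and exact string layout are assumptions for illustration, not the paper's implementation:

```python
# Illustrative sketch of the two SDNet prompt formats; the [SEP] separator and
# helper names are assumptions, not the paper's exact format.

def mention_describing_prompt(mentions, sentence):
    """[MD] prompt: task descriptor plus the target entity mentions,
    concatenated with the sentence."""
    return f"[MD] {'; '.join(mentions)} [SEP] {sentence}"

def entity_generation_prompt(type_descriptions, sentence):
    """[EG] prompt: task descriptor plus each target type with its
    concept description, concatenated with the sentence."""
    desc = "; ".join(f"{t}: {{{', '.join(cs)}}}"
                     for t, cs in type_descriptions.items())
    return f"[EG] {desc} [SEP] {sentence}"
```

A Seq2Seq model such as T5 would then take this string as the input X and decode the output sequence Y.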
3.2 Entity Recognition via Entity Generation
In SDNet, entity recognition is performed via entity generation, given the entity generation prompt P_EG and a sentence X.
An example: "Harry Potter is written by J.K. Rowling."
1) Recognizing entities of the PERSON type
Input prompt: {[EG] person: {actor, writer}}
SDNet generates: "J.K. Rowling is person"
2) Recognizing entities of the CREATIVE_WORK type
Input prompt: {[EG] creative_work: {book, music}}
Result: "Harry Potter is creative_work"
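Since the generated output in the examples above follows the pattern "mention is type", recovering (mention, type) pairs is a simple string-parsing step. This is a minimal sketch that assumes a semicolon-separated output format (the function name is illustrative, not from the paper):

```python
def parse_entity_generation(output):
    """Parse generated text of the form 'mention is type; mention is type'
    into (mention, type) pairs. The output format is assumed from the
    paper's examples."""
    pairs = []
    for segment in output.split(";"):
        segment = segment.strip()
        if not segment:
            continue
        # Split on the LAST ' is ' so mentions containing 'is' still parse.
        mention, sep, etype = segment.rpartition(" is ")
        if sep:
            pairs.append((mention.strip(), etype.strip()))
    return pairs
```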
3.3 Type Description Construction via Mention Describing
To build the type description of a novel type from several illustrative examples, SDNet first obtains the concept description of each mention in the illustrative examples via mention describing, and then builds the type description of each type by summarizing all the concept descriptions.
Mention Describing
Input: the mention describing prompt P_MD (the task descriptor [MD] plus the mentions e_1, e_2, ...) concatenated with the sentence X.
Output: "e_1 is c_11, ..., c_1k; e_2 is c_21, ..., c_2m; ...", where c_ij denotes the j-th concept of the i-th entity mention.
Type Description Construction
SDNet then summarizes the generated concepts to describe the precise semantics of the specific novel type. Concretely, the concept descriptions of all mentions with the same type t are merged into one concept set C, which serves as the description of type t, yielding the type descriptions M = {(t, C)}. The constructed type descriptions are then put into the entity generation prompt P_EG to guide entity generation.
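The summarization step can be sketched as a simple merge: the concept descriptions of all example mentions sharing a type t are unioned into one set C. The helper name and the (type, concepts) input format are assumptions for illustration:

```python
from collections import defaultdict

def build_type_descriptions(example_descriptions):
    """Merge the generated concept descriptions of all illustrative
    mentions sharing the same type t into one concept set C,
    yielding M = {(t, C)}. Input format is assumed:
    a list of (type, [concept, ...]) pairs."""
    merged = defaultdict(set)
    for etype, concepts in example_descriptions:
        merged[etype].update(concepts)
    # Sort for a deterministic, readable description.
    return {t: sorted(c) for t, c in merged.items()}
```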
Filtering Strategy
Because downstream novel types are diverse, SDNet may not have enough knowledge to describe them, and forcing SDNet to describe them may lead to inaccurate descriptions. To address this, we propose a filtering strategy that enables SDNet to refuse to generate unreliable descriptions.
Concretely, for uncertain instances SDNet is trained to generate "other". Given a novel type and some illustrative examples, we compute the frequency of "other" in the concept descriptions of these examples. If the frequency of generating "other" on the illustrative examples is greater than 0.5, we discard the generated type description and directly use the type name as the description. Section 4.1 describes how SDNet learns the filtering strategy.
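The filtering rule above can be sketched as follows. The helper name, its signature, and the rendered description string are illustrative assumptions; the 0.5 threshold and the fallback to the type name follow the paper:

```python
def filter_type_description(type_name, per_example_descriptions, threshold=0.5):
    """Apply the filtering strategy: if 'other' appears in more than half of
    the per-example concept descriptions, drop the generated description and
    fall back to the type name itself."""
    if not per_example_descriptions:
        return type_name
    other_freq = (sum("other" in d for d in per_example_descriptions)
                  / len(per_example_descriptions))
    if other_freq > threshold:
        return type_name  # description judged unreliable
    merged = sorted({c for d in per_example_descriptions
                     for c in d if c != "other"})
    return f"{type_name}: {{{', '.join(merged)}}}"
```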
4. Learning
4.1 SDNet Pre-training
This paper uses Wikipedia and Wikidata as external knowledge sources (the 2021-04-01 dump of Wikipedia).
Entity Mention Collection
To pre-train SDNet, we collect triples <e, T, X>, where e is an entity mention, T is its set of entity types, and X is a sentence.
e.g., <J.K. Rowling; person, writer, ...; J.K. Rowling writes ...>
In total, 31K types are obtained.
Type Description Building
For training SDNet, We need concept description , among
,
It's the type
Related concepts of . This article uses the entity types collected above as concepts , And build the following type description . Given an entity type , We collect all its concurrent entity types as its description concepts . such , For each entity type , We all have a set of concepts to describe . Because some entity types have a very large set of description concepts , In efficiency pre training, we randomly sample no more than N( this paper N take 10 individual ) The concept of .
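The co-occurrence collection and N-concept sampling described above can be sketched like this. The record format (mention, types, sentence) and the helper name are assumptions; the cap of N = 10 follows the paper:

```python
import random
from collections import defaultdict

def collect_description_concepts(mention_records, n=10, seed=0):
    """For each entity type, collect the types that co-occur with it on the
    same mention as its description concepts, keeping at most n of them
    by random sampling (the paper uses N = 10). Record format is assumed:
    (mention, [type1, type2, ...], sentence)."""
    cooccur = defaultdict(set)
    for _mention, types, _sentence in mention_records:
        for t in types:
            cooccur[t].update(x for x in types if x != t)
    rng = random.Random(seed)  # seeded for reproducibility of the sketch
    return {t: sorted(rng.sample(sorted(c), min(n, len(c))))
            for t, c in cooccur.items()}
```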
Pre-training via Mention Describing and Entity Generation
Given a sentence X and its mention-type tuples {(e_i, T_i)}, where T_i is the set of types of the i-th entity mention e_i and t_ij is the j-th type in T_i, we construct the type descriptions and convert these triples into pre-training examples.
4.2 Entity Recognition Fine-tuning
As mentioned above, SDNet can directly recognize entities with manually designed type descriptions. However, SDNet can also build type descriptions automatically from illustrative examples and improve further through fine-tuning. Concretely, given <e, T, X>, we first construct the type descriptions, then construct the entity generation prompt, generate the target sequence, and fine-tune SDNet by optimizing Equation (2).
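Putting the pieces together, one fine-tuning example is a (source, target) string pair: the source is the [EG] prompt with type descriptions plus the sentence, and the target is the "mention is type" sequence to be generated. The string formats and helper name are illustrative assumptions consistent with the examples in Section 3.2:

```python
def make_finetuning_example(entity, etype, type_descriptions, sentence):
    """Build one (source, target) pair for entity-generation fine-tuning.
    type_descriptions maps each target type to its concept list; string
    formats ([EG], [SEP], '{...}') are assumptions for illustration."""
    desc = "; ".join(f"{t}: {{{', '.join(cs)}}}"
                     for t, cs in type_descriptions.items())
    source = f"[EG] {desc} [SEP] {sentence}"
    target = f"{entity} is {etype}"
    return source, target
```

Such pairs would then be fed to a standard Seq2Seq training loop (e.g., teacher forcing with cross-entropy loss).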
5. Experiment
5.1 Settings
Datasets
Baselines
1) BERT-base
2)T5-base
3) T5-base-prompt: a prompt-based version of T5-base, using the entity type as the prompt
4)T5-base-DS
5) RoBERTa-based
6) Proto: a prototypical network based on RoBERTa, and its distantly supervised pre-trained version Proto-DS
7) SpanNER: an MRC model that requires a hand-designed description for each label, and its distantly supervised pre-trained version SpanNER-DS
5.2 Main Results
Conclusion :
1) By universally modeling and pre-training NER knowledge, the self-describing network can effectively handle few-shot NER.
2) Due to the limited information, transferring external knowledge into FS-NER models is crucial.
3) Due to knowledge mismatch, effectively transferring external knowledge to novel downstream types is challenging.
5.3 Effects of Shot Size
Conclusion :
1) SDNet achieves better performance under all shot-size settings. Moreover, the improvements are more significant in low-shot settings, which verifies the intuition behind SDNet.
2) Generation-based models usually outperform the classifier-based BERT model. We believe this is because generation models can capture type semantics more effectively by exploiting the label words, and therefore achieve better performance, especially in low-shot settings.
3) Except on Res, SDNet significantly outperforms T5 on almost all datasets, which demonstrates the effectiveness of the proposed self-describing mechanism.
5.4 Ablation Study
Conclusion :
1) Type descriptions are essential for SDNet to transfer knowledge and capture type semantics.
2) Jointly learning mention describing and entity generation in one network is effective for capturing type semantics.
3) The filtering strategy can effectively mitigate the transfer of mismatched knowledge.