Relation extraction: CASREL
2022-06-26 08:27:00 【xuanningmeng】
Relation extraction is a basic task in natural language processing. Its output is usually expressed as triples (subject, relation, object). There are two ways to approach relation extraction:
(1) Given two known entities, the subject and the object, use a classification model to predict the relation between them.
(2) Extract the entities, then predict the possible relations between them. Extracting entities first and predicting relations afterwards is called pipeline extraction; extracting entities and the relations between them at the same time is called joint extraction.
Relation extraction datasets are complex because the entities and relations in them overlap. In an ideal dataset each subject-object pair corresponds to exactly one relation; in real datasets a subject-object pair can correspond to several relations, and entities are shared between triples, as the figure below shows:
Here EPO (Entity Pair Overlap) means that the same entity pair appears in more than one triple, and SEO (Single Entity Overlap) means that triples share a single entity.
When relational triples (subject, relation, object) overlap, relation classification models struggle. Without enough training examples, a classifier has difficulty telling which relations an entity participates in, so the extracted triples are often incomplete and inaccurate. The CASREL model, however, handles overlapping relational triples effectively.
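The overlap categories can be made concrete with a small sketch. The function name `overlap_types` and the sample triples are illustrative assumptions, not part of CASREL; the sketch also simplifies by giving each triple a single label (EPO takes priority over SEO) and by assuming the triples of one sentence are unique.

```python
from collections import Counter


def overlap_types(triples):
    """Label each (subject, relation, object) triple as Normal, EPO, or SEO."""
    # Count how often each unordered entity pair occurs in the sentence.
    pair_counts = Counter(frozenset((s, o)) for s, _, o in triples)
    # Count in how many triples each single entity occurs.
    entity_counts = Counter()
    for s, _, o in triples:
        entity_counts.update({s, o})

    labels = []
    for s, _, o in triples:
        if pair_counts[frozenset((s, o))] > 1:
            labels.append("EPO")   # same entity pair, several relations
        elif entity_counts[s] > 1 or entity_counts[o] > 1:
            labels.append("SEO")   # one entity shared with another triple
        else:
            labels.append("Normal")
    return labels


# Hypothetical triples, for illustration only.
sample = [
    ("Film X", "director", "Person A"),
    ("Film X", "starring", "Person A"),
    ("Person A", "born_in", "City B"),
]
print(overlap_types(sample))
```

The first two triples share the whole entity pair, so they are labelled EPO; the third shares only `Person A` with the others, so it is labelled SEO.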
CASREL Model
The novel cascade binary tagging framework (CASREL) was proposed in the paper "A Novel Cascade Binary Tagging Framework for Relational Triple Extraction", where it set a new state of the art (SOTA). CASREL works in two steps:
(1) Identify all possible subjects from the output of a pre-trained BERT model.
(2) For each subject, apply relation-specific taggers to identify all possible relations and the corresponding objects.
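The two steps can be sketched as a cascade in plain Python. This is a minimal illustration, not the authors' implementation: the tagger callables, the 0.5 threshold, and the toy taggers with fixed outputs are all assumptions. Span decoding pairs each predicted start with the nearest predicted end at or after it, which is the matching rule the paper describes.

```python
def decode_spans(start_probs, end_probs, threshold=0.5):
    """Pair every predicted start with the nearest predicted end at or after it."""
    starts = [i for i, p in enumerate(start_probs) if p > threshold]
    ends = [i for i, p in enumerate(end_probs) if p > threshold]
    spans = []
    for start in starts:
        following = [end for end in ends if end >= start]
        if following:  # a start with no later end is dropped
            spans.append((start, following[0]))
    return spans


def extract_triples(tokens, subject_tagger, object_taggers, threshold=0.5):
    """Step 1: tag subjects. Step 2: for each subject, tag objects per relation."""
    triples = []
    sub_start, sub_end = subject_tagger(tokens)
    for s_start, s_end in decode_spans(sub_start, sub_end, threshold):
        subject = " ".join(tokens[s_start:s_end + 1])
        for relation, tagger in object_taggers.items():
            obj_start, obj_end = tagger(tokens, (s_start, s_end))
            for o_start, o_end in decode_spans(obj_start, obj_end, threshold):
                obj = " ".join(tokens[o_start:o_end + 1])
                triples.append((subject, relation, obj))
    return triples


# Toy stand-ins for the trained taggers (hypothetical, fixed outputs).
tokens = ["The", "King", "of", "Comedy", "stars", "Stephen", "Chow"]


def toy_subject_tagger(tokens):
    return [0.9, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0.9, 0, 0, 0]


def toy_starring_tagger(tokens, subject_span):
    return [0, 0, 0, 0, 0, 0.8, 0], [0, 0, 0, 0, 0, 0, 0.8]


def toy_null_tagger(tokens, subject_span):
    zeros = [0.0] * len(tokens)
    return zeros, zeros  # no object: this relation does not apply


taggers = {"starring": toy_starring_tagger, "director": toy_null_tagger}
print(extract_triples(tokens, toy_subject_tagger, taggers))
```

A relation whose tagger marks no start or end position simply contributes no triple, which is how the cascade rules out relations that a subject does not take part in.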
The goal of relational triple extraction is to identify all possible (subject, relation, object) triples, some of which may share the same subject or object entity. The objective function of CASREL is therefore expressed as:
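In the notation of the CASREL paper, the data likelihood can be written as below. Here $D$ is the training set, $x_j$ a sentence, $T_j$ its set of annotated triples, $T_j|s$ the triples in $T_j$ led by subject $s$, $R$ the set of all relations, and $o_\varnothing$ a null object.

```latex
\begin{aligned}
&\prod_{j=1}^{|D|} \Bigg[ \prod_{(s,r,o) \in T_j} p\big((s,r,o) \mid x_j\big) \Bigg] \\
={}& \prod_{j=1}^{|D|} \Bigg[ \prod_{s \in T_j} p(s \mid x_j)
     \prod_{(r,o) \in T_j|s} p\big((r,o) \mid s, x_j\big) \Bigg] \\
={}& \prod_{j=1}^{|D|} \Bigg[ \prod_{s \in T_j} p(s \mid x_j)
     \prod_{r \in T_j|s} p_r(o \mid s, x_j)
     \prod_{r \in R \setminus T_j|s} p_r(o_\varnothing \mid s, x_j) \Bigg]
\end{aligned}
```

The last line motivates the two-step design: one subject tagger models $p(s \mid x_j)$, and one object tagger per relation models $p_r(o \mid s, x_j)$, returning the null object for relations the subject does not take part in.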
The CASREL model structure is shown below:
Subject Tagger
The subject tagger directly decodes the encoding vector h_N produced by the N-layer BERT encoder to identify all possible subjects in the input sentence. In practice it uses two identical binary classifiers (0/1) to mark the start and end positions of subjects. The formula is as follows:
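Following the paper's notation, with $\mathbf{x}_i = \mathbf{h}_N[i]$ the encoded representation of the $i$-th token and $\sigma$ the sigmoid function, the two classifiers can be written as:

```latex
\begin{aligned}
p_i^{start\_s} &= \sigma\big(\mathbf{W}_{start}\,\mathbf{x}_i + \mathbf{b}_{start}\big) \\
p_i^{end\_s}   &= \sigma\big(\mathbf{W}_{end}\,\mathbf{x}_i + \mathbf{b}_{end}\big)
\end{aligned}
```

A token is tagged 1 when its probability exceeds a chosen threshold and 0 otherwise.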
The likelihood function of a subject span s in a given sentence is:
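In the paper's notation this likelihood takes the form below, where $L$ is the sentence length, $y_i^{t}$ is the binary start or end tag of the $i$-th token, $\mathbb{I}\{z\}$ is 1 when $z$ is true and 0 otherwise, and $\theta = \{\mathbf{W}_{start}, \mathbf{b}_{start}, \mathbf{W}_{end}, \mathbf{b}_{end}\}$:

```latex
p_\theta(s \mid x) = \prod_{t \in \{start\_s,\; end\_s\}} \prod_{i=1}^{L}
  \big(p_i^{t}\big)^{\mathbb{I}\{y_i^{t} = 1\}}
  \big(1 - p_i^{t}\big)^{\mathbb{I}\{y_i^{t} = 0\}}
```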
Relation-specific Object Taggers
The relation-specific object taggers take the features of the subject into account rather than directly decoding the pre-trained BERT output h_N. Their formula is as follows:
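In the paper's notation, the object tagger for relation $r$ adds a subject vector to each token representation before classification. Here $\mathbf{v}_{sub}^{k}$ is the representation of the $k$-th subject found by the subject tagger (the paper averages the token vectors between the subject's start and end):

```latex
\begin{aligned}
p_i^{start\_o} &= \sigma\big(\mathbf{W}_{start}^{r}\,(\mathbf{x}_i + \mathbf{v}_{sub}^{k}) + \mathbf{b}_{start}^{r}\big) \\
p_i^{end\_o}   &= \sigma\big(\mathbf{W}_{end}^{r}\,(\mathbf{x}_i + \mathbf{v}_{sub}^{k}) + \mathbf{b}_{end}^{r}\big)
\end{aligned}
```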
The likelihood function of the relation-specific object taggers is:
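This likelihood has the same shape as the subject likelihood, now conditioned on the subject $s$ and with parameters $\phi_r = \{\mathbf{W}_{start}^{r}, \mathbf{b}_{start}^{r}, \mathbf{W}_{end}^{r}, \mathbf{b}_{end}^{r}\}$, again in the paper's notation:

```latex
p_{\phi_r}(o \mid s, x) = \prod_{t \in \{start\_o,\; end\_o\}} \prod_{i=1}^{L}
  \big(p_i^{t}\big)^{\mathbb{I}\{y_i^{t} = 1\}}
  \big(1 - p_i^{t}\big)^{\mathbb{I}\{y_i^{t} = 0\}}
```

For the null object $o_\varnothing$, every start and end tag is 0.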
Chinese experimental results
The tokenizer treats Chinese and English datasets in basically the same way: every Chinese character is followed by [unused1]. The pre-trained model is chinese_L-12_H-768_A-12. The processed Chinese dataset has the following format:
{
  "text": "How to play your part well, please read 《Self-cultivation of actors》《The king of comedy》 The unique secret collection of Stephen Chow rising from poverty",
  "triple_list": [
    [
      "The king of comedy",
      "starring",
      "Stephen Chow"
    ]
  ]
}
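The character-level tokenization described above can be sketched as follows. The helper names `tokenize_chars` and `find_span`, the skipping of whitespace, and the added `[CLS]`/`[SEP]` tokens are assumptions made for illustration; the original code may differ in these details. The sample uses Chinese characters because the rule applies per character.

```python
def tokenize_chars(text, marker="[unused1]"):
    """Split text into characters, following each with the marker token."""
    tokens = ["[CLS]"]
    for ch in text:
        if ch.isspace():
            continue  # assumption: whitespace is dropped
        tokens.extend([ch, marker])
    tokens.append("[SEP]")
    return tokens


def find_span(tokens, entity_tokens):
    """Return (start, end) of the first occurrence of entity_tokens, or None."""
    n = len(entity_tokens)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == entity_tokens:
            return (i, i + n - 1)
    return None


sentence_tokens = tokenize_chars("喜剧之王 周星驰")
# Strip [CLS] and [SEP] from the entity before searching for it.
entity_tokens = tokenize_chars("周星驰")[1:-1]
print(find_span(sentence_tokens, entity_tokens))
```

The span returned by `find_span` gives the token positions at which the start and end tags of an entity would be set to 1 when building training labels.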
The model parameters are as follows:
max_length=128, batch_size=16, lr=1e-5, epoch=16
The evaluation results of the model are as follows:
f1: 0.7827, precision: 0.7736, recall: 0.7921, best f1: 0.7944
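As a quick consistency check, F1 is the harmonic mean of precision and recall, so the reported f1 can be recomputed from the other two numbers. The function name `f1_score` is just a local helper for this sketch.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


f1 = f1_score(0.7736, 0.7921)
print(round(f1, 4))  # 0.7827
```

The recomputed value agrees with the reported f1 of 0.7827.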
The prediction results of the model are as follows:
{
  "text": "《The magic show of love》 is a song sung by Anxia, written by Wuyiwei, composed by MartinHansen/StefanDouglasHayOsson, and included on the album 《Single best》",
  "triple_list_gold": [
    {
      "subject": "The magic show of love",
      "relation": "The album",
      "object": "Single best"
    },
    {
      "subject": "The magic show of love",
      "relation": "singer",
      "object": "Anxia"
    }
  ],
  "triple_list_pred": [
    {
      "subject": "The magic show of love",
      "relation": "The album",
      "object": "Single best"
    },
    {
      "subject": "The magic show of love",
      "relation": "singer",
      "object": "Anxia"
    },
    {
      "subject": "The magic show of love",
      "relation": "Lyrics",
      "object": "Wuyiwei"
    }
  ],
  "new": [
    {
      "subject": "The magic show of love",
      "relation": "Lyrics",
      "object": "Wuyiwei"
    }
  ]
}
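The "new" field lists predicted triples that are absent from the gold annotation. A minimal sketch of how such a comparison could be computed is below; the function `compare_triples` and the `missing` key are hypothetical names, and triples are matched exactly after stripping surrounding whitespace.

```python
def compare_triples(gold, pred):
    """Compare predicted triples with gold triples by exact match."""
    def to_key(triple):
        return (triple["subject"].strip(),
                triple["relation"].strip(),
                triple["object"].strip())

    gold_set = {to_key(t) for t in gold}
    pred_set = {to_key(t) for t in pred}
    correct = gold_set & pred_set
    return {
        "new": sorted(pred_set - gold_set),      # predicted but not in gold
        "missing": sorted(gold_set - pred_set),  # in gold but not predicted
        "precision": len(correct) / len(pred_set) if pred_set else 0.0,
        "recall": len(correct) / len(gold_set) if gold_set else 0.0,
    }


song = "The magic show of love"
gold = [
    {"subject": song, "relation": "The album", "object": "Single best"},
    {"subject": song, "relation": "singer", "object": "Anxia"},
]
pred = gold + [{"subject": song, "relation": "Lyrics", "object": "Wuyiwei"}]

result = compare_triples(gold, pred)
print(result["new"], result["precision"], result["recall"])
```

For this sentence the recall is 1.0 and the precision is 2/3. The extra Lyrics triple appears to be supported by the sentence itself, so exact-match scoring against this gold annotation may understate the quality of the prediction.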
If you find any errors, corrections are welcome.