Relation extraction: CASREL
2022-06-26 08:27:00 【xuanningmeng】
Relation extraction is a basic task in natural language processing. Its output is usually expressed as triples (subject, relation, object). There are two ways to approach relation extraction:
(1) The two entities, subject and object, are already known, and a classification model predicts the relation between them.
(2) Entities are extracted and the possible relations between them are predicted. If entities are extracted first and relations are predicted afterwards, the approach is called pipeline extraction; if entities and the relations between them are extracted at the same time, it is called joint extraction.
Relation extraction datasets are complex because entities and relations overlap. In an ideal dataset each subject-object pair corresponds to exactly one relation; in real datasets a subject-object pair can correspond to several relations, and different triples can share entities.
[Figure: examples of overlapping triples]
Here EPO (EntityPairOverlap) means that the same entity pair appears in several triples with different relations, and SEO (SingleEntityOverlap) means that several triples share a single entity.
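The two overlap patterns can be made concrete with a small check over a sentence's triples. This is only an illustrative sketch: the function name `overlap_patterns`, the label "Normal" for the non-overlapping case, and the choice to compare ordered (subject, object) pairs are assumptions made here, not something the post specifies.

```python
from collections import Counter

def overlap_patterns(triples):
    """Return the overlap labels ('EPO', 'SEO') found in a list of
    (subject, relation, object) triples, or {'Normal'} if there are none."""
    triples = set(triples)                       # ignore exact duplicates
    pairs = [(s, o) for s, _, o in triples]
    unique_pairs = set(pairs)
    labels = set()
    # EPO: one (subject, object) pair carries more than one relation.
    if len(unique_pairs) < len(pairs):
        labels.add("EPO")
    # SEO: one entity takes part in more than one distinct entity pair.
    entity_counts = Counter(e for pair in unique_pairs for e in set(pair))
    if any(n > 1 for n in entity_counts.values()):
        labels.add("SEO")
    return labels or {"Normal"}

print(overlap_patterns([("A", "r1", "B"), ("A", "r2", "B")]))
print(overlap_patterns([("A", "r1", "B"), ("A", "r2", "C")]))
```

A sentence can show both patterns at once, which is why the function returns a set rather than a single label.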
When relational triples (subject, relation, object) overlap, relation classification models struggle. Without enough training examples, a classifier can hardly tell which relations an entity takes part in, so the extracted triples are often incomplete and inaccurate. The CASREL model, by contrast, handles overlapping relational triples effectively.
CASREL Model
CASREL, a novel cascade binary tagging framework, was proposed in the paper A Novel Cascade Binary Tagging Framework for Relational Triple Extraction, where it set new state-of-the-art results. The model works in two steps:
(1) A pre-trained BERT encoder is used to identify all possible subjects.
(2) For each subject, relation-specific taggers identify all possible relations and the corresponding objects.
The goal of relational triple extraction is to identify all possible (subject, relation, object) triples, some of which may share the same subject or object entity. CASREL therefore defines its objective directly at the triple level, as the likelihood of all annotated triples, factored into a subject term and relation-specific object terms.
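For reference, the objective can be written as below. The notation follows the CASREL paper rather than this post: $D$ is the training set, $x_j$ a sentence, $T_j$ its annotated triples, $T_j|s$ the triples in $T_j$ led by subject $s$, $R$ the set of all relations, and $o_\varnothing$ a "null" object for relations that do not apply to $s$.

```latex
\begin{aligned}
\prod_{j=1}^{|D|} \Bigg[ \prod_{(s,r,o) \in T_j} p\big((s,r,o) \mid x_j\big) \Bigg]
&= \prod_{j=1}^{|D|} \Bigg[ \prod_{s \in T_j} p(s \mid x_j)
   \prod_{(r,o) \in T_j|s} p\big((r,o) \mid s, x_j\big) \Bigg] \\
&= \prod_{j=1}^{|D|} \Bigg[ \prod_{s \in T_j} p(s \mid x_j)
   \prod_{r \in T_j|s} p_r(o \mid s, x_j)
   \prod_{r \in R \setminus T_j|s} p_r(o_{\varnothing} \mid s, x_j) \Bigg]
\end{aligned}
```

The last line is what makes overlap unproblematic: each relation $r$ is modeled as a function $p_r$ that maps a subject to its objects, so one subject can have objects under several relations at once.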
[Figure: CASREL model structure]
Subject Tagger
The subject tagger directly decodes the encoded vector h_N produced by the N-layer BERT encoder to identify all possible subjects in the input sentence. In practice it uses two identical binary classifiers (0/1) to mark the start and end positions of subjects.
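In the paper's notation (an assumption here, since the formula did not survive in this copy), with $x_i = h_N[i]$ the encoding of the $i$-th token and $\sigma$ the sigmoid function, the two classifiers are:

```latex
\begin{aligned}
p_i^{start\_s} &= \sigma\big(W_{start}\, x_i + b_{start}\big) \\
p_i^{end\_s}   &= \sigma\big(W_{end}\, x_i + b_{end}\big)
\end{aligned}
```

A token is tagged 1 when its probability exceeds a threshold and 0 otherwise.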
The likelihood of the subject spans in a given sentence is the product, over all tokens, of the Bernoulli probabilities of their start and end tags.
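Written out in the paper's notation (assumed here), where $L$ is the sentence length, $y_i^t$ is the binary tag of token $i$ for tag type $t$, and $I\{z\}$ is 1 if $z$ is true and 0 otherwise:

```latex
p_{\theta}(s \mid x) = \prod_{t \in \{start\_s,\, end\_s\}} \prod_{i=1}^{L}
  \big(p_i^{t}\big)^{I\{y_i^{t} = 1\}} \big(1 - p_i^{t}\big)^{I\{y_i^{t} = 0\}}
```

Taking the negative logarithm of this expression gives the usual binary cross-entropy loss over the start and end tag sequences.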
Relation-specific Object Taggers
The relation-specific object taggers take the features of the subject into account instead of decoding the pre-trained BERT output h_N directly: a vector representation of the detected subject is combined with each token encoding before tagging.
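In the paper's notation (assumed here), with $x_i = h_N[i]$ the token encoding, $v_{sub}^{k}$ the vector of the $k$-th detected subject (the paper averages the subject's token vectors), and one pair of classifiers per relation $r$:

```latex
\begin{aligned}
p_i^{start\_o} &= \sigma\big(W_{start}^{r}\,(x_i + v_{sub}^{k}) + b_{start}^{r}\big) \\
p_i^{end\_o}   &= \sigma\big(W_{end}^{r}\,(x_i + v_{sub}^{k}) + b_{end}^{r}\big)
\end{aligned}
```

If no token is tagged for a given relation, the subject simply has no object under that relation.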
The likelihood function of the relation-specific object taggers has the same form as that of the subject tagger, computed over the object start and end tags.
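The two tagging stages can be sketched end to end in plain Python. This is a toy illustration under stated assumptions, not the author's implementation: the names `tag_probs`, `decode_spans` and `cascade_decode` are hypothetical, the token vectors stand in for BERT outputs, each start is paired with the nearest following end, and the subject vector is the mean of the subject's token vectors.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def tag_probs(vectors, w_start, w_end, b_start=0.0, b_end=0.0):
    """Per-token start/end probabilities from two binary classifiers."""
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    p_start = [sigmoid(dot(w_start, x) + b_start) for x in vectors]
    p_end = [sigmoid(dot(w_end, x) + b_end) for x in vectors]
    return p_start, p_end

def decode_spans(p_start, p_end, threshold=0.5):
    """Pair every tagged start with the nearest tagged end at or after it."""
    spans = []
    for i, ps in enumerate(p_start):
        if ps > threshold:
            for j in range(i, len(p_end)):
                if p_end[j] > threshold:
                    spans.append((i, j))
                    break
    return spans

def cascade_decode(h, subj_params, obj_params, threshold=0.5):
    """h: list of token vectors. subj_params: (w_start, w_end).
    obj_params: {relation: (w_start, w_end)}.
    Returns (subject_span, relation, object_span) triples."""
    triples = []
    p_s, p_e = tag_probs(h, *subj_params)
    for (s, e) in decode_spans(p_s, p_e, threshold):
        span = h[s:e + 1]
        # Subject vector: mean of the subject's token vectors.
        v_sub = [sum(col) / len(span) for col in zip(*span)]
        # Add the subject vector to every token before object tagging.
        fused = [[a + b for a, b in zip(x, v_sub)] for x in h]
        for rel, params in obj_params.items():
            q_s, q_e = tag_probs(fused, *params)
            for o in decode_spans(q_s, q_e, threshold):
                triples.append(((s, e), rel, o))
    return triples
```

Because the object taggers run once per detected subject and once per relation, a subject can yield objects under several relations, and two subjects can share an object.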
Chinese experimental results
The tokenizer treats Chinese and English datasets in basically the same way: every Chinese character is followed by [unused1]. The pre-trained model is chinese_L-12_H-768_A-12. The processed Chinese dataset has the following format:
{
    "text": "How to play your part well, please read 《Self-cultivation of actors》《The king of comedy》, the unique secret collection of Stephen Chow rising from poverty",
    "triple_list": [
        [
            "The king of comedy",
            "starring",
            "Stephen Chow"
        ]
    ]
}
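A sample in this format can be turned into tagging targets roughly as follows. This is a simplified sketch with assumed names (`tokenize_chars`, `find_span`, `build_subject_map`): it treats every non-space character as one token followed by [unused1], whereas the real tokenizer works on the BERT vocabulary, and it only uses the first occurrence of each entity in the text.

```python
def tokenize_chars(text, marker="[unused1]"):
    """One token per non-space character, each followed by the marker."""
    tokens = []
    for ch in text:
        if not ch.isspace():
            tokens.extend([ch, marker])
    return tokens

def find_span(tokens, pattern):
    """Inclusive (start, end) indices of the first match of pattern, or None."""
    n = len(pattern)
    if n == 0:
        return None
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == pattern:
            return (i, i + n - 1)
    return None

def build_subject_map(sample):
    """Map each subject span to its (relation, object span) pairs.
    Spans are token indices and include the trailing marker token."""
    tokens = ["[CLS]"] + tokenize_chars(sample["text"]) + ["[SEP]"]
    subject_map = {}
    for subj, rel, obj in sample["triple_list"]:
        s_span = find_span(tokens, tokenize_chars(subj))
        o_span = find_span(tokens, tokenize_chars(obj))
        if s_span is None or o_span is None:
            continue  # entity text not found in the sentence
        subject_map.setdefault(s_span, []).append((rel, o_span))
    return tokens, subject_map

sample = {"text": "ab cd", "triple_list": [["ab", "r", "cd"]]}
tokens, subject_map = build_subject_map(sample)
print(subject_map)
```

The subject spans give the targets for the subject tagger, and the per-subject (relation, object span) lists give the targets for the relation-specific object taggers.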
The model parameters are as follows :
max_length=128, batch_size=16, lr=1e-5, epoch=16
The evaluation results of the model are as follows :
f1: 0.7827, precision: 0.7736, recall: 0.7921, best f1: 0.7944
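Scores like these are typically computed by exact matching of predicted triples against gold triples. The sketch below assumes micro-averaging over all sentences, which the post does not state; `micro_prf` is a hypothetical helper name.

```python
def micro_prf(gold_sets, pred_sets):
    """Micro-averaged precision, recall and F1 over per-sentence triple sets."""
    correct = sum(len(g & p) for g, p in zip(gold_sets, pred_sets))
    n_pred = sum(len(p) for p in pred_sets)
    n_gold = sum(len(g) for g in gold_sets)
    precision = correct / n_pred if n_pred else 0.0
    recall = correct / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# One sentence with two gold triples and three predictions, two of them correct.
gold = [{("song", "album", "X"), ("song", "singer", "Y")}]
pred = [gold[0] | {("song", "lyrics", "Z")}]
precision, recall, f1 = micro_prf(gold, pred)
print(round(precision, 4), round(recall, 4), round(f1, 4))
```

As a consistency check, the harmonic mean of the reported precision (0.7736) and recall (0.7921) is about 0.7827, which matches the reported f1.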
The prediction results of the model are as follows :
{
    "text": "《The magic show of love》 is a song sung by Anxia, written by Wuyiwei, composed by MartinHansen/StefanDouglasHayOsson, and included on the album 《Single best》",
    "triple_list_gold": [
        {
            "subject": "The magic show of love",
            "relation": "The album",
            "object": "Single best"
        },
        {
            "subject": "The magic show of love",
            "relation": "singer",
            "object": "Anxia"
        }
    ],
    "triple_list_pred": [
        {
            "subject": "The magic show of love",
            "relation": "The album",
            "object": "Single best"
        },
        {
            "subject": "The magic show of love",
            "relation": "singer",
            "object": "Anxia"
        },
        {
            "subject": "The magic show of love",
            "relation": "Lyrics",
            "object": "Wuyiwei"
        }
    ],
    "new": [
        {
            "subject": "The magic show of love",
            "relation": "Lyrics",
            "object": "Wuyiwei"
        }
    ]
}
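The "new" field lists the predicted triples that are absent from the gold annotation. A minimal way to derive it from a record like the one above, assuming exact string matching after trimming whitespace (the helper names are hypothetical):

```python
def triple_key(t):
    """Hashable, whitespace-trimmed form of a triple dictionary."""
    return (t["subject"].strip(), t["relation"].strip(), t["object"].strip())

def new_triples(record):
    """Predicted triples that do not appear in the gold list."""
    gold = {triple_key(t) for t in record["triple_list_gold"]}
    return [t for t in record["triple_list_pred"] if triple_key(t) not in gold]

record = {
    "triple_list_gold": [
        {"subject": "S", "relation": "singer", "object": "A"},
    ],
    "triple_list_pred": [
        {"subject": "S", "relation": "singer", "object": "A"},
        {"subject": "S", "relation": "Lyrics", "object": "W"},
    ],
}
print(new_triples(record))
```

In the example above, the extra lyricist triple counts against precision even though the sentence itself says the song was written by Wuyiwei, so some "new" triples may reflect gaps in the gold annotation rather than model errors.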
If you spot an error, corrections are welcome.