CRF (conditional random field) learning summary
2022-06-30 09:46:00 【A grain of sand in the vast sea of people】
1. Why CRF (conditional random field)
Classifying every frame of a sequence with softmax does not directly take the context of the output into account.
CRF is mainly used for sequence labeling, which can be understood simply as classifying every frame of a sequence. Since this is classification, it is natural to encode the sequence with a CNN or RNN and then attach a fully connected layer with softmax activation, as shown in the figure below.

Conditional random field
However, when we design the labels, for example using the 4 tags s, b, m, e for character-tagging word segmentation, the target output sequence itself carries contextual constraints: for instance, s cannot be followed by m or e. Per-label softmax does not consider this output-level context, so these associations are pushed down to the encoding level in the hope that the model learns them by itself, which sometimes asks too much of the model.
CRF is more direct: it separates out the output-level associations, which makes learning easier for the model:

CRF explicitly considers context at the output
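The output-level constraint described above (s cannot be followed by m or e, and so on) can be written down as a small table. This is only an illustrative sketch: `ALLOWED` and `is_valid_path` are hypothetical names, and a CRF learns soft transition scores rather than a hard table like this one. The check also ignores constraints on the first and last tag.

```python
# Hard-coded transition rules for the 4-tag word segmentation scheme.
# s = single-character word, b = word begin, m = word middle, e = word end.
ALLOWED = {
    "s": {"s", "b"},  # after a single-character word, a new word must start
    "b": {"m", "e"},  # a word that has begun must continue or end
    "m": {"m", "e"},
    "e": {"s", "b"},  # after a word ends, a new word must start
}

def is_valid_path(tags):
    """Return True if every adjacent pair of tags is an allowed transition."""
    return all(nxt in ALLOWED[cur] for cur, nxt in zip(tags, tags[1:]))

print(is_valid_path("bebess"))  # True
print(is_valid_path("smes"))    # False: s cannot be followed by m
```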
2. What is CRF
Of course, merely introducing output-level associations is not all there is to CRF. The real subtlety of CRF is that it works in units of paths and considers the probability of a whole path.
2.1 Model profile
If an input has n frames and the label of each frame has k possible values, then in theory there are k^n different outputs. This can be visualized with the network diagram below: each point represents one possible label, each line between points represents an association between labels, and every labeling result corresponds to one complete path through the graph.

Output network diagram of a 4-tag word segmentation model
In a sequence labeling task there is generally only one correct answer. For example, if the sentence 今天天气不错 ("the weather is nice today") is segmented as 今天 / 天气 / 不 / 错, then the target output sequence is bebess, and no other path meets the requirement. In other words, the basic unit of study in sequence labeling should be the path: what we have to do is pick the correct one out of k^n paths, which means that, viewed as a classification problem, it is a problem of choosing one class out of k^n.
This is the fundamental difference between frame-by-frame softmax and CRF: the former treats sequence labeling as n separate k-way classification problems, while the latter treats it as a single k^n-way classification problem.
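To make the "one classification over k^n paths" view concrete, here is a brute-force sketch under assumed scores. The score tables `emit` and `trans` and the helper names are hypothetical and filled with random numbers; enumeration is only feasible for tiny n and k, and practical implementations compute the normalizer with dynamic programming instead.

```python
import itertools
import math
import random

random.seed(0)
n, k = 4, 3  # 4 frames, 3 labels: k**n = 81 candidate paths

# Hypothetical scores: emit[t][y] scores label y at frame t,
# trans[a][b] scores label a being followed by label b.
emit = [[random.gauss(0, 1) for _ in range(k)] for _ in range(n)]
trans = [[random.gauss(0, 1) for _ in range(k)] for _ in range(k)]

def path_score(path):
    """Sum of per-frame scores plus the scores of adjacent label pairs."""
    score = sum(emit[t][y] for t, y in enumerate(path))
    score += sum(trans[a][b] for a, b in zip(path, path[1:]))
    return score

# Every possible labeling of the n frames.
paths = list(itertools.product(range(k), repeat=n))
log_z = math.log(sum(math.exp(path_score(p)) for p in paths))

def path_prob(path):
    """Probability of one whole path under a softmax over all k**n paths."""
    return math.exp(path_score(path) - log_z)

print(len(paths))                                  # 81
print(round(sum(path_prob(p) for p in paths), 6))  # 1.0
```

Training then amounts to maximizing `path_prob` of the single correct path, rather than n independent per-frame probabilities.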
3. Tagging schemes in sequence labeling models
3.1 SBME Tagging
S marks a single-character word (single), B marks the first character of a word (begin), M marks a character in the middle of a word (middle), and E marks the last character of a word (end). The tags are usually represented as numbers:
# -1 -> unknown
# 0 -> 'S'
# 1 -> 'B'
# 2 -> 'M'
# 3 -> 'E'
Example: 我爱使用小米手机玩王者荣耀 ("I love using my Xiaomi phone to play Honor of Kings") -> 我<S> 爱<S> 使<B> 用<E> 小<B> 米<M> 手<M> 机<E> 玩<S> 王<B> 者<M> 荣<M> 耀<E>
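The SBME rule can be sketched as a function that turns an already segmented sentence into per-character tags. The function name `sbme_tags` and the `TAG_TO_ID` table are illustrative, and the Chinese words are an assumed reconstruction of the example sentence above.

```python
def sbme_tags(words):
    """Map a list of segmented words to one S/B/M/E tag per character."""
    tags = []
    for word in words:
        if not word:
            continue  # ignore empty strings
        if len(word) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags

# Numeric encoding listed above (-1 is reserved for unknown).
TAG_TO_ID = {"S": 0, "B": 1, "M": 2, "E": 3}

words = ["我", "爱", "使用", "小米手机", "玩", "王者荣耀"]
tags = sbme_tags(words)
print("".join(tags))                 # SSBEBMMESBMME
print([TAG_TO_ID[t] for t in tags])  # [0, 0, 1, 3, 1, 2, 2, 3, 0, 1, 2, 2, 3]
```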
3.2 CS Tagging
C means the current character is joined to an adjacent character as part of the same word; S marks a word boundary. In the example below, C is placed on every character that is followed by another character of the same word, and S on the last character of each word (including single-character words). The tags are usually represented as numbers:
# -1 -> unknown
# 0 -> 'C'
# 1 -> 'S'
Example: 我爱使用小米手机玩王者荣耀 -> 我<S> 爱<S> 使<C> 用<S> 小<C> 米<C> 手<C> 机<S> 玩<S> 王<C> 者<C> 荣<C> 耀<S>
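A sketch of CS tagging that follows the convention visible in the example (C on every character that is followed by another character of the same word, S on the last character of each word). The function name `cs_tags` is illustrative and the Chinese words are an assumed reconstruction of the example sentence.

```python
def cs_tags(words):
    """Map a list of segmented words to one C/S tag per character."""
    tags = []
    for word in words:
        if not word:
            continue  # ignore empty strings
        # All characters but the last continue the word; the last one closes it.
        tags.extend(["C"] * (len(word) - 1) + ["S"])
    return tags

words = ["我", "爱", "使用", "小米手机", "玩", "王者荣耀"]
print("".join(cs_tags(words)))  # SSCSCCCSSCCCS
```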
3.3. IOB (inside-outside-beginning) tagging
IOB is a tagging technique; the IOB format is a common format for tagging tokens in computational linguistics.
The B- prefix marks the beginning of a chunk, the I- prefix marks a token inside a chunk, and the O tag marks a token that belongs to no chunk.
The B- tag is used only when a chunk immediately follows another chunk of the same type with no O-tagged token between them.
An example with IOB format:
Alex I-PER is O going O to O Los I-LOC Angeles I-LOC in O California I-LOC
Notice how "Alex", "Los" and "California", although first tokens of their chunk, have the "I-" prefix.
Another example
Alex I-PER going O Los I-LOC Angeles I-LOC California B-LOC
Notice how "California" now has the "B-" prefix, because it immediately follows another LOC chunk.
3.4. IOB2 format
Another similar format which is widely used is IOB2 format, which is the same as the IOB format except that the B- tag is used in the beginning of every chunk (i.e. all chunks start with the B- tag).
Example
Alex B-PER is O going O to O Los B-LOC Angeles I-LOC in O California B-LOC
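The difference between the two formats can be sketched as a conversion from IOB to IOB2. The function name `iob_to_iob2` is illustrative, and the input is assumed to be a well-formed list of IOB tags.

```python
def iob_to_iob2(tags):
    """Rewrite IOB tags so that every chunk starts with a B- tag."""
    result = []
    prev = "O"
    for tag in tags:
        if tag.startswith("I-") and (prev == "O" or prev[2:] != tag[2:]):
            # An I- tag that opens a chunk becomes B- in IOB2.
            result.append("B-" + tag[2:])
        else:
            result.append(tag)
        prev = tag
    return result

# Tags of "Alex is going to Los Angeles in California" in IOB format.
iob = ["I-PER", "O", "O", "O", "I-LOC", "I-LOC", "O", "I-LOC"]
print(iob_to_iob2(iob))
# ['B-PER', 'O', 'O', 'O', 'B-LOC', 'I-LOC', 'O', 'B-LOC']
```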
3.5. BIOES
Related tagging schemes sometimes include "START/END: This consists of the tags B, E, I, S or O where S is used to represent a chunk containing a single token. Chunks of length greater than or equal to two always start with the B tag and end with the E tag."[4]
Other tagging schemes include BIOES/BILOU, where 'E' or 'L' denotes the ending (last) token of a chunk and 'S' or 'U' denotes a single-token (unit) chunk.
Alex S-PER is O going O with O Marty B-PER A. I-PER Rick E-PER to O Los B-LOC Angeles E-LOC
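A sketch of how BIOES tags can be derived from IOB2 tags by looking one token ahead. The function name `iob2_to_bioes` is illustrative, and the input is assumed to be well-formed IOB2.

```python
def iob2_to_bioes(tags):
    """Convert IOB2 tags to BIOES: chunks end in E-, one-token chunks are S-."""
    result = []
    for i, tag in enumerate(tags):
        if tag == "O":
            result.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        nxt = tags[i + 1] if i + 1 < len(tags) else "O"
        chunk_continues = nxt == "I-" + label
        if prefix == "B":
            result.append(("B-" if chunk_continues else "S-") + label)
        else:  # prefix == "I"
            result.append(("I-" if chunk_continues else "E-") + label)
    return result

# IOB2 tags of "Alex is going with Marty A. Rick to Los Angeles".
iob2 = ["B-PER", "O", "O", "O", "B-PER", "I-PER", "I-PER", "O", "B-LOC", "I-LOC"]
print(iob2_to_bioes(iob2))
# ['S-PER', 'O', 'O', 'O', 'B-PER', 'I-PER', 'E-PER', 'O', 'B-LOC', 'E-LOC']
```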
References
Wikipedia: Inside-outside-beginning
Text Chunking using Transformation-Based Learning, Ramshaw and Marcus, 1995
4. Code implementation
Install tensorflow-addons: in TensorFlow 1 the CRF implementation lives in contrib, while in TensorFlow 2 it lives in tensorflow_addons.
pip install tensorflow-addons
Test example
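import numpy as np
import tensorflow as tf
import tensorflow_addons as tfa

# Random unary scores: a batch of 2 sequences, 10 frames each, 5 tags.
inputs = tf.random.truncated_normal([2, 10, 5])
# Random gold tag indices in [0, 5) for every frame.
target = tf.convert_to_tensor(np.random.randint(5, size=(2, 10)), dtype=tf.int32)
# Frame-by-frame softmax, for comparison only; the CRF calls below take the raw scores.
out = tf.keras.layers.Softmax()(inputs)
# True (unpadded) length of each sequence in the batch.
lens = tf.convert_to_tensor([9, 6], dtype=tf.int32)
# Path-level log-likelihood of the gold tags, plus the tag transition matrix.
log_likelihood, tran_paras = tfa.text.crf_log_likelihood(inputs, target, lens)
# Highest-scoring tag sequence for each input, and its score.
batch_pred_sequence, batch_viterbi_score = tfa.text.crf_decode(inputs, tran_paras, lens)
# Training loss: negative log-likelihood summed over the batch.
loss = tf.reduce_sum(-log_likelihood)
print('log_likelihood is :', log_likelihood.numpy())
print('batch_pred_sequence is :', batch_pred_sequence.numpy())
print('loss is :', loss.numpy())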
Output
log_likelihood is : [-18.046837 -14.958561]
batch_pred_sequence is : [[0 3 1 4 3 4 2 0 4 3]
[3 0 3 3 2 2 4 1 4 1]]
loss is : 33.005398
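Decoding the best path is usually done with the Viterbi algorithm. The following is a minimal pure-Python sketch of that algorithm for a single sequence, not the tensorflow_addons implementation; the function name `viterbi` and the toy score tables are hypothetical.

```python
def viterbi(emit, trans):
    """Return (best_path, best_score) for one sequence.

    emit[t][y]  : score of label y at frame t
    trans[a][b] : score of label a being followed by label b
    """
    k = len(emit[0])
    score = list(emit[0])  # best score of any path ending in each label
    backptr = []           # backptr[t-1][y] = best previous label for y at frame t
    for frame in emit[1:]:
        prev_best, new_score = [], []
        for y in range(k):
            cand = [score[a] + trans[a][y] for a in range(k)]
            best_a = max(range(k), key=cand.__getitem__)
            prev_best.append(best_a)
            new_score.append(cand[best_a] + frame[y])
        backptr.append(prev_best)
        score = new_score
    # Trace back from the best final label.
    last = max(range(k), key=score.__getitem__)
    path = [last]
    for prev_best in reversed(backptr):
        path.append(prev_best[path[-1]])
    path.reverse()
    return path, score[last]

# Toy example: 3 frames, 2 labels, switching labels is penalized.
emit = [[1.0, 0.0], [0.0, 0.5], [2.0, 0.0]]
trans = [[0.0, -3.0], [-3.0, 0.0]]
print(viterbi(emit, trans))  # ([0, 0, 0], 3.0)
```

Frame 2 on its own prefers label 1, but the transition penalty makes staying on label 0 the best whole path, which is exactly the path-level behaviour that per-frame softmax cannot express.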