A Detailed Explanation of BLEU in Machine Translation Tasks
2022-07-07 07:09:00 【aelum】
1. $n$-gram (N-Gram)
An $n$-gram is a sequence of $n$ tokens that appear contiguously in a text. For $n=1,2,3$, an $n$-gram is also called a unigram, a bigram, and a trigram, respectively.
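As a minimal sketch of this definition, the snippet below lists the contiguous $n$-grams of a tokenized sentence. The helper name `extract_ngrams` and the sample sentence are illustrative assumptions, not part of the original article. A sequence of $T$ tokens contains $T-n+1$ such $n$-grams.

```python
def extract_ngrams(tokens, n):
    """Return every contiguous n-gram of `tokens` as a tuple (hypothetical helper)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = "the cat sat on the mat".split()
unigrams = extract_ngrams(tokens, 1)
bigrams = extract_ngrams(tokens, 2)
trigrams = extract_ngrams(tokens, 3)
print(bigrams)
```

If `n` exceeds the number of tokens, the helper simply returns an empty list.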
The $n$-gram model is a probabilistic language model based on a Markov chain of order $n-1$ (that is, the probability of a word is conditioned only on the $n-1$ words that precede it):

$$
\begin{aligned}
\text{unigram:}\quad&P(w_1,w_2,\cdots,w_T)=\prod_{i=1}^T P(w_i) \\
\text{bigram:}\quad&P(w_1,w_2,\cdots,w_T)=P(w_1)\prod_{i=1}^{T-1} P(w_{i+1}\mid w_i) \\
\text{trigram:}\quad&P(w_1,w_2,\cdots,w_T)=P(w_1)P(w_2\mid w_1)\prod_{i=1}^{T-2} P(w_{i+2}\mid w_{i},w_{i+1})
\end{aligned}
$$
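To make the bigram factorization concrete, here is a rough sketch that evaluates $P(w_1)\prod_i P(w_{i+1}\mid w_i)$ with relative frequencies counted from a tiny corpus. The function `bigram_probability`, the toy corpus, and the choice of unsmoothed relative-frequency estimates are all assumptions made for illustration; the article itself does not specify how the probabilities are estimated.

```python
from collections import Counter
from fractions import Fraction


def bigram_probability(sentence, corpus):
    """Probability of a non-empty token list under a bigram model.

    Probabilities are relative frequencies counted from `corpus`,
    a non-empty list of token lists (hypothetical toy estimator, no smoothing).
    """
    word_counts = Counter(w for sent in corpus for w in sent)
    # How often each word occurs as the left-hand side of a bigram
    context_counts = Counter(w for sent in corpus for w in sent[:-1])
    bigram_counts = Counter(
        (sent[i], sent[i + 1]) for sent in corpus for i in range(len(sent) - 1)
    )
    # P(w_1), estimated here as a plain unigram frequency
    prob = Fraction(word_counts[sentence[0]], sum(word_counts.values()))
    # P(w_{i+1} | w_i) for i = 1, ..., T-1
    for prev, cur in zip(sentence, sentence[1:]):
        if context_counts[prev] == 0:
            return Fraction(0)  # unseen context: no estimate without smoothing
        prob *= Fraction(bigram_counts[(prev, cur)], context_counts[prev])
    return prob


toy_corpus = [["a", "b", "c"], ["a", "b", "d"], ["a", "c"]]
print(bigram_probability(["a", "b", "c"], toy_corpus))  # 3/8 * 2/3 * 1/2 = 1/8
```

Exact fractions are used only so that the product can be compared with the hand calculation; a practical model would use floating point (usually log-probabilities) and some form of smoothing.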
2. BLEU (Bilingual Evaluation Understudy)
2.1 Definition of BLEU
BLEU (pronounced like the word "blue") was first used to evaluate machine translation results, but it is now widely used to assess the quality of output sequences in many applications. For any $n$-gram in the predicted sequence `pred`, BLEU evaluates whether that $n$-gram appears in the label sequence `label`.
BLEU is defined as follows:

$$
\text{BLEU}=\exp\left(\min\left(0,1-\frac{\text{len(label)}}{\text{len(pred)}}\right)\right)\prod_{n=1}^k p_n^{1/2^n}
$$
where $\text{len}(*)$ denotes the number of tokens in the sequence $*$, $k$ is the length of the longest $n$-gram used for matching (commonly $4$), and $p_n$ is the precision of $n$-grams, i.e. the number of $n$-grams in `pred` that are matched in `label` divided by the total number of $n$-grams in `pred`.
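The exponential coefficient in this definition (often called the brevity penalty) can be looked at in isolation. The following is a small sketch; the function name `brevity_penalty` and the example lengths are assumptions chosen for illustration.

```python
import math


def brevity_penalty(label_len, pred_len):
    """exp(min(0, 1 - len(label) / len(pred))), assuming pred_len > 0."""
    return math.exp(min(0, 1 - label_len / pred_len))


shorter = brevity_penalty(6, 5)  # pred shorter than label: coefficient below 1
equal = brevity_penalty(6, 6)    # same length: no penalty
longer = brevity_penalty(6, 9)   # pred longer than label: no penalty either
print(shorter, equal, longer)
```

The coefficient only ever lowers the score: it equals $1$ whenever `pred` is at least as long as `label` and shrinks towards $0$ as `pred` becomes shorter.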
Concretely, suppose `label` is $A,B,C,D,E,F$ and `pred` is $A,B,B,C,D$, and take $k=3$.
First, consider how $p_1$ is computed. List every unigram in `pred`: $(A),(B),(B),(C),(D)$, and every unigram in `label`: $(A),(B),(C),(D),(E),(F)$, then count the matches between them. Matching must be one-to-one, so an $n$-gram in `label` cannot be matched more than once. There are $4$ matches and `pred` contains $5$ unigrams in total, so $p_1=4/5$.
Next, consider $p_2$. List every bigram in `pred`: $(A,B),(B,B),(B,C),(C,D)$, and every bigram in `label`: $(A,B),(B,C),(C,D),(D,E),(E,F)$, then count the matches between them. There are $3$ matches and `pred` contains $4$ bigrams in total, so $p_2=3/4$.
Finally, consider $p_3$. List every trigram in `pred`: $(A,B,B),(B,B,C),(B,C,D)$, and every trigram in `label`: $(A,B,C),(B,C,D),(C,D,E),(D,E,F)$, then count the matches between them. There is only $1$ match and `pred` contains $3$ trigrams in total, so $p_3=1/3$.
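The three hand counts can be reproduced with a short sketch. The helper `ngram_precision` is a hypothetical name, and it assumes that `pred` contains at least `n` tokens. The one-to-one matching rule is expressed with the multiset intersection of two `Counter` objects, which keeps the smaller count for each $n$-gram.

```python
from collections import Counter
from fractions import Fraction


def ngram_precision(label, pred, n):
    """Matched n-grams of pred divided by all n-grams of pred (hypothetical helper)."""
    label_counts = Counter(tuple(label[i:i + n]) for i in range(len(label) - n + 1))
    pred_counts = Counter(tuple(pred[i:i + n]) for i in range(len(pred) - n + 1))
    # Counter & Counter keeps min(count) per n-gram, i.e. one-to-one matching
    num_matches = sum((label_counts & pred_counts).values())
    return Fraction(num_matches, sum(pred_counts.values()))


label = "A B C D E F".split()
pred = "A B B C D".split()
for n in (1, 2, 3):
    print(n, ngram_precision(label, pred, n))
# 1 4/5
# 2 3/4
# 3 1/3
```

The repeated $B$ in `pred` is matched only once because `label` contains a single $B$, which is why $p_1$ is $4/5$ rather than $5/5$.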
The BLEU score in this example is therefore

$$
\begin{aligned}
\text{BLEU}&=\exp(\min(0,1-6/5))\cdot p_1^{1/2}\cdot p_2^{1/4}\cdot p_3^{1/8} \\
&=e^{-0.2}\cdot \left(\frac45\right)^{1/2}\cdot \left(\frac34\right)^{1/4}\cdot\left(\frac13\right)^{1/8} \\
&\approx0.5940
\end{aligned}
$$
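The arithmetic can be checked numerically. This is only a sanity check of the worked example, with the precisions $4/5$, $3/4$, $1/3$ and the lengths $6$ and $5$ typed in by hand; the variable names are illustrative.

```python
import math

precisions = [4 / 5, 3 / 4, 1 / 3]       # p_1, p_2, p_3 from the worked example
score = math.exp(min(0, 1 - 6 / 5))      # coefficient with len(label)=6, len(pred)=5
for n, p_n in enumerate(precisions, start=1):
    score *= p_n ** (1 / 2 ** n)         # exponents 1/2, 1/4, 1/8
print(f"{score:.4f}")  # 0.5940
```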
2.2 Discussion of BLEU
By the definition of BLEU, when the predicted sequence is identical to the label sequence, every $p_n$ equals $1$ and the exponential coefficient equals $e^0=1$, so BLEU equals $1$. More generally, since $0<\exp(\min(0,x))\leq1$ for every $x$ and $0\leq p_n\leq1$, we have

$$
\text{BLEU}\in[0,1]
$$
The closer BLEU is to $1$, the better the prediction; the closer BLEU is to $0$, the worse the prediction.
In addition, because longer $n$-grams are harder to match, BLEU assigns greater weight to the precision of longer $n$-grams (for a fixed $a\in(0,1)$, $a^{1/2^n}$ increases with $n$). Moreover, because a shorter predicted sequence tends to obtain higher values of $p_n$, the coefficient $\exp(\cdot)$ is used to penalize shorter predicted sequences.
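Both effects are easy to see numerically. The sketch below uses an arbitrary precision value $a=0.5$ and a hypothetical two-token prediction $A,B$ against the six-token label $A,B,C,D,E,F$; neither value comes from the original article.

```python
import math

# For a fixed precision a in (0, 1), the factor a^(1/2^n) grows towards 1 as n increases,
# so the same precision costs less when it is achieved on longer n-grams.
a = 0.5
factors = [a ** (1 / 2 ** n) for n in range(1, 5)]
print([round(f, 4) for f in factors])

# A very short prediction can match perfectly and still score low:
# for pred = A B, p_1 = p_2 = 1, but the coefficient is exp(min(0, 1 - 6/2)) = exp(-2).
short_penalty = math.exp(min(0, 1 - 6 / 2))
print(round(short_penalty, 4))
```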
2.3 A Simple Implementation of BLEU
import math
from collections import Counter


def bleu(label, pred, k=4):
    # Assumes that label and pred have already been tokenized
    score = math.exp(min(0, 1 - len(label) / len(pred)))
    for n in range(1, k + 1):
        # Number of n-grams in pred
        num_ngrams = len(pred) - n + 1
        if num_ngrams <= 0:
            # pred has no n-grams of this length, so nothing can match
            return 0.0
        # Use a hash table to store the counts of all n-grams in label
        hashtable = Counter(' '.join(label[i:i + n]) for i in range(len(label) - n + 1))
        # The number of successful matches
        num_matches = 0
        for i in range(num_ngrams):
            ngram = ' '.join(pred[i:i + n])
            # Each n-gram occurrence in label can be matched at most once
            if hashtable[ngram] > 0:
                num_matches += 1
                hashtable[ngram] -= 1
        score *= math.pow(num_matches / num_ngrams, math.pow(0.5, n))
    return score
For example:

label = 'A B C D E F'
pred = 'A B B C D'
for i in range(4):
    print(bleu(label.split(), pred.split(), k=i + 1))
# 0.7322950476607851
# 0.6814773296495302
# 0.5940339360503315
# 0.0