ML - natural language processing - Basics
2022-07-25 15:24:00 【sword_csdn】
References
Huawei Cloud Academy
https://www.cnblogs.com/pinard/p/7160330.html
Language model
A language model is an abstract model of a language built on objective linguistic facts: it assigns a probability to every word sequence, so that candidate sentences can be compared. Consider the following problems:
(1) Machine translation: among candidate translations of the same source, the fluent one should score higher, e.g. P(I have a dream) > P(I a dream have)
(2) Spelling correction: P(about fifteen minutes from) > P(about fifteen minuets from)
(3) Speech recognition: among acoustically similar transcriptions, the fluent one should score higher, e.g. P(recognize speech) > P(wreck a nice beach)
(4) Pinyin-to-character conversion: P(what are you doing now | nixianzaiganshenme) > P(what are you doing in Xi'an | nixianzaiganshenme)
If we formalize the problem above, the probability of a sentence $w_1 w_2 \ldots w_T$ decomposes by the chain rule:

$$P(w_1, w_2, \ldots, w_T) = \prod_{t=1}^{T} P(w_t \mid w_1, \ldots, w_{t-1})$$
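For instance, the first example above decomposes as (an illustrative expansion, not from the original post):

$$P(\text{I have a dream}) = P(\text{I}) \cdot P(\text{have} \mid \text{I}) \cdot P(\text{a} \mid \text{I have}) \cdot P(\text{dream} \mid \text{I have a})$$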
Neural network language model
A neural network language model (NNLM) learns distributed word representations (embeddings) and uses a neural network to estimate $P(w_t \mid w_1, \ldots, w_{t-1})$; recurrent variants can, in principle, condition on the entire preceding context rather than a fixed window.
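As a minimal sketch of a feed-forward NNLM in PyTorch, in the spirit of Bengio et al. (2003); the class name, layer sizes, and hyperparameters below are illustrative assumptions, not taken from the original post:

```python
import torch
import torch.nn as nn

class NNLM(nn.Module):
    """Feed-forward NNLM sketch: embed a fixed-size context, predict the next word."""
    def __init__(self, vocab_size, embed_dim=64, context=3, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)   # word embeddings
        self.fc1 = nn.Linear(context * embed_dim, hidden)  # hidden layer
        self.fc2 = nn.Linear(hidden, vocab_size)           # output scores

    def forward(self, ctx_ids):             # ctx_ids: (batch, context)
        e = self.embed(ctx_ids).flatten(1)  # concatenate context embeddings
        h = torch.tanh(self.fc1(e))
        return self.fc2(h)                  # logits over the next word

model = NNLM(vocab_size=10_000)
logits = model(torch.randint(0, 10_000, (2, 3)))  # 2 samples, 3 context words
probs = torch.softmax(logits, dim=-1)             # P(w_t | previous 3 words)
```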


N-gram language model
An n-gram model estimates the conditional probability under a Markov assumption: the influence of words at distance greater than or equal to n is ignored, so each word is conditioned only on the previous n−1 words. Estimating this n-gram conditional probability as a ratio of frequency counts gives:

$$P(w_t \mid w_{t-n+1}, \ldots, w_{t-1}) = \frac{\mathrm{count}(w_{t-n+1}, \ldots, w_{t-1}, w_t)}{\mathrm{count}(w_{t-n+1}, \ldots, w_{t-1})}$$
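As a concrete sketch of this counting approach, here is a minimal maximum-likelihood bigram (n = 2) estimator in Python; the toy corpus is invented for illustration:

```python
from collections import defaultdict

# Toy corpus (invented for illustration); real models need far more data.
corpus = [["I", "have", "a", "dream"],
          ["I", "have", "a", "cat"]]

bigram_count = defaultdict(int)   # count(w1 w2)
context_count = defaultdict(int)  # count(w1)

for sent in corpus:
    for w1, w2 in zip(sent, sent[1:]):
        bigram_count[(w1, w2)] += 1
        context_count[w1] += 1

def cond_prob(w2, w1):
    """MLE estimate of P(w2 | w1) = count(w1 w2) / count(w1)."""
    if context_count[w1] == 0:
        return 0.0
    return bigram_count[(w1, w2)] / context_count[w1]

print(cond_prob("a", "have"))   # 1.0  ("have" is always followed by "a")
print(cond_prob("dream", "a"))  # 0.5  ("a" is followed by "dream" or "cat")
```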
Relationship between the NN language model and the statistical language model
Similarity: both treat a sentence as a sequence of words and compute the probability of that sentence.
Differences:
(1) How the probability is computed: N-gram relies on the Markov assumption and conditions only on the previous n−1 words; the NNLM can take the context of the whole sentence into account.
(2) How the model is trained: N-gram parameters are maximum-likelihood estimates obtained from word counts; the NNLM is trained with neural-network optimization methods (e.g. for an RNN-based model).
(3) A recurrent neural network can store context of arbitrary length in its hidden state, and is not restricted to the fixed window of the N-gram model.
Text vectorization
Text vectorization represents text as a set of vectors that capture its semantics. Commonly used vectorization algorithms include one-hot, TF-IDF, word2vec (CBOW, Skip-gram), and doc2vec/str2vec (DM, DBOW).
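As a quick sketch of one of these schemes, TF-IDF with scikit-learn; the two toy documents are invented for illustration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["the cat sat on the mat",
        "the dog sat on the log"]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(docs)         # sparse (2, vocab_size) TF-IDF matrix
print(vectorizer.get_feature_names_out())  # learned vocabulary
print(X.toarray().round(2))                # one weighted vector per document
```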
word2vec - CBOW Model
The training input of the CBOW model is the word vectors of the context words around a target word, and the output is the word vector of that target word.
For example, with a context window of size 4 and the target word "Learning" (whose word vector is the desired output), the context consists of 8 words, 4 before and 4 after, and these 8 words are the model's input. Because CBOW uses a bag-of-words assumption, all 8 context words are treated equally: their distance to the target word is ignored, as long as they fall within the window.
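A minimal sketch of CBOW training with gensim's Word2Vec, where sg=0 selects CBOW; the toy corpus and hyperparameters are illustrative assumptions:

```python
from gensim.models import Word2Vec

# Tiny tokenized corpus (illustrative); real training needs a large corpus.
sentences = [["we", "are", "studying", "machine", "learning", "every", "day"],
             ["deep", "learning", "is", "a", "branch", "of", "machine", "learning"]]

# sg=0 selects CBOW: the 2*window context words predict the centre word.
cbow = Word2Vec(sentences, vector_size=100, window=4, min_count=1, sg=0)

print(cbow.wv["learning"][:5])           # learned word vector (first 5 dims)
print(cbow.wv.most_similar("learning"))  # nearest neighbours in vector space
```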
word2vec - Skip-gram Model
The Skip-gram model is the reverse of CBOW: the input is the word vector of the target word, and the output is the word vectors of its context words. In the example above, with a context size of 4, the target word "Learning" is the input, and the 8 context words are the output.
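With gensim, switching to Skip-gram is a one-parameter change, sg=1 (again an illustrative sketch):

```python
from gensim.models import Word2Vec

sentences = [["we", "are", "studying", "machine", "learning", "every", "day"],
             ["deep", "learning", "is", "a", "branch", "of", "machine", "learning"]]

# sg=1 selects Skip-gram: the centre word predicts each of its context words.
skipgram = Word2Vec(sentences, vector_size=100, window=4, min_count=1, sg=1)
print(skipgram.wv.most_similar("machine"))
```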
doc2vec - DM Model

Each paragraph is represented as a vector, a column of a matrix D, and each word is represented as a vector, a column of a matrix W. The paragraph vector and the context word vectors are averaged or concatenated to predict the next word in the context.
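A minimal sketch of the DM model with gensim's Doc2Vec, where dm=1 selects PV-DM; the documents, tags, and parameters are illustrative assumptions:

```python
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

docs = [TaggedDocument(words=["machine", "learning", "is", "fun"], tags=["d0"]),
        TaggedDocument(words=["natural", "language", "processing", "basics"], tags=["d1"])]

# dm=1 selects Distributed Memory: the paragraph vector is combined with the
# context word vectors to predict the next word.
dm = Doc2Vec(docs, vector_size=50, window=2, min_count=1, dm=1, epochs=40)

print(dm.dv["d0"][:5])                           # paragraph vector of doc d0
print(dm.infer_vector(["machine", "learning"]))  # vector for unseen text
```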
doc2vec - DBOW Model

At each iteration of stochastic gradient descent, the model samples a text window, then randomly samples a word from that window, forming a classification task: predict that word given the paragraph vector. This model is analogous to Skip-gram.
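And the DBOW variant, selected with dm=0 in gensim's Doc2Vec (again an illustrative sketch):

```python
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

docs = [TaggedDocument(words=["machine", "learning", "is", "fun"], tags=["d0"]),
        TaggedDocument(words=["natural", "language", "processing", "basics"], tags=["d1"])]

# dm=0 selects DBOW: the paragraph vector alone predicts words randomly
# sampled from the document, analogous to Skip-gram.
dbow = Doc2Vec(docs, vector_size=50, min_count=1, dm=0, epochs=40)
print(dbow.dv["d1"][:5])
```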