Hidden Markov model (HMM) learning notes
2022-07-07 07:12:00 【Wsyoneself】
- Markov chain: at any time t, the value of the observation variable depends only on the state variable at that time, and has nothing to do with the state variables or observation variables at other times; meanwhile, the current state depends only on the state at the previous moment and on nothing else.
- From these Markov assumptions we obtain the joint probability distribution of the HMM: for a state sequence I = (i_1, …, i_T) and observation sequence O = (o_1, …, o_T), P(O, I | λ) = π_{i_1} · b_{i_1}(o_1) · ∏_{t=2..T} a_{i_{t-1} i_t} · b_{i_t}(o_t), where π is the initial state distribution, a the state transition probabilities and b the output observation probabilities.
- A model must contain parameters; the essence of machine learning is to find the set of optimal parameters that makes the model fit the data best.
- Parameters of an HMM: the state transition probabilities (the probability of the next state given the current state) and the output observation probabilities (the probability of the observed value given the current state), together with the initial state distribution.
- Three basic problems:
  - Probability calculation: given the model and an observation sequence, compute the probability that the observation sequence occurs.
  - Learning: given an observation sequence, estimate the model parameters so that the probability of the observation sequence is maximized.
  - Prediction: given the model and an observation sequence, find the hidden state sequence I that maximizes the conditional probability P(I|O).
- Algorithms:
  - Forward algorithm: given the hidden Markov model, the forward probability is the probability of the observation sequence up to time t together with the state at time t being q_i. It is computed iteratively starting from t=1, using the Markov assumptions (my understanding: to move from time t to time t+1, multiply each state's probability by the corresponding state transition probability and sum; since the result must produce the observed value, we finally also multiply by the observation probability). A sketch of these recursions appears after this list.
  - Backward algorithm: given the hidden Markov model, the backward probability is the probability of the observation sequence from time t+1 to T, given that the state at time t is q_i. Set the probability at the last moment to 1, then recurse backwards in time.
  - Baum-Welch algorithm: if the sample data has no labels, the training data contains only the observation sequence O and the corresponding state sequence I is unknown; the hidden Markov model is then a probabilistic model with hidden variables.
The essence of parameter learning is still EM. The basic idea of EM is to plug an initial estimate of the parameters into the likelihood function, maximize that function (usually by taking the derivative and setting it to 0) to obtain new parameter estimates, and repeat until convergence.
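Below is a minimal NumPy sketch (my own illustration, not from the original notes) that ties the three algorithms above together on a single discrete observation sequence: the forward and backward recursions compute alpha and beta, and each Baum-Welch (EM) iteration re-estimates pi, A and B from the resulting expected counts. Scaling or log-space computation, which a real implementation needs for long sequences, is omitted for brevity.

```python
import numpy as np

def baum_welch(obs, n_states, n_symbols, n_iter=20, seed=0):
    """Baum-Welch (EM) sketch for a discrete HMM, trained on one sequence."""
    obs = np.asarray(obs)
    T = len(obs)
    rng = np.random.default_rng(seed)
    # Random initialization; each row is normalized into a probability distribution.
    pi = rng.random(n_states); pi /= pi.sum()
    A = rng.random((n_states, n_states)); A /= A.sum(axis=1, keepdims=True)
    B = rng.random((n_states, n_symbols)); B /= B.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Forward: alpha[t, i] = P(o_1..o_t, state_t = i)
        alpha = np.zeros((T, n_states))
        alpha[0] = pi * B[:, obs[0]]
        for t in range(1, T):
            alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        # Backward: beta[t, i] = P(o_{t+1}..o_T | state_t = i), with beta at the last moment set to 1
        beta = np.zeros((T, n_states))
        beta[T - 1] = 1.0
        for t in range(T - 2, -1, -1):
            beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
        evidence = alpha[T - 1].sum()      # P(O | current parameters)
        gamma = alpha * beta / evidence    # gamma[t, i] = P(state_t = i | O)
        xi = np.zeros((T - 1, n_states, n_states))
        for t in range(T - 1):             # xi[t, i, j] = P(state_t = i, state_{t+1} = j | O)
            xi[t] = alpha[t, :, None] * A * B[:, obs[t + 1]] * beta[t + 1] / evidence
        # M-step: turn expected counts into new parameter estimates.
        pi = gamma[0]
        A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
        for k in range(n_symbols):
            B[:, k] = gamma[obs == k].sum(axis=0)
        B /= gamma.sum(axis=0)[:, None]
    return pi, A, B

# Toy usage: 2 hidden states, 3 observation symbols.
pi, A, B = baum_welch([0, 1, 2, 2, 1, 0, 0, 1], n_states=2, n_symbols=3)
```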
Viterbi algorithm: a dynamic programming algorithm used to find the hidden state sequence (the Viterbi path) that most likely produced the observed sequence of events.
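Assuming the same pi / A / B layout as in the sketch above, here is a short Viterbi sketch (again my own illustration, kept in probability space; log probabilities would be used in practice):

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Most probable hidden state path for a discrete HMM (dynamic programming)."""
    obs = np.asarray(obs)
    T, N = len(obs), len(pi)
    delta = np.zeros((T, N))           # delta[t, j]: best path probability ending in state j at time t
    psi = np.zeros((T, N), dtype=int)  # psi[t, j]: predecessor state on that best path
    delta[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] * A   # scores[i, j]: extend the best path ending in i by i -> j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * B[:, obs[t]]
    # Backtrack from the best final state.
    path = np.zeros(T, dtype=int)
    path[T - 1] = delta[T - 1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]
    return path, delta[T - 1].max()

# Toy usage with hand-picked parameters.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
print(viterbi([0, 1, 2, 2], pi, A, B))
```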
Generalization:
Given the model and observations, the Forward algorithm can compute the probability of observing a particular sequence under the model; the Viterbi algorithm can compute the most likely internal state sequence; the Baum-Welch algorithm can be used to train the HMM. When there is enough training data, Baum-Welch is used to work out the HMM's state transition probabilities and observation probability functions, and then the Viterbi algorithm can compute the most likely phoneme sequence behind each input utterance. But if the amount of data is limited, smaller HMMs are often trained first to recognize individual monophones or triphones, and these small HMMs are then strung together to recognize continuous speech.
For speech synthesis, given a string of phonemes, the database is searched for the small HMMs that best match this string, and they are strung together into one long HMM that represents the whole sentence. From this combined HMM, the sequence of speech parameters most likely to be observed is computed; what remains is to generate speech from that parameter sequence. This is a simplified view of the whole process. The main problem is that the speech parameters generated this way are discontinuous, because the states of an HMM are discrete. To solve this, Keiichi Tokuda borrowed the dynamic parameters widely used in speech recognition (the first and second derivatives of the parameters) and introduced them into parameter generation for speech synthesis, greatly improving the coherence of the generated speech. The key is to use the hidden state, such as grammar and word-usage habits, to infer outputs with higher probability.
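As a side illustration of the "dynamic parameters" mentioned above (my own sketch; real systems typically compute deltas with regression windows rather than a plain gradient), delta and delta-delta features are simply the first and second time derivatives appended to the static parameter trajectory:

```python
import numpy as np

def add_delta_features(params):
    """Append first and second derivatives (delta, delta-delta) to a parameter trajectory.
    params: array of shape (T, D), one frame of static speech parameters per row."""
    delta = np.gradient(params, axis=0)   # first derivative over time
    delta2 = np.gradient(delta, axis=0)   # second derivative over time
    return np.concatenate([params, delta, delta2], axis=1)

# Toy usage: 5 frames of a 2-dimensional parameter trajectory.
traj = np.array([[0.0, 1.0], [0.2, 1.1], [0.5, 1.3], [0.9, 1.2], [1.0, 1.0]])
print(add_delta_features(traj).shape)  # (5, 6)
```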
- When the model parameters are known and the observation sequence is O, the hidden states can be any states; summing, over every possible state sequence, the probability of observing O under that state sequence gives the probability of the observation sequence.
- For supervised learning, the state transition probabilities and output observation probabilities can be computed directly from the data by counting (for word segmentation, the observation sequence corresponds to the text of the sentence, and the hidden states correspond to the label of each word in the sentence); see the counting sketch after this list.
- In speech recognition, the observation sequence is the speech and the hidden states are the text; the job of speech recognition is to convert speech into the corresponding words.
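A tiny counting sketch of this supervised case (the helper name estimate_hmm_by_counting and the B/E/S-style toy labels are illustrative assumptions, not from the notes): relative frequencies of labeled transitions and emissions give the HMM parameters directly; in practice smoothing would be added for unseen pairs.

```python
from collections import Counter, defaultdict

def estimate_hmm_by_counting(tagged_sentences):
    """Supervised estimation sketch: count (state, next_state) and (state, observation)
    pairs from labeled sequences, then normalize the counts into probabilities."""
    init, trans, emit = Counter(), defaultdict(Counter), defaultdict(Counter)
    for sent in tagged_sentences:          # sent is a list of (observation, state) pairs
        states = [s for _, s in sent]
        init[states[0]] += 1
        for obs, state in sent:
            emit[state][obs] += 1
        for prev, cur in zip(states, states[1:]):
            trans[prev][cur] += 1
    normalize = lambda c: {k: v / sum(c.values()) for k, v in c.items()}
    pi = normalize(init)
    A = {s: normalize(c) for s, c in trans.items()}
    B = {s: normalize(c) for s, c in emit.items()}
    return pi, A, B

# Toy word-segmentation style data: characters labeled B (begin) / E (end) / S (single).
data = [[("中", "B"), ("国", "E"), ("人", "S")],
        [("学", "B"), ("习", "E")]]
print(estimate_hmm_by_counting(data))
```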
Reference resources:
「Hidden Markov Model」(HMM-Based): how is it applied in speech synthesis? - Zhihu (zhihu.com)