From Bayesian filter to Kalman filter (2)
2022-07-28 19:14:00 【Learn something】
In the previous post we introduced the discrete Bayesian filtering formula and the basic principle of Bayes' rule. Next we introduce Bayesian filtering for continuous random variables, which is what we will use most often from here on.
The distribution of a continuous random variable is usually expressed by a probability density, so we set up the following notation:
X is the predicted value and Y is the measured value; both are random variables. Lowercase x, y are the corresponding concrete values.
$f_X(x)$ represents the probability density of the predicted-value random variable X and characterizes its probability distribution: the prior probability density.
$f_Y(y)$ represents the probability density of the measured-value random variable Y: it describes how plausible each possible measured value is.
$f_{Y|X}(y \mid x)$ is the likelihood probability density. Under the premise that X = x (take the value X = x, i.e., treat the guess as the true value), it describes the distribution of the measured-value random variable Y; the corresponding distribution function $F_{Y|X}(y \mid x)$ is the probability that Y < y.
$f_{X|Y}(x \mid y)$ is the posterior probability density. Under the premise that the measured value is Y = y, it describes how confident we are (as a probability) that the predicted value satisfies X < x, via the distribution function $F_{X|Y}(x \mid y)$.
We now directly state the Bayesian filtering formula for continuous random variables (the detailed derivation comes later in this post; consult it as needed):

$$f_{X|Y}(x \mid y) = \frac{f_{Y|X}(y \mid x)\, f_X(x)}{\int_{-\infty}^{+\infty} f_{Y|X}(y \mid x)\, f_X(x)\, \mathrm{d}x}$$

The denominator is the marginal probability density $f_Y(y)$; since it does not depend on x, it is generally treated as a constant.
A review of random variables from probability theory
One-dimensional random variables
Distribution function: $F(x) = P(X < x)$
Discrete: distribution column and distribution function
Distribution column: $P(X = x_k) = p_k,\ k = 1, 2, \dots$
Distribution function: $F(x) = \sum_{x_k < x} p_k$
The distribution function is the cumulative sum of the distribution column: the distribution column of a discrete random variable consists of isolated points, and the distribution function is a step function.
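A tiny worked example (my own, for concreteness): let X count the heads in one toss of a fair coin, so the distribution column is $P(X=0)=P(X=1)=\tfrac{1}{2}$, and the distribution function steps up at each point:

$$F(x) = \begin{cases} 0, & x \le 0 \\ \tfrac{1}{2}, & 0 < x \le 1 \\ 1, & x > 1 \end{cases}$$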
Continuous: probability density function and distribution function
Definition of the distribution function: $F(x) = P(X < x) = \int_{-\infty}^{x} f(t)\, \mathrm{d}t$
where f denotes the probability density function, which is the derivative of the distribution function.
How do we obtain the distribution function and the probability density function? By experiment? By summarizing and modeling?
Probability density: $f(x) = F'(x)$
Distribution function: $F(x) = \int_{-\infty}^{x} f(t)\, \mathrm{d}t$
The distribution function is obtained by integrating the probability density function.
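A quick example of the density/distribution pair (again my own choice): for the uniform distribution on $(0,1)$,

$$f(x) = \begin{cases} 1, & 0 < x < 1 \\ 0, & \text{otherwise} \end{cases} \qquad F(x) = \int_{-\infty}^{x} f(t)\,\mathrm{d}t = \begin{cases} 0, & x \le 0 \\ x, & 0 < x \le 1 \\ 1, & x > 1 \end{cases}$$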
Distribution of a function of a one-dimensional random variable
For example: given the distribution of X, find the distribution function of Y = g(X) for some function g.
At this point it has to be analyzed with the help of graphs. (I suddenly found that I had already forgotten this freshly reviewed material; apparently my understanding was not deep enough, and since it is rarely used, it slipped away.)
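A representative worked case of this type (the specific g here is my own choice): let $X \sim U(0,1)$ and $Y = 2X + 1$. Then

$$F_Y(y) = P(Y < y) = P(2X + 1 < y) = P\!\left(X < \tfrac{y-1}{2}\right) = F_X\!\left(\tfrac{y-1}{2}\right) = \begin{cases} 0, & y \le 1 \\ \tfrac{y-1}{2}, & 1 < y \le 3 \\ 1, & y > 3 \end{cases}$$

and differentiating gives $f_Y(y) = \tfrac{1}{2}$ on $(1,3)$. Sketching $F_X$ together with the map $y \mapsto \tfrac{y-1}{2}$ is exactly the "analyze with graphs" step.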
Two-dimensional random variables
$F(x, y) = P(X < x, Y < y)$: in fact this is equivalent to the intersection of two random variables (events) over a region.
Conditional probability for two-dimensional random variables: $f_{X|Y}(x \mid y) = \dfrac{f_{X,Y}(x, y)}{f_Y(y)}$
(To be added.)
Derivation of the Bayes formula
The probability that the predicted value satisfies X < x, given that the measured value is Y = y:

$$F_{X|Y}(x \mid y) = P(X < x \mid Y = y)$$

Find the posterior distribution function first; differentiating it then yields the posterior density.
Now let's look again at the temperature-measurement example:
1. Single point, continuous random variable
In the discrete case our subjective probability was split over only the two discrete points T = 10 and T = 11. This is obviously coarse, and a continuous range is more accurate, so we can guess like this:
We guess that the temperature at time 0 follows a normal distribution N(10, 1): the mean is 10 (the temperature fluctuates around 10) and the variance is 1 (the range of the fluctuation). Clearly, every value near 10 now counts, instead of only the two admissible values of the discrete case. The mean, the variance, and even the assumption that a normal model fits are all part of our guess! The variance reflects how much we trust the guessed mean: the smaller it is, the more we believe this mean. From this we get the distribution function and the probability density function of the subjective probability:

$$f_{X_0}(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{(x - 10)^2}{2}}, \qquad F_{X_0}(x) = \int_{-\infty}^{x} f_{X_0}(t)\, \mathrm{d}t$$
Next comes the value measured by the thermometer. At this moment a single datum is of course measured, for example y = 9.
For the likelihood model several shapes are possible, e.g. staircase type or normal type; here we take it to be normal.
The likelihood follows N(x, 0.2²), where 0.2 is the accuracy (standard deviation) of the thermometer and x is the predicted value at the current moment:

$$f_{Y|X}(y \mid x) = \frac{1}{\sqrt{2\pi}\cdot 0.2}\, e^{-\frac{(y - x)^2}{2 \cdot 0.2^2}}$$
Then the posterior probability distribution is obtained from the continuous Bayesian filtering formula:

$$f_{X_0|Y_0}(x \mid 9) = \frac{f_{Y|X}(9 \mid x)\, f_{X_0}(x)}{\int_{-\infty}^{+\infty} f_{Y|X}(9 \mid x)\, f_{X_0}(x)\, \mathrm{d}x}$$

where the denominator is the marginal probability density; that is to say, we need to evaluate an integral over the whole real line.
The end result is N(9.0385, 0.0385): a posterior mean of 9.0385 and a posterior variance of 0.0385 (standard deviation of about 0.196). This corrected model is called the posterior probability distribution.
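A quick numerical check of this result (a minimal sketch; the variable names are mine). It computes the posterior once with the closed-form conjugate-Gaussian formulas and once by brute-force numerical evaluation of the continuous Bayes formula on a grid:

```python
import numpy as np

mu0, var0 = 10.0, 1.0   # prior guess: X0 ~ N(10, 1)
var_r = 0.2 ** 2        # thermometer accuracy: std 0.2, i.e. variance 0.04
y = 9.0                 # measured value

# Closed-form conjugate-Gaussian update
mu_post = (var_r * mu0 + var0 * y) / (var0 + var_r)
var_post = var0 * var_r / (var0 + var_r)
print(mu_post, var_post)  # -> 9.03846..., 0.03846...

# Brute force: continuous Bayes formula on a grid (integral as a Riemann sum)
x = np.linspace(0.0, 20.0, 20001)
dx = x[1] - x[0]
prior = np.exp(-(x - mu0) ** 2 / (2 * var0)) / np.sqrt(2 * np.pi * var0)
likelihood = np.exp(-(y - x) ** 2 / (2 * var_r)) / np.sqrt(2 * np.pi * var_r)
unnormalized = likelihood * prior
posterior = unnormalized / (unnormalized.sum() * dx)  # denominator = marginal
print((x * posterior).sum() * dx)                     # posterior mean ~ 9.0385
print(((x - mu_post) ** 2 * posterior).sum() * dx)    # posterior variance ~ 0.0385
```

Both routes agree, which is exactly why the "infinite integral" in the denominator is harmless here: for a Gaussian prior and Gaussian likelihood it has a closed form.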
Summary: we guessed that the distribution of the temperature random variable X0 at this time is normal, the likelihood model also belongs to the normal family, and the corrected distribution obtained from the continuous Bayesian filtering formula is still normal. However, what you guess is not necessarily a normal distribution; it could be something else.
2. Multiple points, continuous random variables
Similarly, for multiple random variables we just connect them with recurrence relations. It is basically the same as the discrete case, except that the distribution column becomes a probability density.
Ingredients:
1. Recurrence relations between the random variables:

$$X_k = f(X_{k-1}) + Q_k \qquad \text{(state equation)}$$

$$Y_k = h(X_k) + R_k \qquad \text{(the relationship between the predicted value and the measured value)}$$

Here f denotes the functional relationship between adjacent random variables; it is not the same thing as the f used below for probability densities, so don't confuse them. h denotes the functional relationship between the random variable corresponding to the true value and the random variable corresponding to the measured value.
2. Assume that X0, Q1, ..., Qk, R1, ..., Rk are mutually independent.
3. We have the observations y1, y2, ..., yk.
4. Set the initial value and the distribution laws:

$$X_0 \sim f_{X_0}(x), \qquad Q_k \sim f_{Q_k}(x), \qquad R_k \sim f_{R_k}(x)$$

5. An important theorem: the condition in a conditional probability can be used for logical deduction (given the condition X0 = v, we may substitute v for X0 inside the expression).
The process-noise random variables at each time: Q1, Q2, ..., Qk. That is, the noise disturbing each prediction may be different (in the Kalman filter these noises are simply taken to be identical).
What they express: the value at the current time is predicted from the data of the previous time, and the prediction has an error.
The observation-noise random variables at each time: R1, R2, ..., Rk. That is, the noise disturbing each measurement may be different (in the Kalman filter these are likewise taken to be identical).
What they express: in effect they reflect the accuracy of the sensor. Even when the true value is known, the sensor does not measure exactly the true value; there is an error.
Probability density function of the process noise at each time: $f_{Q_k}(x)$
Probability density function of the observation noise at each time: $f_{R_k}(x)$
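To make these ingredients concrete before the derivation, here is a minimal grid-based setup in Python (a sketch; the particular f, h, noise variances, and names are my own illustrative choices, not from the original post):

```python
import numpy as np

# Grid on which all densities are represented numerically
x = np.linspace(-20.0, 40.0, 2001)

def gaussian_pdf(z, mean, var):
    return np.exp(-(z - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

# State equation X_k = f(X_{k-1}) + Q_k and observation Y_k = h(X_k) + R_k
f = lambda v: v                              # e.g., temperature assumed constant
h = lambda u: u                              # e.g., sensor reads the state directly

f_Q = lambda q: gaussian_pdf(q, 0.0, 0.5)    # process-noise density f_Q(q)
f_R = lambda r: gaussian_pdf(r, 0.0, 0.04)   # observation-noise density f_R(r)

prior0 = gaussian_pdf(x, 10.0, 1.0)          # initial belief X0 ~ N(10, 1)
```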
Filtering:
1. Prediction step
What we need is the probability density. One method is to find the distribution function first and then differentiate it to obtain the probability density function (another method is convolution).
What we know is the probability density of X0; now we want to predict the probability density of X1. First find its distribution function:

$$
\begin{aligned}
F_{X_1}(x) = P(X_1 < x) &= \int_{-\infty}^{+\infty} P(X_1 < x \mid X_0 = v)\, f_{X_0}(v)\, \mathrm{d}v \\
&= \int_{-\infty}^{+\infty} P\big(f(X_0) + Q_1 < x \mid X_0 = v\big)\, f_{X_0}(v)\, \mathrm{d}v \\
&= \int_{-\infty}^{+\infty} P\big(Q_1 < x - f(v) \mid X_0 = v\big)\, f_{X_0}(v)\, \mathrm{d}v \\
&= \int_{-\infty}^{+\infty} P\big(Q_1 < x - f(v)\big)\, f_{X_0}(v)\, \mathrm{d}v \\
&= \int_{-\infty}^{+\infty} F_{Q_1}\big(x - f(v)\big)\, f_{X_0}(v)\, \mathrm{d}v
\end{aligned}
$$
Here a continuous object (the integral) is treated by splitting it into discrete points (a sum) and operating on those (refer to the definition of the integral). As for why: presumably because Bayes' most basic starting point is the formula over discrete points.
Besides, for a continuous distribution the probability of a single point is actually zero!!! So why can it be used as a condition here??? (Strictly, conditioning on X0 = v should be read through the density, i.e. as f_{X0}(v) dv.)
The first step is obviously the total probability formula, but why is it allowed? Are X1 and X0 independent? Why can X0 serve as the condition for X1?
In terms of time, first there is X0 and then X1 comes out of it. Or, put differently: the total probability formula can be used over any complete set of events, regardless of whether the two variables are related.
The step that turns f(X0) into f(v) is the logical deduction based on the condition in the conditional probability: because the condition says X0 = v, we have f(X0) = f(v); adding the same number to both sides of the inequality keeps it valid. This is where the state equation gets worked in.
The next step is a change of the variable's role: the random variable Q1 takes over from X1.
The basis of the following step is that the random variables Q1 and X0 are mutually independent, so they do not affect each other and the condition X0 = v can simply be dropped. (But the earlier logical deduction used X0 = v as a precondition; how can it now become dispensable? Because after that deduction the event involves only Q1 and the fixed number f(v), and Q1 is independent of X0.)
The last step again refers to the definition of the integral. Note: we integrate over v; v traverses all its values, and the integration eliminates v while the variable u remains (once the distribution function is rewritten as an integral over u, as below). There is one more trick:
Here f is a probability density function and F is a distribution function; differentiating F gives f. Writing the distribution function as an integral of its density, $F_{Q_1}(x - f(v)) = \int_{-\infty}^{x} f_{Q_1}(u - f(v))\, \mathrm{d}u$, we obtain a double integral. Note: in the final formula only the variable x remains; u and v no longer appear because of the integration. Thus the distribution function of the prediction step is obtained:

$$F_{X_1}(x) = \int_{-\infty}^{+\infty} \int_{-\infty}^{x} f_{Q_1}\big(u - f(v)\big)\, f_{X_0}(v)\, \mathrm{d}u\, \mathrm{d}v$$

$f_{X_0}$ and $f_{Q_1}$ are conditions we know, so doing the double integral gives the result (it is just that the integrals run over an infinite range, which is troublesome for a computer).
The final step differentiates the distribution function to obtain the probability density function; it involves the rule for differentiating an integral with a variable upper limit:

$$f_{X_1}^-(x) = \frac{\mathrm{d}}{\mathrm{d}x} F_{X_1}(x) = \int_{-\infty}^{+\infty} f_{Q_1}\big(x - f(v)\big)\, f_{X_0}(v)\, \mathrm{d}v$$
To sum up: from the known $f_{X_0}(v)$ and $f_{Q_1}$, the probability density function of the predicted value is derived,

$$f_{X_1}^-(x) = \int_{-\infty}^{+\infty} f_{Q_1}\big(x - f(v)\big)\, f_{X_0}(v)\, \mathrm{d}v$$

and we just have to do the integral.
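A numerical version of this prediction step, continuing the grid setup sketched earlier (names are mine; for f(v) = v the formula reduces to a convolution of the prior with the noise density):

```python
def predict(prior_pdf, x, f, f_Q):
    """Prediction step on a grid:
    f_X1_minus(x_i) = sum over v of f_Q(x_i - f(v)) * prior_pdf(v) * dx"""
    dx = x[1] - x[0]
    kernel = f_Q(x[:, None] - f(x)[None, :])  # rows: target x, columns: v
    return kernel @ prior_pdf * dx

# Usage with the earlier setup:
# prior1 = predict(prior0, x, f, f_Q)
```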
2. Update step
The prediction step only has to produce $f_{X_1}^-(x)$, and that is all; what it uses directly is the definition and properties of probability, plus the known probability density of the initial value X0, the prediction noise, and so on. The update step is what reflects the core of Bayesian filtering: updating with the Bayes formula to find the posterior.
First find the likelihood. The likelihood contains two variables; among them x is formal (it can be regarded as a constant), while y is the real independent variable, so the likelihood is a function of the measured value y. Here y is the definite value obtained by measurement, and the corresponding likelihood value is then determined. This time we have not chosen a specific model to describe the likelihood (such as a normal distribution) but give a more general derivation. (Besides, the derivation here differs slightly from the one above: here we start from the definition of the derivative, while above we started from the distribution function.)

$$
\begin{aligned}
f_{Y_1|X_1}(y \mid x) &= \lim_{\epsilon \to 0^+} \frac{P(y \le Y_1 < y + \epsilon \mid X_1 = x)}{\epsilon} \\
&= \lim_{\epsilon \to 0^+} \frac{P\big(y \le h(X_1) + R_1 < y + \epsilon \mid X_1 = x\big)}{\epsilon} \\
&= \lim_{\epsilon \to 0^+} \frac{P\big(y - h(x) \le R_1 < y - h(x) + \epsilon \mid X_1 = x\big)}{\epsilon} \\
&= \lim_{\epsilon \to 0^+} \frac{P\big(y - h(x) \le R_1 < y - h(x) + \epsilon\big)}{\epsilon} \\
&= f_{R_1}\big(y - h(x)\big)
\end{aligned}
$$

The first step is based on the definition of the derivative. One might ask: isn't the accuracy of the sensor constant, so how can the likelihood differ for different measured temperatures? The accuracy is mainly reflected in the variance, not in where the distribution is centered.
The second step is based on the observation equation; the third is based on the properties of continuous random variables and the logical deduction with the condition of the conditional probability.
The fourth step replaces the variable's role; moreover, R1 and X1 are mutually independent, so the condition X1 = x can be dropped directly (similar to the earlier situation: the condition is used first and discarded afterwards).
The last step is again based on the definition of the derivative.
Thus the likelihood probability density is obtained.
Next, substitute the probability density of the predicted value and the likelihood probability density into the Bayes formula:

$$f_{X_1}^+(x) = f_{X_1|Y_1}(x \mid y_1) = \frac{f_{R_1}\big(y_1 - h(x)\big)\, f_{X_1}^-(x)}{\int_{-\infty}^{+\infty} f_{R_1}\big(y_1 - h(x)\big)\, f_{X_1}^-(x)\, \mathrm{d}x}$$

Once again: $f_{X_1}^-$ describes the probability density of the random variable X1 and is not the same thing as the single f that denotes the functional relationship between adjacent random variables; $f_{R_1}(y_1 - h(x))$ is the probability density of the likelihood.
In this way the predicted value is successfully corrected with the help of the observed value and the Bayes formula, and the posterior probability distribution is obtained. The optimal value of the random variable can then be obtained by computing the mean of this distribution.
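The matching numerical update step for the grid sketch (same caveats as before: names and setup are my own):

```python
def update(pred_pdf, x, y, h, f_R):
    """Update step on a grid:
    f_X1_plus(x) = f_R(y - h(x)) * f_X1_minus(x) / normalizing integral"""
    dx = x[1] - x[0]
    unnormalized = f_R(y - h(x)) * pred_pdf
    return unnormalized / (unnormalized.sum() * dx)

# Usage: posterior1 = update(prior1, x, 9.0, h, f_R)
```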
3. Iteration
The posterior probability distribution at the current time is the basis for finding the prior probability distribution at the next time; combining it with the state equation yields the prior probability distribution at the next moment. In formulas (the derivation is the same as in the prediction step above):

$$f_{X_{k+1}}^-(x) = \int_{-\infty}^{+\infty} f_{Q_{k+1}}\big(x - f(v)\big)\, f_{X_k}^+(v)\, \mathrm{d}v$$

In this way we can iterate, and the random variable at each time is filtered automatically.
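Putting the two steps together gives the full recursive filter; here is a loop over the grid versions above (the measurement sequence is placeholder data of my own):

```python
def bayes_filter(prior_pdf, x, ys, f, h, f_Q, f_R):
    """Run predict + update for each measurement; return the final
    posterior density and the posterior mean at every time step."""
    dx = x[1] - x[0]
    pdf, means = prior_pdf, []
    for y in ys:
        pdf = predict(pdf, x, f, f_Q)    # prediction step: prior for this time
        pdf = update(pdf, x, y, h, f_R)  # update step: posterior after seeing y
        means.append((x * pdf).sum() * dx)
    return pdf, means

# Example: noisy readings around a constant true temperature of 10
# _, means = bayes_filter(prior0, x, [9.0, 10.3, 9.8, 10.1], f, h, f_Q, f_R)
```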
Quite a few places are still a bit disorganized and wordy; I will supplement and streamline this later.