
Derivation of Kalman filter (KF) and extended Kalman filter (EKF)

2022-06-11 01:48:00 Captain xiaoyifeng

Background knowledge

The Kalman filter builds on Bayesian filtering and the Gaussian distribution, so let's go over these two topics first.

Bayesian filtering

First, some basic concepts and formulas are given

  • Joint distribution
    $p(x,y)=p(X=x,Y=y)$
    The probability that $x$ and $y$ occur simultaneously.
  • Conditional probability
    $p(x|y)=p(X=x|Y=y)$
    The probability that $x$ occurs given that $y$ has already occurred. Whenever $p(y)>0$ (the same applies below), we have
    $p(x|y)=\frac{p(x,y)}{p(y)}$
  • Prior probability
    Can be thought of as experience: the probability distribution over the state that the system estimates before the current data is read.
  • Posterior probability
    After reading the data (after the observation), the probability distribution over the state obtained by combining the prior probability with the observation probability.
  • Law of total probability
    $p(x)=\sum_{y}p(x|y)p(y)$ (discrete case)
    $p(x)=\int p(x|y)p(y)\,dy$ (continuous case)
  • Bayes rule
    $p(x|y)=\frac{p(y|x)p(x)}{p(y)}=\frac{p(y|x)p(x)}{\sum_{x'}p(y|x')p(x')}$ (discrete case)
    $p(x|y)=\frac{p(y|x)p(x)}{p(y)}=\frac{p(y|x)p(x)}{\int p(y|x')p(x')\,dx'}$ (continuous case)
    Here $x$ is called the state and $y$ the data, $p(x)$ is called the prior probability, $p(y|x)$ is called the "inverse" conditional probability, and $p(y)^{-1}$ does not depend on $x$, so it is often treated as a normalization coefficient $\eta$.
    Note that, in general, $x$ acts as the independent variable throughout the computation: $p$ is the probability distribution function of $x$ (a Gaussian in the Kalman filter, possibly a piecewise function in other cases), not the probability of one specific $x$. During the state transition, the predicted distribution is obtained as a weighted integral/sum over all possible values of $x$.
  • Completeness / Markov property
    Suppose the state $x_t$ is the best predictor of the future, i.e., all past control information is contained in it and nothing else from the past can influence the future (in other words, any past information the future depends on must enter through $x_t$). Then $x_t$ is said to be complete. Bayesian filtering rests on this assumption.

Bayesian filtering is based on Bayes' rule. The basic approach is to combine the prior probability with an observation to obtain the posterior probability, i.e., a more accurate probability distribution.
The Bayes filter algorithm can be summarized in two (or three) lines:

$$
\begin{array}{l}
1:\ \overline{bel}(x_t)=\int p(x_t|u_t,x_{t-1})\,bel(x_{t-1})\,dx_{t-1} \quad (\text{continuous})\\
2:\ \overline{bel}(x_t)=\sum p(x_t|u_t,x_{t-1})\,bel(x_{t-1}) \quad (\text{discrete})\\
3:\ bel(x_t)=\eta\, p(z_t|x_t)\,\overline{bel}(x_t)
\end{array}
$$

where:

  • $\eta$ is the normalization factor; various constants are absorbed into it during the computation, so it keeps changing.
  • $x$ denotes the state.
  • $z$ denotes the observation data.
  • $u$ denotes the motion data recorded by the odometer.
  • $bel$, $\overline{bel}$ denote belief values: $bel(x_t)=p(x_t|z_{1:t},u_{1:t})$, i.e., the probability distribution of $x_t$ obtained after the observation, while $\overline{bel}$ denotes the belief predicted internally (before the observation).
  • $p(x_t|u_t,x_{t-1})$ denotes the state transition probability, i.e., the probability distribution function of $x_t$ given $x_{t-1}$ and $u_t$.
  • $p(z_t|x_t)$ nominally denotes the probability distribution of $z_t$ given $x_t$ (a function of $z$ parameterized by $x$); in practice, since $x$ is unknown and $z$ is known, $x$ is treated as the independent variable and $z$ as a constant during the computation.
  • Lines 1 and 2 are the prediction, also called the control update.
  • Line 3 is the observation update.

The algorithm rests on the completeness (Markov) assumption: it is what allows $p(x_t|u_{1:t},x_{0:t-1},z_{1:t-1})$ to be replaced by $p(x_t|u_t,x_{t-1})$. A minimal code sketch of these steps is given below.
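To make lines 1-3 concrete, here is a minimal sketch of a discrete Bayes filter in Python (NumPy). The one-dimensional grid, the transition matrix `p_trans`, and the sensor likelihood `p_obs` are made-up illustrations, not part of the original derivation.

```python
import numpy as np

def predict(bel, u, p_trans):
    """Control update: bel_bar(x_t) = sum_{x_{t-1}} p(x_t | u_t, x_{t-1}) * bel(x_{t-1})."""
    # p_trans[u] is an (N, N) matrix whose entry [i, j] = p(x_t = i | u, x_{t-1} = j)
    return p_trans[u] @ bel

def update(bel_bar, z, p_obs):
    """Observation update: bel(x_t) = eta * p(z_t | x_t) * bel_bar(x_t)."""
    unnormalized = p_obs[z] * bel_bar           # p_obs[z][i] = p(z | x = i)
    return unnormalized / unnormalized.sum()    # eta is just the normalizer

# Hypothetical example: 3 grid cells, the robot tries to move one cell to the right.
p_trans = {"right": np.array([[0.1, 0.0, 0.0],
                              [0.9, 0.1, 0.0],
                              [0.0, 0.9, 1.0]])}
p_obs = {"door": np.array([0.6, 0.3, 0.1])}     # sensor likelihood per cell

bel = np.array([1/3, 1/3, 1/3])                 # uniform prior bel(x_0)
bel = update(predict(bel, "right", p_trans), "door", p_obs)
print(bel)                                      # posterior after one predict/update cycle
```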

A brief derivation

Prediction

$$
\begin{aligned}
\overline{bel}(x_t)&=p(x_t|z_{1:t-1},u_{1:t})\\
&=\int p(x_t|x_{t-1},z_{1:t-1},u_{1:t})\,p(x_{t-1}|z_{1:t-1},u_{1:t})\,dx_{t-1}\\
&=\int p(x_t|x_{t-1},u_{t})\,bel(x_{t-1})\,dx_{t-1}
\end{aligned}
$$

Observation update

$$
\begin{aligned}
bel(x_t)=p(x_t|z_{1:t},u_{1:t})&=\frac{p(z_t|x_t,z_{1:t-1},u_{1:t})\,p(x_t|z_{1:t-1},u_{1:t})}{p(z_t|z_{1:t-1},u_{1:t})}\\
&=\eta\, p(z_t|x_t,z_{1:t-1},u_{1:t})\,p(x_t|z_{1:t-1},u_{1:t})\\
&=\eta\, p(z_t|x_t)\,\overline{bel}(x_t)
\end{aligned}
$$

Gaussian distribution (normal distribution)

In Bayesian filtering the state $x$ takes values from a discrete set, so Bayesian filtering by itself is not a directly usable algorithm. Gaussian filters, which build the Bayes filter on the Gaussian distribution, allow $x$ to take continuous values; the Kalman filter is one of them.

The Gaussian distribution is given by the following density:
$$p(x)=(2\pi\sigma^2)^{-\frac12}e^{-\frac12\frac{(x-\mu)^2}{\sigma^2}} \quad (x\ \text{a scalar})$$
where $\sigma^2$ is the variance and $\mu$ is the mean.
$$p(\underline{x})=\det(2\pi\Sigma)^{-\frac12}e^{-\frac12(\underline{x}-\underline{\mu})^T\Sigma^{-1}(\underline{x}-\underline{\mu})} \quad (\underline{x}\ \text{a vector})$$
where $\Sigma$ is the variance (covariance matrix) and $\underline{\mu}$ is the mean.
The Gaussian distribution is used because it occurs widely in nature and has good properties, although it also has shortcomings (it is unimodal). In the Kalman filter there is no single definite state: the measurement, the prediction, and the final corrected estimate are all represented as Gaussian distributions.

The integral (or sum) of a Gaussian density $p$ is 1; of course, this holds for every probability distribution function. A small sketch that evaluates the multivariate density above follows.
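As a small illustration, the sketch below evaluates the multivariate density above; the mean and covariance used here are arbitrary example values, not taken from the post.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Evaluate det(2*pi*Sigma)^(-1/2) * exp(-1/2 (x-mu)^T Sigma^{-1} (x-mu))."""
    d = x - mu
    norm = np.linalg.det(2 * np.pi * sigma) ** -0.5
    return norm * np.exp(-0.5 * d @ np.linalg.solve(sigma, d))

# Arbitrary example values (assumed for illustration only).
mu = np.array([0.0, 1.0])
sigma = np.array([[2.0, 0.3],
                  [0.3, 1.0]])
print(gaussian_pdf(np.array([0.5, 0.5]), mu, sigma))
```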

Fusion of Gaussian distributions

The derivation below was first done without using the following formulas, yet once they are used the derivation below almost seems redundant... I'll leave that as a gap to fill later.

  • The product of a Gaussian distribution with a Gaussian distribution is still (proportional to) a Gaussian distribution (obvious from the form of the density).

  • For the product of Gaussians $X=X_1X_2$, where
    $X_1\sim\mathcal{N}(\mu_1,\Sigma_1)$, $X_2\sim\mathcal{N}(\mu_2,\Sigma_2)$,
    the following result holds:

$$\left\{\begin{array}{l}K=\Sigma_1(\Sigma_1+\Sigma_2)^{-1}\\\mu=\mu_1+K(\mu_2-\mu_1)\\\Sigma=\Sigma_1-K\Sigma_1\end{array}\right.$$
where $K$ is the Kalman gain; the proof is omitted. A short code sketch of this fusion rule follows.
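A minimal sketch of this fusion rule, assuming both Gaussians are estimates of the same quantity; the one-dimensional numbers at the bottom are arbitrary example values.

```python
import numpy as np

def fuse(mu1, sigma1, mu2, sigma2):
    """Fuse two Gaussian estimates (mu1, Sigma1) and (mu2, Sigma2) using the rule above."""
    K = sigma1 @ np.linalg.inv(sigma1 + sigma2)   # the "Kalman gain"
    mu = mu1 + K @ (mu2 - mu1)
    sigma = sigma1 - K @ sigma1
    return mu, sigma

# Example: two noisy 1-D estimates of the same position (values assumed for illustration).
mu, sigma = fuse(np.array([1.0]), np.array([[4.0]]),
                 np.array([2.0]), np.array([[1.0]]))
print(mu, sigma)   # the fused estimate is pulled toward the more certain input
```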

Linear algebra background

Please review / preview this on your own.

Kalman filtering (KF)

The Kalman filter is based on a linearity assumption. We assume:
$$x_t=A_tx_{t-1}+B_tu_t+\varepsilon_t \quad (\text{state transition function})$$
where $\varepsilon_t$ is a Gaussian random vector with mean $0$ and covariance $R_t$.

$$z_t=C_tx_t+\delta_t \quad (\text{measurement function})$$
where $\delta_t$ is a Gaussian random vector with mean $0$ and covariance $Q_t$.

$$bel(x_0)=p(x_0) \quad (\text{initial belief})$$
where $p$ is Gaussian with mean $\mu_0$ and covariance $\Sigma_0$.

The Kalman filter algorithm can be written in the following few lines:

$$
\begin{array}{l}
1:\ \overline{\mu}_t=A_t\mu_{t-1}+B_tu_t \\
2:\ \overline{\Sigma}_t=A_t\Sigma_{t-1}A_t^{T}+R_t\\
3:\ K_t=\overline{\Sigma}_tC_t^T(C_t\overline{\Sigma}_tC_t^T+Q_t)^{-1}\\
4:\ \mu_t=\overline{\mu}_t+K_t(z_t-C_t\overline{\mu}_t)\\
5:\ \Sigma_t=(I-K_tC_t)\overline{\Sigma}_t
\end{array}
$$
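Before the derivation, here is a direct transcription of the five lines into NumPy as a minimal sketch; the constant-velocity toy model at the bottom (the particular `A`, `B`, `C`, `R`, `Q`) is an assumption for illustration, not something from the original post.

```python
import numpy as np

def kf_step(mu, sigma, u, z, A, B, C, R, Q):
    """One Kalman filter cycle, following lines 1-5 above."""
    # Lines 1-2: prediction (control update)
    mu_bar = A @ mu + B @ u
    sigma_bar = A @ sigma @ A.T + R
    # Line 3: Kalman gain
    K = sigma_bar @ C.T @ np.linalg.inv(C @ sigma_bar @ C.T + Q)
    # Lines 4-5: measurement update
    mu_new = mu_bar + K @ (z - C @ mu_bar)
    sigma_new = (np.eye(len(mu)) - K @ C) @ sigma_bar
    return mu_new, sigma_new

# Toy example (assumed): state = [position, velocity], control = acceleration, observe position.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.5 * dt**2], [dt]])
C = np.array([[1.0, 0.0]])
R = 0.01 * np.eye(2)            # process noise covariance R_t
Q = np.array([[0.25]])          # measurement noise covariance Q_t
mu, sigma = np.zeros(2), np.eye(2)
mu, sigma = kf_step(mu, sigma, np.array([1.0]), np.array([0.02]), A, B, C, R, Q)
print(mu, sigma)
```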

The derivation of these equations follows.

Mathematical derivation of the KF

Prediction

From the Bayes filter we have
$$\overline{bel}(x_t)=\int p(x_t|u_t,x_{t-1})\,bel(x_{t-1})\,dx_{t-1}$$
Substituting the Gaussian densities gives
$$\overline{bel}(x_t)=\eta\int e^{-L_t}\,dx_{t-1}$$
where
$$L_t=\frac12(x_t-A_tx_{t-1}-B_tu_t)^TR_t^{-1}(x_t-A_tx_{t-1}-B_tu_t)+\frac12(x_{t-1}-\mu_{t-1})^T\Sigma_{t-1}^{-1}(x_{t-1}-\mu_{t-1})$$
$L_t$ is a quadratic function of $x_t$ and also of $x_{t-1}$.
To avoid the integral, we split
$$L_t=L_t(x_{t-1},x_t)+L_t(x_t)$$
pulling out a term $L_t(x_t)$ that does not contain $x_{t-1}$:
$$\overline{bel}(x_t)=\eta\, e^{-L_t(x_t)}\int e^{-L_t(x_{t-1},x_t)}\,dx_{t-1}$$
Below, $L_t(x_{t-1},x_t)$ is constructed as a quadratic form in $x_{t-1}$ (by completing the square); as I understand it, doing so does not introduce any additional terms in $x_t$.
Computing the first and second derivatives of $L_t$ with respect to $x_{t-1}$ gives
$$L_t(x_{t-1},x_t)=\frac12\left(x_{t-1}-\Psi[A_t^TR_t^{-1}(x_t-B_tu_t)+\Sigma_{t-1}^{-1}\mu_{t-1}]\right)^T\Psi^{-1}\left(x_{t-1}-\Psi[A_t^TR_t^{-1}(x_t-B_tu_t)+\Sigma_{t-1}^{-1}\mu_{t-1}]\right)$$
where $\Psi=(A_t^TR_t^{-1}A_t+\Sigma_{t-1}^{-1})^{-1}$.

Again, because
$$\int \det(2\pi\Psi)^{-\frac12}e^{-L_t(x_{t-1},x_t)}\,dx_{t-1}=1$$
(the integrand is a normalized Gaussian density in $x_{t-1}$), it follows that
$$\int e^{-L_t(x_{t-1},x_t)}\,dx_{t-1}=\det(2\pi\Psi)^{\frac12}$$

Therefore
$$\overline{bel}(x_t)=\eta\, e^{-L_t(x_t)}$$
Now compute $L_t(x_t)$:
$$
\begin{aligned}
L_t(x_t)&=L_t-L_t(x_{t-1},x_t)\\
&=\dots \quad (\text{all terms containing } x_{t-1} \text{ cancel})\\
&=\frac12(x_t-B_tu_t)^TR_t^{-1}(x_t-B_tu_t)+\frac12\mu_{t-1}^T\Sigma_{t-1}^{-1}\mu_{t-1}\\
&\quad-\frac12[A_t^TR_t^{-1}(x_t-B_tu_t)+\Sigma_{t-1}^{-1}\mu_{t-1}]^T(A_t^TR_t^{-1}A_t+\Sigma_{t-1}^{-1})^{-1}[A_t^TR_t^{-1}(x_t-B_tu_t)+\Sigma_{t-1}^{-1}\mu_{t-1}]
\end{aligned}
$$
Although this is not written as a completed-square quadratic form in $x_t$, it is still a quadratic function of $x_t$; the leftover constant terms only affect the coefficient in front.
Taking the first and second derivatives gives the mean and variance of $x_t$:
$$
\begin{aligned}
\frac{\partial L_t(x_t)}{\partial x_t}&=R_t^{-1}(x_t-B_tu_t)-R_t^{-1}A_t(A_t^TR_t^{-1}A_t+\Sigma_{t-1}^{-1})^{-1}[A_t^TR_t^{-1}(x_t-B_tu_t)+\Sigma_{t-1}^{-1}\mu_{t-1}]\\
&=[\underline{R_t^{-1}-R_t^{-1}A_t(A_t^TR_t^{-1}A_t+\Sigma_{t-1}^{-1})^{-1}A_t^TR_t^{-1}}](x_t-B_tu_t)-R_t^{-1}A_t(A_t^TR_t^{-1}A_t+\Sigma_{t-1}^{-1})^{-1}\Sigma_{t-1}^{-1}\mu_{t-1}
\end{aligned}
$$

By the Sherman-Morrison (matrix inversion) formula (the proof is omitted because typing it out would take too long; hint: starting from $\begin{bmatrix}A&B\\C&D\end{bmatrix}\begin{bmatrix}x_A\\x_B\end{bmatrix}=\begin{bmatrix}y_A\\y_B\end{bmatrix}$, derive two different expressions for the inverse of the block matrix $\begin{bmatrix}A&B\\C&D\end{bmatrix}$ and equate them),
$$R_t^{-1}-R_t^{-1}A_t(A_t^TR_t^{-1}A_t+\Sigma_{t-1}^{-1})^{-1}A_t^TR_t^{-1}=(R_t+A_t\Sigma_{t-1}A_t^T)^{-1}$$
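A quick numerical sanity check of this identity, using random positive-definite matrices in place of $R_t$ and $\Sigma_{t-1}$ (the sizes and values are arbitrary):

```python
import numpy as np

# Verify the matrix identity above numerically with random test matrices.
# R and Sigma are made symmetric positive definite so all inverses exist.
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))
M1, M2 = rng.normal(size=(3, 3)), rng.normal(size=(3, 3))
R = M1 @ M1.T + np.eye(3)
Sigma = M2 @ M2.T + np.eye(3)

Ri, Si = np.linalg.inv(R), np.linalg.inv(Sigma)
lhs = Ri - Ri @ A @ np.linalg.inv(A.T @ Ri @ A + Si) @ A.T @ Ri
rhs = np.linalg.inv(R + A @ Sigma @ A.T)
print(np.allclose(lhs, rhs))   # expected: True
```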

Setting this derivative to zero,
$$
\begin{aligned}
\frac{\partial L_t(x_t)}{\partial x_t}&=(R_t+A_t\Sigma_{t-1}A_t^T)^{-1}(x_t-B_tu_t)-R_t^{-1}A_t(A_t^TR_t^{-1}A_t+\Sigma_{t-1}^{-1})^{-1}\Sigma_{t-1}^{-1}\mu_{t-1}\\
&=0
\end{aligned}
$$

we obtain
$$
\begin{aligned}
x_t&=B_tu_t+(R_t+A_t\Sigma_{t-1}A_t^T)R_t^{-1}A_t(A_t^TR_t^{-1}A_t+\Sigma_{t-1}^{-1})^{-1}\Sigma_{t-1}^{-1}\mu_{t-1}\\
&=B_tu_t+A_t(I+\Sigma_{t-1}A_t^TR_t^{-1}A_t)(I+\Sigma_{t-1}A_t^TR_t^{-1}A_t)^{-1}\mu_{t-1}\\
&=B_tu_t+A_t\mu_{t-1}
\end{aligned}
$$
Hence
$$\overline\mu_t=A_t\mu_{t-1}+B_tu_t$$
$$\overline\Sigma_{t}=\left[\frac{\partial^2 L_t(x_t)}{\partial x_t^2}\right]^{-1}=A_t\Sigma_{t-1}A_t^T+R_t$$

Measurement update

$$bel(x_t)=\eta\, p(z_t|x_t)\,\overline{bel}(x_t)=\eta\, e^{-J_t}$$
where
$$J_t=\frac12(z_t-C_tx_t)^TQ_t^{-1}(z_t-C_tx_t)+\frac12(x_{t}-\overline\mu_{t})^T\overline\Sigma_{t}^{-1}(x_{t}-\overline\mu_{t})$$
Taking the second derivative gives the covariance
$$\Sigma_t=(C_t^TQ_t^{-1}C_t+\overline{\Sigma}_t^{-1})^{-1}$$
Setting the first derivative to zero (the minimum of the quadratic) and substituting $\mu_t$ for $x_t$:
$$C_t^TQ_t^{-1}(z_t-C_t\mu_t)=\overline{\Sigma}_t^{-1}(\mu_t-\overline{\mu}_t)$$
The left-hand side is
$$
\begin{aligned}
\text{LHS}&=C_t^TQ_t^{-1}(z_t-C_t\mu_t+C_t\overline{\mu}_t-C_t\overline{\mu}_t)\\
&=C_t^TQ_t^{-1}(z_t-C_t\overline{\mu}_t)-C_t^TQ_t^{-1}C_t(\mu_t-\overline{\mu}_t)
\end{aligned}
$$
Substituting back gives
$$C_t^TQ_t^{-1}(z_t-C_t\overline{\mu}_t)=\Sigma_t^{-1}(\mu_t-\overline{\mu}_t)$$
$$\Sigma_tC_t^TQ_t^{-1}(z_t-C_t\overline{\mu}_t)=\mu_t-\overline{\mu}_t$$
Let $K_t=\Sigma_tC_t^TQ_t^{-1}$; $K_t$ is called the Kalman gain. Then
$$\mu_t-\overline{\mu}_t=K_t(z_t-C_t\overline{\mu}_t)$$

$$
\begin{aligned}
K_t&=\Sigma_tC_t^TQ_t^{-1}\\
&=\Sigma_tC_t^TQ_t^{-1}(C_t\overline{\Sigma}_tC_t^T+Q_t)(C_t\overline{\Sigma}_tC_t^T+Q_t)^{-1}\\
&=\Sigma_t(C_t^TQ_t^{-1}C_t\overline{\Sigma}_tC_t^T+C_t^TQ_t^{-1}Q_t)(C_t\overline{\Sigma}_tC_t^T+Q_t)^{-1}\\
&=\Sigma_t(C_t^TQ_t^{-1}C_t\overline{\Sigma}_tC_t^T+C_t^T)(C_t\overline{\Sigma}_tC_t^T+Q_t)^{-1}\\
&=\Sigma_t(C_t^TQ_t^{-1}C_t\overline{\Sigma}_tC_t^T+\overline{\Sigma}_t^{-1}\overline{\Sigma}_tC_t^T)(C_t\overline{\Sigma}_tC_t^T+Q_t)^{-1}\\
&=\Sigma_t(C_t^TQ_t^{-1}C_t+\overline{\Sigma}_t^{-1})\overline{\Sigma}_tC_t^T(C_t\overline{\Sigma}_tC_t^T+Q_t)^{-1}\\
&=\overline{\Sigma}_tC_t^T(C_t\overline{\Sigma}_tC_t^T+Q_t)^{-1}
\end{aligned}
$$

At this point $\Sigma_t$ can be simplified further:
$$
\begin{aligned}
\Sigma_t&=(C_t^TQ_t^{-1}C_t+\overline{\Sigma}_t^{-1})^{-1}\\
&=\overline{\Sigma}_t-\overline{\Sigma}_tC_t^T(Q_t+C_t\overline{\Sigma}_tC_t^T)^{-1}C_t\overline{\Sigma}_t\\
&=(I-K_tC_t)\overline{\Sigma}_t
\end{aligned}
$$

Extended Kalman filter (EKF)

Consider the nonlinear case
$$x_t=g(u_t,x_{t-1})+\varepsilon_t$$
$$z_t=h(x_t)+\delta_t$$
For this, the EKF uses a Taylor expansion, keeping only the first-order (Jacobian) terms:
$$G_t=\left.\frac{\partial g(u_t,x_{t-1})}{\partial x_{t-1}}\right|_{x_{t-1}=\mu_{t-1}} \qquad H_t=\left.\frac{\partial h(x_{t})}{\partial x_{t}}\right|_{x_t=\overline{\mu}_t}$$
so that
$$g(u_t,x_{t-1})\approx g(u_t,\mu_{t-1})+G_t(x_{t-1}-\mu_{t-1})$$
$$h(x_t)\approx h(\overline{\mu}_t)+H_t(x_t-\overline{\mu}_t)$$
Analogously to the KF, the EKF can be shown to be:

$$
\begin{array}{l}
1:\ \overline{\mu}_t=g(u_t,\mu_{t-1}) \\
2:\ \overline{\Sigma}_t=G_t\Sigma_{t-1}G_t^{T}+R_t\\
3:\ K_t=\overline{\Sigma}_tH_t^T(H_t\overline{\Sigma}_tH_t^T+Q_t)^{-1}\\
4:\ \mu_t=\overline{\mu}_t+K_t(z_t-h(\overline{\mu}_t))\\
5:\ \Sigma_t=(I-K_tH_t)\overline{\Sigma}_t
\end{array}
$$
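As with the KF, these five lines translate directly into code. The sketch below assumes a toy two-dimensional model (velocity-driven motion and a range-to-origin measurement); the functions `g`, `h` and their Jacobians `G`, `H` are illustrative choices, not from the original post.

```python
import numpy as np

def ekf_step(mu, sigma, u, z, g, G, h, H, R, Q):
    """One EKF cycle, following lines 1-5 above."""
    # Lines 1-2: prediction with the nonlinear motion model, linearized covariance
    mu_bar = g(u, mu)
    Gt = G(u, mu)
    sigma_bar = Gt @ sigma @ Gt.T + R
    # Line 3: Kalman gain using the measurement Jacobian at mu_bar
    Ht = H(mu_bar)
    K = sigma_bar @ Ht.T @ np.linalg.inv(Ht @ sigma_bar @ Ht.T + Q)
    # Lines 4-5: measurement update with the nonlinear measurement model
    mu_new = mu_bar + K @ (z - h(mu_bar))
    sigma_new = (np.eye(len(mu)) - K @ Ht) @ sigma_bar
    return mu_new, sigma_new

# Toy example (assumed): 2-D position driven by a velocity command, range-to-origin sensor.
dt = 0.1
g = lambda u, x: x + dt * u                          # motion model
G = lambda u, x: np.eye(2)                           # its Jacobian w.r.t. x
h = lambda x: np.array([np.linalg.norm(x)])          # measurement model
H = lambda x: (x / np.linalg.norm(x)).reshape(1, 2)  # its Jacobian w.r.t. x
R, Q = 0.01 * np.eye(2), np.array([[0.1]])
mu, sigma = np.array([1.0, 1.0]), np.eye(2)
mu, sigma = ekf_step(mu, sigma, np.array([0.5, 0.0]), np.array([1.5]), g, G, h, H, R, Q)
print(mu, sigma)
```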


Copyright notice
This article was written by [Captain xiaoyifeng]; please include a link to the original when reposting. Thanks.
https://yzsam.com/2022/162/202206110029367649.html