Copyright | (c) 2016 FP Complete Corporation |
---|---|

License | MIT (see LICENSE) |

Maintainer | dominic@steinitz.org |

Safe Haskell | None |

Language | Haskell2010 |

# The Theory

The model for the extended Kalman filter is given by

\[ \begin{aligned} \boldsymbol{x}_i &= \boldsymbol{a}_i(\boldsymbol{x}_{i-1}, \boldsymbol{u}_{i-1}) + \boldsymbol{\psi}_{i-1} \\ \boldsymbol{y}_i &= \boldsymbol{h}_i(\boldsymbol{x}_i) + \boldsymbol{\upsilon}_i \end{aligned} \]

where

- \(\boldsymbol{a_i}\) is some non-linear vector-valued possibly time-varying state update function.
- \(\boldsymbol{x_i}\) is the state at time \(i\).
- \(\boldsymbol{u_i}\) is the control at time \(i\).
- \(\boldsymbol{\psi}_{i}\) are independent normally distributed random variables with mean 0 representing the fact that the state update is noisy: \(\boldsymbol{\psi}_{i} \sim {\cal{N}}(0,Q_i)\).
- \(\boldsymbol{h}_i\) is some non-linear vector-valued possibly time-varying function describing how we observe the hidden state process.
- \(\boldsymbol{y_i}\) is the observable at time \(i\).
- \(\boldsymbol{\upsilon}_i\) are independent normally distributed random variables with mean 0 represent the fact that the observations are noisy: \(\boldsymbol{\upsilon}_{i} \sim {\cal{N}}(0,R_i)\).

Note that in most presentations of the Kalman filter (the wikipedia presentation being an exception), the state update function and the observation function are taken to be constant over time as is the noise for the state and the noise for the observation. In symbols, \(\forall i \, \boldsymbol{a}_i = \boldsymbol{a}\) for some \(\boldsymbol{a}\), \(\forall i \, \boldsymbol{h}_i = \boldsymbol{h}\) for some \(\boldsymbol{h}\), \(\forall i \, Q_i = Q\) for some \(Q\) and \(\forall i \, \boldsymbol{a}_i = R\) for some \(R\).

We assume the whole process starts at 0 with our belief of the state (aka the prior state) being given by \(\boldsymbol{x}_0 \sim {\cal{N}}(\boldsymbol{\mu}_0, \Sigma_0)\)

The Kalman filtering process is a recursive procedure as follows:

- Make a prediction of the current system state, given our previous estimation of the system state.
- Update our prediction, given a newly acquired measurement.

The prediction and update steps depend on our system and measurement models, as well as our current estimate of the system state.

# An Extended Example

The equation of motion for a pendulum of unit length subject to Gaussian white noise is

\[ \frac{\mathrm{d}^2\alpha}{\mathrm{d}t^2} = -g\sin\alpha + w(t) \]

We can discretize this via the usual Euler method

\[ \begin{bmatrix} x_{1,i} \\ x_{2,i} \end{bmatrix} = \begin{bmatrix} x_{1,i-1} + x_{2,i-1}\Delta t \\ x_{2,i-1} - g\sin x_{1,i-1}\Delta t \end{bmatrix} + \mathbf{q}_{i-1} \]

where \(q_i \sim {\mathcal{N}}(0,Q)\) and

\[ Q = \begin{bmatrix} \frac{q^c \Delta t^3}{3} & \frac{q^c \Delta t^2}{2} \\ \frac{q^c \Delta t^2}{2} & {q^c \Delta t} \end{bmatrix} \]

Assume that we can only measure the horizontal position of the pendulum and further that this measurement is subject to error so that

\[ y_i = \sin x_i + r_i \]

where \(r_i \sim {\mathcal{N}}(0,R)\).

First let's set the time step and the acceleration caused by earth's gravity.

{-# LANGUAGE DataKinds #-} import Numeric.Kalman deltaT, g :: Double deltaT = 0.01 g = 9.81 bigQ :: Sym 2 bigQ = sym $ matrix bigQl qc :: Double qc = 0.01 bigQl :: [Double] bigQl = [ qc * deltaT^3 / 3, qc * deltaT^2 / 2, qc * deltaT^2 / 2, qc * deltaT ] bigR :: Sym 1 bigR = sym $ matrix [0.1] stateUpdate :: R 2 -> R 2 stateUpdate u = vector [x1 + x2 * deltaT, x2 - g * (sin x1) * deltaT] where (x1, w) = headTail u (x2, _) = headTail w observe :: R 2 -> R 1 observe a = vector [sin x] where x = fst $ headTail a linearizedObserve :: R 2 -> L 1 2 linearizedObserve a = matrix [cos x, 0.0] where x = fst $ headTail a linearizedStateUpdate :: R 2 -> Sq 2 linearizedStateUpdate u = matrix [1.0, deltaT, -g * (cos x1) * deltaT, 1.0] where (x1, _) = headTail u

Now we can create extended and unscented filters which consume a single observation.

singleEKF = runEKF (const observe) (const linearizedObserve) (const bigR) (const stateUpdate) (const linearizedStateUpdate) (const bigQ) undefined

singleUKF = runUKF (const observe) (const bigR) (const stateUpdate) (const bigQ) undefined

We start off not too far from the actual value.

initialDist = (vector [1.6, 0.0], sym $ matrix [0.1, 0.0, 0.0, 0.1])

Using some data generated using code made available with Simo Särkkä's book, we can track the pendulum using the extended Kalman filter.

multiEKF obs = scanl singleEKF initialDist (map (vector . pure) obs)

And then plot the results.

And also track it using the unscented Kalman filter.

multiUKF obs = scanl singleUKF initialDist (map (vector . pure) obs)

And also plot the results

### Code for Plotting

The full code for plotting the results:

import qualified Graphics.Rendering.Chart as C import Graphics.Rendering.Chart.Backend.Diagrams import Data.Colour import Data.Colour.Names import Data.Default.Class import Control.Lens import Data.Csv import System.IO hiding ( hGetContents ) import Data.ByteString.Lazy ( hGetContents ) import qualified Data.Vector as V import Numeric.LinearAlgebra.Static chartEstimated :: String -> [(Double, Double)] -> [(Double, Double)] -> [(Double, Double)] -> C.Renderable () chartEstimated title acts obs ests = C.toRenderable layout where actuals = C.plot_lines_values .~ [acts] $ C.plot_lines_style . C.line_color .~ opaque red $ C.plot_lines_title .~ "Actual Trajectory" $ C.plot_lines_style . C.line_width .~ 1.0 $ def measurements = C.plot_points_values .~ obs $ C.plot_points_style . C.point_color .~ opaque blue $ C.plot_points_title .~ "Measurements" $ def estimas = C.plot_lines_values .~ [ests] $ C.plot_lines_style . C.line_color .~ opaque black $ C.plot_lines_title .~ "Inferred Trajectory" $ C.plot_lines_style . C.line_width .~ 1.0 $ def layout = C.layout_title .~ title $ C.layout_plots .~ [C.toPlot actuals, C.toPlot measurements, C.toPlot estimas] $ C.layout_y_axis . C.laxis_title .~ "Angle / Horizontal Displacement" $ C.layout_y_axis . C.laxis_override .~ C.axisGridHide $ C.layout_x_axis . C.laxis_title .~ "Time" $ C.layout_x_axis . C.laxis_override .~ C.axisGridHide $ def diagE = do h <- openFile "data/matlabRNGs.csv" ReadMode cs <- hGetContents h let df = (decode NoHeader cs) :: Either String (V.Vector (Double, Double)) case df of Left _ -> error "Whatever" Right generatedSamples -> do let xs = take 500 (multiEKF $ V.toList $ V.map fst generatedSamples) let mus = map (fst . headTail . fst) xs let obs = V.toList $ V.map fst generatedSamples let acts = V.toList $ V.map snd generatedSamples denv <- defaultEnv C.vectorAlignmentFns 600 500 let charte = chartEstimated "Extended Kalman Filter" (zip [0,1..] acts) (zip [0,1..] obs) (zip [0,1..] mus) return $ fst $ runBackend denv (C.render charte (600, 500)) diagU = do h <- openFile "data/matlabRNGs.csv" ReadMode cs <- hGetContents h let df = (decode NoHeader cs) :: Either String (V.Vector (Double, Double)) case df of Left _ -> error "Whatever" Right generatedSamples -> do let ys = take 500 (multiUKF $ V.toList $ V.map fst generatedSamples) let nus = map (fst . headTail . fst) ys let obs = V.toList $ V.map fst generatedSamples let acts = V.toList $ V.map snd generatedSamples denv <- defaultEnv C.vectorAlignmentFns 600 500 let charte = chartEstimated "Unscented Kalman Filter" (zip [0,1..] acts) (zip [0,1..] obs) (zip [0,1..] nus) return $ fst $ runBackend denv (C.render charte (600, 500))

- runKF :: (KnownNat m, KnownNat n) => (a -> R n -> L m n) -> (a -> Sym m) -> (a -> R n -> Sq n) -> (a -> Sym n) -> a -> (R n, Sym n) -> R m -> (R n, Sym n)
- runKFPrediction :: KnownNat n => (a -> R n -> Sq n) -> (a -> Sym n) -> a -> (R n, Sym n) -> (R n, Sym n)
- runKFUpdate :: (KnownNat m, KnownNat n) => (a -> R n -> L m n) -> (a -> Sym m) -> a -> (R n, Sym n) -> R m -> (R n, Sym n)
- runEKF :: (KnownNat m, KnownNat n) => (a -> R n -> R m) -> (a -> R n -> L m n) -> (a -> Sym m) -> (a -> R n -> R n) -> (a -> R n -> Sq n) -> (a -> Sym n) -> a -> (R n, Sym n) -> R m -> (R n, Sym n)
- runEKFPrediction :: KnownNat n => (a -> R n -> R n) -> (a -> R n -> Sq n) -> (a -> Sym n) -> a -> (R n, Sym n) -> (R n, Sym n)
- runEKFUpdate :: (KnownNat m, KnownNat n) => (a -> R n -> R m) -> (a -> R n -> L m n) -> (a -> Sym m) -> a -> (R n, Sym n) -> R m -> (R n, Sym n)
- runUKF :: (KnownNat m, KnownNat n) => (a -> R n -> R m) -> (a -> Sym m) -> (a -> R n -> R n) -> (a -> Sym n) -> a -> (R n, Sym n) -> R m -> (R n, Sym n)
- runUKFPrediction :: KnownNat n => (a -> R n -> R n) -> (a -> Sym n) -> a -> (R n, Sym n) -> (R n, Sym n)
- runUKFUpdate :: (KnownNat n, KnownNat m) => (a -> R n -> R m) -> (a -> Sym m) -> a -> (R n, Sym n) -> R m -> (R n, Sym n)
- runKS :: KnownNat n => (a -> R n -> Sq n) -> (a -> Sym n) -> a -> (R n, Sym n) -> (R n, Sym n) -> (R n, Sym n)
- runEKS :: KnownNat n => (a -> R n -> R n) -> (a -> R n -> Sq n) -> (a -> Sym n) -> a -> (R n, Sym n) -> (R n, Sym n) -> (R n, Sym n)
- runUKS :: KnownNat n => (a -> R n -> R n) -> (a -> Sym n) -> a -> (R n, Sym n) -> (R n, Sym n) -> (R n, Sym n)

# Documentation

:: (KnownNat m, KnownNat n) | |

=> (a -> R n -> L m n) | Linear measurement operator at a point \(\boldsymbol{h}_i\) |

-> (a -> Sym m) | Covariance matrix encoding measurement noise \(R_i\) |

-> (a -> R n -> Sq n) | Linear system evolution at a point \(\boldsymbol{a}_i(\boldsymbol{u}_{i-1}, \boldsymbol{x}_{i-1})\), note the order of the control and the input |

-> (a -> Sym n) | Covariance matrix encoding system evolution noise \(Q_i\) |

-> a | Dynamical input \(\boldsymbol{u}_{i-1}\) |

-> (R n, Sym n) | Current estimate \((\hat{\boldsymbol{x}}_{i-1}, \hat{\boldsymbol{\Sigma}}_{i-1})\) |

-> R m | New measurement \(\boldsymbol{y}_i\) |

-> (R n, Sym n) | New (filtered) estimate \((\hat{\boldsymbol{x}}_i, \hat{\boldsymbol{\Sigma}}_i)\) |

The ordinary Kalman filter is a special case of the extended Kalman filter, when the state update and measurement transformations are already linear.

:: KnownNat n | |

=> (a -> R n -> Sq n) | Linear system evolution at a point \(\boldsymbol{a}_i(\boldsymbol{u}_{i-1}, \boldsymbol{x}_{i-1})\), note the order of the control and the input |

-> (a -> Sym n) | Covariance matrix encoding system evolution noise \(Q_i\) |

-> a | Dynamical input \(\boldsymbol{u}_{i-1}\) |

-> (R n, Sym n) | Current estimate \((\hat{\boldsymbol{x}}_{i-1}, \hat{\boldsymbol{\Sigma}}_{i-1})\) |

-> (R n, Sym n) | New prediction \(({\boldsymbol{x}_i^\flat}, {\boldsymbol{\Sigma}_i^\flat})\) |

:: (KnownNat m, KnownNat n) | |

=> (a -> R n -> L m n) | Linear measurement operator at a point \(\boldsymbol{h}_i\) |

-> (a -> Sym m) | Covariance matrix encoding measurement noise \(R_i\) |

-> a | Dynamical input \(\boldsymbol{u}_{i-1}\) |

-> (R n, Sym n) | Current prediction \(({\boldsymbol{x}}_i^\flat, {\boldsymbol{\Sigma}}_i^\flat)\) |

-> R m | New measurement \(\boldsymbol{y}_i\) |

-> (R n, Sym n) | Updated prediction \((\hat{\boldsymbol{x}}_i, \hat{\boldsymbol{\Sigma}}_i)\) |

:: (KnownNat m, KnownNat n) | |

=> (a -> R n -> R m) | System measurement function \(\boldsymbol{h}_i\) |

-> (a -> R n -> L m n) | Linearization of the measurement at a point \(\frac{\partial \boldsymbol{h}_i}{\partial \boldsymbol{x}}\big|_{\boldsymbol{x}}\) |

-> (a -> Sym m) | Covariance matrix encoding measurement noise \(R_i\) |

-> (a -> R n -> R n) | System evolution function \(\boldsymbol{a}_i(\boldsymbol{u}, \boldsymbol{x})\), note the order of the control and the input |

-> (a -> R n -> Sq n) | Linearization of the system evolution at a point \(\frac{\partial \boldsymbol{a}_i}{\partial \boldsymbol{x}}\big|_{{\boldsymbol{u}}, \, {\boldsymbol{x}}}\) |

-> (a -> Sym n) | Covariance matrix encoding system evolution noise \(Q_i\) |

-> a | Dynamical input \(\boldsymbol{u}_{i-1}\) |

-> (R n, Sym n) | Current estimate \((\hat{\boldsymbol{x}}_{i-1}, \hat{\boldsymbol{\Sigma}}_{i-1})\) |

-> R m | New measurement \(\boldsymbol{y}_i\) |

-> (R n, Sym n) | New (filtered) estimate \((\hat{\boldsymbol{x}}_i, \hat{\boldsymbol{\Sigma}}_i)\) |

Here we combine the prediction and update setps applied to a new measurement, thereby creating a single step of the (extended) Kalman filter.

:: KnownNat n | |

=> (a -> R n -> R n) | System evolution function \(\boldsymbol{a}_i(\boldsymbol{u}, \boldsymbol{x})\), note the order of the control and the input |

-> (a -> R n -> Sq n) | Linearization of the system evolution at a point \(\frac{\partial \boldsymbol{a}_i}{\partial \boldsymbol{x}}\big|_{{\boldsymbol{u}}, \, {\boldsymbol{x}}}\) |

-> (a -> Sym n) | Covariance matrix encoding system evolution noise \(Q_i\) |

-> a | Dynamical input \(\boldsymbol{u}_{i-1}\) |

-> (R n, Sym n) | Current estimate \((\hat{\boldsymbol{x}}_{n-1}, \hat{\boldsymbol{\Sigma}}_{n-1})\) |

-> (R n, Sym n) | New prediction \(({\boldsymbol{x}_i^\flat}, {\boldsymbol{\Sigma}_i^\flat})\) |

Given our system model and our previous estimate of the system state, we generate a prediction of the current system state by taking the mean of our estimated state as our point estimate and evolving it one step forward with the system evolution function. In other words, we predict that the current system state is the minimum mean squared error (MMSE) state, which corresponds to the maximum a posteriori (MAP) estimate, from the previous estimate evolved one step by the system evolution function of our system model.

We also generate a predicted covariance matrix by updating the current covariance and adding any noise estimate. Updating the current covariance requires the linearized version of the system evolution, consequently the deviation of the actual system evolution from its linearization is manifested in the newly predicted covariance.

Taken together, the prediction step generates a new Gaussian distribution of predicted system states with mean the evolved MAP state from the previous estimate.

:: (KnownNat m, KnownNat n) | |

=> (a -> R n -> R m) | System measurement function \(\boldsymbol{h}_i\) |

-> (a -> R n -> L m n) | Linearization of the observation at a point \(\frac{\partial \boldsymbol{h}_i}{\partial \boldsymbol{x}}\big|_{\boldsymbol{x}}\) |

-> (a -> Sym m) | Covariance matrix encoding measurement noise \(R_i\) |

-> a | Dynamical input \(\boldsymbol{u}_{i-1}\) |

-> (R n, Sym n) | Current prediction \(({\boldsymbol{x}}_i^\flat, {\boldsymbol{\Sigma}}_i^\flat)\) |

-> R m | New measurement \(\boldsymbol{y}_i\) |

-> (R n, Sym n) | Updated prediction \((\hat{\boldsymbol{x}}_i, \hat{\boldsymbol{\Sigma}}_i)\) |

After a new measurement has been taken, we update our original prediction of the current system state using the result of this measurement. This is accomplished by looking at how much the measurement deviated from our prediction, and then updating our estimated state accordingly.

This step requires the linearized measurement transformation.

:: (KnownNat m, KnownNat n) | |

=> (a -> R n -> R m) | System measurement function \(\boldsymbol{h}_i\) |

-> (a -> Sym m) | Covariance matrix encoding measurement noise \(R_i\) |

-> (a -> R n -> R n) | System evolution function \(\boldsymbol{a}_i(\boldsymbol{u}, \boldsymbol{x})\), note the order of the control and the input |

-> (a -> Sym n) | Covariance matrix encoding system evolution noise \(Q_i\) |

-> a | Dynamical input \(\boldsymbol{u}_{i-1}\) |

-> (R n, Sym n) | Current estimate \((\hat{\boldsymbol{x}}_{i-1}, \hat{\boldsymbol{\Sigma}}_{i-1})\) |

-> R m | New measurement \(\boldsymbol{y}_i\) |

-> (R n, Sym n) | New (filtered) estimate \((\hat{\boldsymbol{x}}_i, \hat{\boldsymbol{\Sigma}}_i)\) |

:: KnownNat n | |

=> (a -> R n -> R n) | System evolution function at a point \(\boldsymbol{a}_i(\boldsymbol{u}, \boldsymbol{x})\), note the order of the control and the input |

-> (a -> Sym n) | Covariance matrix encoding system evolution noise \(Q_i\) |

-> a | Dynamical input \(\boldsymbol{u}_{i-1}\) |

-> (R n, Sym n) | Current estimate \((\hat{\boldsymbol{x}}_{n-1}, \hat{\boldsymbol{\Sigma}}_{n-1})\) |

-> (R n, Sym n) | Prediction \(({\boldsymbol{x}_i^\flat}, {\boldsymbol{\Sigma}_i^\flat})\) |

:: (KnownNat n, KnownNat m) | |

=> (a -> R n -> R m) | Measurement transformation \(\boldsymbol{h}_i\) |

-> (a -> Sym m) | Covariance matrix encoding measurement noise \(R_i\) |

-> a | Dynamical input \(\boldsymbol{u}_{i-1}\) |

-> (R n, Sym n) | Current prediction \(({\boldsymbol{x}}_i^\flat, {\boldsymbol{\Sigma}}_i^\flat)\) |

-> R m | New measurement \(\boldsymbol{y}_i\) |

-> (R n, Sym n) | Updated prediction \((\hat{\boldsymbol{x}}_i, \hat{\boldsymbol{\Sigma}}_i)\) |

:: KnownNat n | |

=> (a -> R n -> Sq n) | Linear system evolution at a point \(\boldsymbol{a}_i(\boldsymbol{u}, \boldsymbol{x})\), note the order of the control and the input |

-> (a -> Sym n) | Covariance matrix encoding system evolution noise \(Q_i\) |

-> a | Dynamical input \(\boldsymbol{u}_{i}\) |

-> (R n, Sym n) | Future smoothed estimate \((\bar{\boldsymbol{x}}_{i+1}, \bar{\boldsymbol{\Sigma}}_{i+1})\) |

-> (R n, Sym n) | Present filtered estimate \((\hat{\boldsymbol{x}}_{i}, \hat{\boldsymbol{\Sigma}}_{i})\) |

-> (R n, Sym n) | Present smoothed estimate \((\bar{\boldsymbol{x}}_{i}, \bar{\boldsymbol{\Sigma}}_{i})\) |

:: KnownNat n | |

=> (a -> R n -> R n) | |

-> (a -> R n -> Sq n) | Linearization of the system evolution at a point \(\frac{\partial \boldsymbol{a}_i}{\partial \boldsymbol{x}}\big|_{{\boldsymbol{u}}, \, {\boldsymbol{x}}}\) |

-> (a -> Sym n) | Covariance matrix encoding system evolution noise \(Q_i\) |

-> a | Dynamical input \(\boldsymbol{u}_{i}\) |

-> (R n, Sym n) | Future smoothed estimate \((\bar{\boldsymbol{x}}_{i+1}, \bar{\boldsymbol{\Sigma}}_{i+1})\) |

-> (R n, Sym n) | Present filtered estimate \((\hat{\boldsymbol{x}}_{i}, \hat{\boldsymbol{\Sigma}}_{i})\) |

-> (R n, Sym n) | Present smoothed estimate \((\bar{\boldsymbol{x}}_{i}, \bar{\boldsymbol{\Sigma}}_{i})\) |

The Kalman smoothing process (sometimes also called the Rauch-Tung-Striebel smoother or RTSS) is a recursive procedure for improving previous estimates of the system state in light of more recently obtained measurements. That is, for a given state estimate that was produced (by a Kalman filter) using only the available historical data, we improve the estimate (smoothing it) using information that only became available later, via additional measurements after the estimate was originally made. The result is an estimate of the state of the system at that moment in its evolution, taking full advantage of hindsight obtained from observations of the system in the future of the current step.

Consequently, the recursive smoothing procedure progresses backwards through the state evolution, beginning with the most recent observation and updating past observations given the newer information. The update is made by measuring the deviation between what the system actually did (via observation) and what our best estimate of the system at that time predicted it would do, then adjusting the current estimate in view of this deviation.

:: KnownNat n | |

=> (a -> R n -> R n) | System evolution function \(\boldsymbol{a}_i(\boldsymbol{u}, \boldsymbol{x})\), |

-> (a -> Sym n) | Covariance matrix encoding system evolution noise \(Q_i\) |

-> a | Dynamical input \(\boldsymbol{u}_{i}\) |

-> (R n, Sym n) | Future smoothed estimate \((\bar{\boldsymbol{x}}_{i+1}, \bar{\boldsymbol{\Sigma}}_{i+1})\) |

-> (R n, Sym n) | Present filtered estimate \((\hat{\boldsymbol{x}}_{i}, \hat{\boldsymbol{\Sigma}}_{i})\) |

-> (R n, Sym n) | Present smoothed estimate \((\bar{\boldsymbol{x}}_{i}, \bar{\boldsymbol{\Sigma}}_{i})\) |

Unscented Kalman smoother