The Nonlinear Constrained Kalman-Bucy Smoother

The Nonlinear Constrained Kalman-Bucy Smoother
Syntax

[xOut, rOut, sOut, pPlusOut, pMinusOut, info] = ckbs_L1_nonlinear(


      f_fun, g_fun, h_fun, ...


      max_itr, epsilon, x_in, z, qinv, rinv, level)

Purpose
This routine minimizes the robust L1 Kalman smoother objective for general nonlinear process and measurement models (in the case where the model functions are affine, the routine ckbs_L1_affine is more efficient).

Notation
The state sequence

is defined by


 x  = x_1, \ldots, x_N

, with each


 x_i \in \B{R}^n

. The robust L1 Kalman-Bucy smoother objective is defined by


 \[


\begin{array}{rcl}


S ( x ) & = & S^Q ( x ) + S^R ( x )


\\


 S^Q ( x ) & = & \frac{1}{2}  \sum_{k=1}^N


( x_k - g_k ( x_{k-1} ) )^\R{T} * Q_k^{-1} * ( x_k - g_k ( x_{k-1} ) )


\\


S^R( x ) & = &  \sqrt{2} \sum_{k=1}^N  \|R_k^{-1/2}( z_k - h_k ( x_k  ) )\|_1


\end{array}


\]

where the matrices

R_k

and

Q_k

are symmetric positive definite and

x_0

is the constant zero. Note that

g_1

is the initial state estimate and

Q_1

is the corresponding covariance.

Problem
The robust L1 Kalman smoother problem is


 \[


{\rm minimize}\;


       S( x ) \;


       {\rm w.r.t.} \; x


\]

Optimality Conditions
Define


 d = d_1, \ldots, d_N

with each


 d_i \in \B{R}^n

. Define


 \[


\begin{array}{rcl}


\tilde{S} ( x ; d) & = & \tilde{S}^Q ( x ; d ) + \tilde{S}^R ( x


; d )


\\


 \tilde{S}^Q ( x ; d ) & = & \frac{1}{2}  \sum_{k=1}^N


( x_k - g_k ( x_{k-1} ) - \partial_k g_k ( x_{k-1} ) d_k ) ^\R{T} * Q_k^{-1} * ( x_k - g_k (


x_{k-1} ) - \partial_k g_k ( x_{k-1} ) d_k )


\\


\tilde{S}^R( x ; d ) & = &  \sqrt{2}   \sum_{k=1}^N


\|R_k^{-1/2}( z_k - h_k ( x_k  ) - \partial_k h_k ( x_k) d_k )\|_1


\end{array}


\]

A state sequence

is considered a solution if


 \[


S( x ) - \min_d\; \tilde{S} ( x ; d  )  <  \varepsilon \;.


\]

This follows directly from the convex composite structure of

.

f_fun
Constraints are current not supported in the robust L1 smoother. The syntax was left unchanged to match the constrained smoother, and because in future versions constraints may be supported.

g_fun
The ckbs_L1_nonlinear argument g_fun is a function that supports both of the following syntaxes




      [gk] = feval(g_fun, k, xk1)


      [gk, Gk] = feval(g_fun, k, xk1)

k
The g_fun argument k is an integer with


 1 \leq k \leq N

. The case


 k = 1

serves to specify the initial state estimate.

xk1
The g_fun argument xk1 is an column vector with length

. It specifies the state vector at time index k


 \[


      xk1 = x_{k-1}


\]

. In the case


 k = 1

, the value of xk1 does not matter.

gk
The g_fun result gk is an column vector with length

and


 \[


      gk = g_k ( x_{k-1} )


\]

In the case


 k = 1

, the value of gk is the initial state estimate at time index k .

Gk
If the g_fun result Gk is present in the syntax, it is the


 n \times n

matrix given by and


 \[


      Gk = \partial_{k-1} g_k ( x_{k-1} )


\]

In the case


 k = 1

, the value of Gk is the zero matrix; i.e.,

g_k

does not depend on the value of


 x_{k-1}

.

h_fun
The ckbs_L1_nonlinear argument h_fun is a function that supports both of the following syntaxes




      [hk] = feval(h_fun, k, xk)


      [hk, Hk] = feval(h_fun, k, xk)

k
The h_fun argument k is an integer with


 1 \leq k \leq N

.

xk
The h_fun argument xk is an column vector with length

. It specifies the state vector at time index k


 \[


      xk = x_k


\]

.

hk
The h_fun result hk is an column vector with length

and


 \[


      hk = h_k ( x_k )


\]

Hk
If the h_fun result Hk is present in the syntax, it is the


 m \times n

matrix given by and


 \[


      Hk = \partial_k h_k ( x_k )


\]

max_itr
The integer scalar max_itr specifies the maximum number of iterations of the algorithm to execute. It must be greater than or equal to zero. Note that if it is zero, the first row of the info return value will still be computed. This can be useful for deciding what is a good value for the argument epsilon .

epsilon
The ckbs_L1_nonlinear argument epsilon is a positive scalar. It specifies the convergence criteria value; i.e.,


 \[


      \varepsilon = epsilon


\]

x_in
The ckbs_L1_nonlinear argument x_in is a two dimensional array with size


 n \times N

. It specifies a sequence of state values; i.e., for


 k = 1 , \ldots , N


 \[


      x\_in (:, k) = x_k


\]

The closer the initial state sequence is to the solution the faster, and more likely, the ckbs_L1_nonlinear will converge.

z
The ckbs_L1_nonlinear argument z is a two dimensional array with size


 m \times N

. It specifies the sequence of measurement vectors; i.e., for


 k = 1 , \ldots , N


 \[


      z(:, k) = z_k


\]

qinv
The ckbs_L1_nonlinear argument qinv is a three dimensional array with size


 n \times n \times N

. It specifies the inverse of the variance of the measurement noise; i.e., for


 k = 1 , \ldots , N


 \[


      qinv(:,:, k) = Q_k^{-1}


\]

In the case


 k = 1

, the value of

Q_k

is the variance of the initial state estimate (see g_fun .

rinv
The ckbs_L1_nonlinear argument rinv is a three dimensional array, with size


 m \times m \times N

. it specifies the inverse of the variance of the transition noise; i.e., for


 k = 1 , \ldots , N


 \[


      rinv(:,:, k) = R_k^{-1}


\]

It is ok to signify a missing data value by setting the corresponding row and column of rinv to zero. This also enables use to interpolate the state vector

x_k

to points where there are no measurements.

level
The integer scalar level specifies the level of tracing to do. If level == 0 , no tracing is done. If level >= 1 , the row index q and the correspond row of info are printed as soon as soon as they are computed. If level >= 2 , a check of the derivative calculations is printed before the calculations start. In this case, control will return to the keyboard so that you can print values, continue, or abort the calculation.

x_out
The result x_out is a two dimensional array with size


 n \times N


 ( x_1 , \ldots , x_N )

. It contains an approximation for the optimal sequence; i.e., for


 k = 1 , \ldots , N


 \[


      x\_out(:, k) = x_k


\]

rOut
The result rOut contains the structure rOut reported by ckbs_L1_affine for the last affine subproblem.

sOut
The result sOut contains the structure sOut reported by ckbs_L1_affine for the last affine subproblem.

pPlusOut
The result pPlusOut contains the structure pPlusOut reported by ckbs_L1_affine for the last affine subproblem.

pMinusOut
The result pMinusOut contains the structure pMinusOut reported by ckbs_L1_affine for the last affine subproblem.

info
The result info is a matrix with each row corresponding to a solution of ckbs_L1_affine on an affine subproblem. There are five numbers output per iteration: the infinity norm of the Kuhn-Tucker conditions, the one-norm of the Kuhn-Tucker conditions, the affine improvement,

\mu

, and the number of line search iterations. See ckbs_L1_affine for the definitions of these quantities. Note that ckbs_L1_affine has satisfied the convergence condition if the affine improvement is close to zero, that is,




       info(end, 3) <= epsilon

Example
The file L1_nonlinear_ok.m contains an example and test of ckbs_L1_nonlinear estimating the position of a Van Der Pol oscillator (an oscillator with non-linear dynamics). The code runs both ckbs_nonlinear and ckbs_L1_nonlinear on the example, in cases where there are outliers in the data, and can be used to make a plot. It returns true if ckbs_L1_nonlinear converged on the problem, and false otherwise.

Input File: src/ckbs_L1_nonlinear.m