|   | Previous | Next | 
[ds, dy, du] = ckbs_newton_step(
      mu, s, y, u, b, d, Bdia, Hdia, Hlow)
 \mu
-relaxed affine constrained Kalman-Bucy smoother problem.
 \mu \in \B{R}_+
,
 H \in \B{R}^{p \times p}
,
 d \in \B{R}^p
,
 b \in \B{R}^r
, and
 B \in \B{R}^{r \times p}
,
the 
 \mu
-relaxed affine constrained Kalman-Bucy smoother problem is:
 \[
\begin{array}{rl}
{\rm minimize} & \frac{1}{2} y^\R{T} H y + d^\R{T} y
- \mu \sum_{i=1}^r \log(s_i)
\; {\rm w.r.t} \; y \in \B{R}^p \; , \; s \in \B{R}_+^r
\\
{\rm subject \; to} & s + b + B y  = 0
\end{array}
\] 
In addition, 
 H
 is symmetric block tri-diagonal with each block of
size 
 n \times n
 and
 B
 is block diagonal with each block of size 
 m \times n
(there is an integer 
 N
 such that
 p = n * N
 and 
 r = m * N
).
 u \in \B{R}^r
to denote the Lagrange multipliers corresponding to the constraint equation.
The corresponding Lagrangian is
 \[
L(y, s, u)  =
\frac{1}{2} y^\R{T} H y + d^\R{T} y
- \mu \sum_{i=1}^r \log(s_i)
+ u^\R{T} (s + b + B y)
\] 
The partial gradients of the Lagrangian are given by
 \[
\begin{array}{rcl}
\nabla_y L(y, s, u ) & = & H y + B^\R{T} u + d  \\
\nabla_s L(y, s, u ) & = & u - \mu / s \\
\nabla_u L(y, s, u ) & = & s + b + B y \\
\end{array}
\] 
where 
 \mu / s 
 is the component by component division of
 \mu 
 by the components of the 
 s
.
Note, from the second equation, that we only need consider
 u \geq 0
 because 
 s \geq 0
.
 D(s)
to denote the diagonal matrix with 
 s
along its diagonal and 
 1_r
to denote the vector, of length 
 r
 with all its components
equal to one.
We define
 F : \B{R}^{r + p + r} \rightarrow \B{R}^{r + p + r}
by
 \[
F(s, u, y)
=
\left(
\begin{array}{c}
s + b + B y       \\
D(s) D(u) 1_r - \mu 1_r\\
H y + B^\R{T} u + d
\end{array}
\right)
\] 
The Kuhn-Tucker conditions for a solution of the
 \mu
-relaxed constrained affine Kalman-Bucy smoother problem is
 F(s, u, y) = 0 
.
 (s, u, y)
, the Newton step
 ( \Delta s^\R{T} , \Delta u^\R{T} , \Delta y^\R{T} )^\R{T}
 solves the problem:
 \[
F^{(1)} (s, u, y)
\left( \begin{array}{c} \Delta s \\ \Delta u \\ \Delta y \end{array} \right)
=
- F(s, u, y)
\] 
mu
 is a positive scalar specifying the
relaxation parameter 
 \mu
.
s
 is a column vector of length 
 r
.
All the elements of 
s
 are greater than zero.
y
 is a column vector of length 
 p
u
 is a column vector of length 
 r
.
All the elements of 
s
 are greater than zero.
b
 is a column vector of length 
 r
.
d
 is a column vector of length 
 p
Bdia
 is an 
 m \times n \times N
 array.
For 
 k = 1 , \ldots , N
 we define
 B_k \in \B{R}^{m \times n}
 by
 \[
      B_k = Bdia(:, :, k)
\] 
 B
 is defined by
 \[
B
=
\left( \begin{array}{cccc}
B_1 & 0      & 0      &           \\
0   & B_2    & 0      & 0         \\
0   & 0      & \ddots & 0         \\
    & 0      & 0      & B_N
\end{array} \right)
\] 
Hdia
 is an 
 n \times n \times N
 array.
For 
 k = 1 , \ldots , N
 we define
 H_k \in \B{R}^{n \times n}
 by
 \[
      H_k = Hdia(:, :, k)
\] 
Hlow
 is an 
 n \times n \times N
 array.
For 
 k = 1 , \ldots , N
 we define
 L_k \in \B{R}^{n \times n}
 by
 \[
      L_k = Hlow(:, :, k)
\] 
 H
 is defined by
 \[
H
=
\left( \begin{array}{cccc}
H_1 & L_2^\R{T} & 0      &           \\
L_2 & H_2    & L_3^\R{T} & 0         \\
0   & \ddots & \ddots & \ddots    \\
    & 0      & L_N    & H_N
\end{array} \right)
\] 
ds
 is a column vector of length 
 r
equal to the 
 \Delta s
 components of the Newton step.
dy
 is a column vector of length 
 p
equal to the 
 \Delta y
 components of the Newton step.
du
 is a column vector of length 
 r
equal to the 
 \Delta u
 components of the Newton step.
ckbs_newton_step.
It returns true if ckbs_newton_step passes the test
and false otherwise.
 F
 is given by
 \[
F^{(1)} (s, y, u) =
\left(
\begin{array}{ccccc}
D( 1_r ) &  0  & B  \\
D( u )   & 0   & D(s) \\
0        & B^\R{T} & H
\end{array}
\right)
\] 
It follows that
 \[
\left(\begin{array}{ccc}
I & 0 & B \\
U & S & 0\\
0 & B^T & C \\
\end{array}\right)
\left(\begin{array}{ccc}
\Delta s \\ \Delta y \\ \Delta u
\end{array}\right)
=
-
\left(\begin{array}{ccc}
s + b + By \\
SU\B{1} - \mu\B{1}\\
Cy + B^Tu + d
\end{array}\right)\;.
\] 
Below, 
 r_i
 refers to row 
 i
.
Applying the row operations
 \[
\begin{array}{ccc}
r_2 &=& r_2 - U r_1 \\
r_3 &=& r_3 - B^T S^{-1} r_2
\end{array}\;,
\] 
we obtain the equivalent system
 \[
\left(\begin{array}{ccc}
I & 0 & B \\
0 & S & -UB\\
0 & 0 & C + B^T S^{-1} U B
\end{array}\right)
\left(\begin{array}{ccc}
\Delta s \\ \Delta y \\ \Delta u
\end{array}\right)
=
-
\left(\begin{array}{ccc}
s + b + B y \\
-U(b + B y) - \mu\B{1}\\
Cy + B^T u + d + B^T S^{-1}
\left(U(b + B y) + \mu \B{1}
\right)
\end{array}\right)\;.
\] 
 \Delta y
 is obtained
from a single block tridiagonal solve (see third row of system).
Then we immediately have
 \[
\Delta u = US^{-1}(b + B(y + \Delta y)) + \frac{\mu}{s}
\] 
and
 \[
\Delta s = -s -b -B(y + \Delta y)\;.
\] 
.