\newcommand{\R}[1]{{\rm #1}}
\newcommand{\B}[1]{{\bf #1}}
|                | One web page per Section | All as one web page |
| Math in LaTeX  | ckbs.htm                 | _printable.htm      |
| Math in MathML | ckbs.xml                 | _printable.xml      |
http://www.seanet.com/~bradbell/packages.htm
http://www.coin-or.org/download/source/CoinBazaar
These releases are named

    ckbs-0.yyyymmdd.r.ext

where yyyy is the year, mm is the month, dd is the day, r is the release number, and ext is tgz or zip.
If ext is tgz, the command

    tar -xvzf ckbs-0.yyyymmdd.r.tgz

will create the ckbs-0.yyyymmdd.r directory. If ext is zip, the command

    unzip ckbs-0.yyyymmdd.r.zip

will create the ckbs-0.yyyymmdd.r directory.
To download ckbs using subversion, start with the following commands:

    mkdir ckbs
    cd ckbs
    web="https://projects.coin-or.org/svn/CoinBazaar/projects/ckbs"
For each release listed in the download directory (http://www.coin-or.org/download/source/CoinBazaar), you can use subversion to download the release with the command

    svn checkout $web/releases/0.yyyymmdd.r

This will create the ckbs/0.yyyymmdd.r directory.
You can also download the most recent version of ckbs (called the trunk) using the command

    svn checkout $web/trunk

This will create the ckbs/trunk directory.
To test ckbs, change into the directory ckbs/0.yyyymmdd.r/example or ckbs/trunk/example and use Matlab® or Octave to run the program all_ok.m. It will print the test results for all the components of ckbs.
You can run some more tests by changing into the directory ckbs/0.yyyymmdd.r/test or ckbs/trunk/test and running the program all_ok.m, which produces similar output.
A copy of the documentation for each release is included in the file ckbs/0.yyyymmdd.r/doc/ckbs.xml. If you download the latest version (the trunk), a copy of the documentation for this version can be built by executing the command

    ./build_doc.sh

in the ckbs/trunk directory. You can then view the documentation in your browser by opening the file ckbs/trunk/doc/ckbs.xml.
_contents: 1 | Table of Contents |
license: 2 | Your License to use the ckbs Software |
ckbs_t_general: 3 | The General Student's t Smoother |
ckbs_nonlinear: 4 | The Nonlinear Constrained Kalman-Bucy Smoother |
ckbs_L1_nonlinear: 5 | The Robust L1 Nonlinear Kalman-Bucy Smoother |
ckbs_affine: 6 | Constrained Affine Kalman Bucy Smoother |
ckbs_affine_singular: 7 | Singular Affine Kalman Bucy Smoother |
ckbs_L1_affine: 8 | Robust L1 Affine Kalman Bucy Smoother |
utility: 9 | ckbs Utility Functions |
all_ok.m: 10 | Run All Correctness Tests |
whatsnew: 11 | Changes and Additions to ckbs |
wishlist: 12 | List of Future Improvements to ckbs |
bib: 13 | Bibliography |
_reference: 14 | Alphabetic Listing of Cross Reference Tags |
_index: 15 | Keyword Index |
_external: 16 | External Internet References |
ckbs-0.20130204.0: Constrained/Robust Kalman-Bucy Smoothers
GNU GENERAL PUBLIC LICENSE
Version 2, June 1991
Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
License is intended to guarantee your freedom to share and change free
software--to make sure the software is free for all its users. This
General Public License applies to most of the Free Software
Foundation's software and to any other program whose authors commit to
using it. (Some other Free Software Foundation software is covered by
the GNU Lesser General Public License instead.) You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
this service if you wish), that you receive source code or can get it
if you want it, that you can change the software or use pieces of it
in new free programs; and that you know you can do these things.
To protect your rights, we need to make restrictions that forbid
anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must give the recipients all the rights that
you have. You must make sure that they, too, receive or can get the
source code. And you must show them these terms so they know their
rights.
We protect your rights with two steps: (1) copyright the software, and
(2) offer you this license which gives you legal permission to copy,
distribute and/or modify the software.
Also, for each author's protection and ours, we want to make certain
that everyone understands that there is no warranty for this free
software. If the software is modified by someone else and passed on, we
want its recipients to know that what they have is not the original, so
that any problems introduced by others will not reflect on the original
authors' reputations.
Finally, any free program is threatened constantly by software
patents. We wish to avoid the danger that redistributors of a free
program will individually obtain patent licenses, in effect making the
program proprietary. To prevent this, we have made it clear that any
patent must be licensed for everyone's free use or not licensed at all.
The precise terms and conditions for copying, distribution and
modification follow.
GNU GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. This License applies to any program or other work which contains
a notice placed by the copyright holder saying it may be distributed
under the terms of this General Public License. The "Program", below,
refers to any such program or work, and a "work based on the Program"
means either the Program or any derivative work under copyright law:
that is to say, a work containing the Program or a portion of it,
either verbatim or with modifications and/or translated into another
language. (Hereinafter, translation is included without limitation in
the term "modification".) Each licensee is addressed as "you".
Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope. The act of
running the Program is not restricted, and the output from the Program
is covered only if its contents constitute a work based on the
Program (independent of having been made by running the Program).
Whether that is true depends on what the Program does.
1. You may copy and distribute verbatim copies of the Program's
source code as you receive it, in any medium, provided that you
conspicuously and appropriately publish on each copy an appropriate
copyright notice and disclaimer of warranty; keep intact all the
notices that refer to this License and to the absence of any warranty;
and give any other recipients of the Program a copy of this License
along with the Program.
You may charge a fee for the physical act of transferring a copy, and
you may at your option offer warranty protection in exchange for a fee.
2. You may modify your copy or copies of the Program or any portion
of it, thus forming a work based on the Program, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:
a) You must cause the modified files to carry prominent notices
stating that you changed the files and the date of any change.
b) You must cause any work that you distribute or publish, that in
whole or in part contains or is derived from the Program or any
part thereof, to be licensed as a whole at no charge to all third
parties under the terms of this License.
c) If the modified program normally reads commands interactively
when run, you must cause it, when started running for such
interactive use in the most ordinary way, to print or display an
announcement including an appropriate copyright notice and a
notice that there is no warranty (or else, saying that you provide
a warranty) and that users may redistribute the program under
these conditions, and telling the user how to view a copy of this
License. (Exception: if the Program itself is interactive but
does not normally print such an announcement, your work based on
the Program is not required to print an announcement.)
These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Program,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works. But when you
distribute the same sections as part of a whole which is a work based
on the Program, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Program.
In addition, mere aggregation of another work not based on the Program
with the Program (or with a work based on the Program) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.
3. You may copy and distribute the Program (or a work based on it,
under Section 2) in object code or executable form under the terms of
Sections 1 and 2 above provided that you also do one of the following:
a) Accompany it with the complete corresponding machine-readable
source code, which must be distributed under the terms of Sections
1 and 2 above on a medium customarily used for software interchange; or,
b) Accompany it with a written offer, valid for at least three
years, to give any third party, for a charge no more than your
cost of physically performing source distribution, a complete
machine-readable copy of the corresponding source code, to be
distributed under the terms of Sections 1 and 2 above on a medium
customarily used for software interchange; or,
c) Accompany it with the information you received as to the offer
to distribute corresponding source code. (This alternative is
allowed only for noncommercial distribution and only if you
received the program in object code or executable form with such
an offer, in accord with Subsection b above.)
The source code for a work means the preferred form of the work for
making modifications to it. For an executable work, complete source
code means all the source code for all modules it contains, plus any
associated interface definition files, plus the scripts used to
control compilation and installation of the executable. However, as a
special exception, the source code distributed need not include
anything that is normally distributed (in either source or binary
form) with the major components (compiler, kernel, and so on) of the
operating system on which the executable runs, unless that component
itself accompanies the executable.
If distribution of executable or object code is made by offering
access to copy from a designated place, then offering equivalent
access to copy the source code from the same place counts as
distribution of the source code, even though third parties are not
compelled to copy the source along with the object code.
4. You may not copy, modify, sublicense, or distribute the Program
except as expressly provided under this License. Any attempt
otherwise to copy, modify, sublicense or distribute the Program is
void, and will automatically terminate your rights under this License.
However, parties who have received copies, or rights, from you under
this License will not have their licenses terminated so long as such
parties remain in full compliance.
5. You are not required to accept this License, since you have not
signed it. However, nothing else grants you permission to modify or
distribute the Program or its derivative works. These actions are
prohibited by law if you do not accept this License. Therefore, by
modifying or distributing the Program (or any work based on the
Program), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Program or works based on it.
6. Each time you redistribute the Program (or any work based on the
Program), the recipient automatically receives a license from the
original licensor to copy, distribute or modify the Program subject to
these terms and conditions. You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties to
this License.
7. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Program at all. For example, if a patent
license would not permit royalty-free redistribution of the Program by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Program.
If any portion of this section is held invalid or unenforceable under
any particular circumstance, the balance of the section is intended to
apply and the section as a whole is intended to apply in other
circumstances.
It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system, which is
implemented by public license practices. Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.
This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.
8. If the distribution and/or use of the Program is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Program under this License
may add an explicit geographical distribution limitation excluding
those countries, so that distribution is permitted only in or among
countries not thus excluded. In such case, this License incorporates
the limitation as if written in the body of this License.
9. The Free Software Foundation may publish revised and/or new versions
of the General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the Program
specifies a version number of this License which applies to it and "any
later version", you have the option of following the terms and conditions
either of that version or of any later version published by the Free
Software Foundation. If the Program does not specify a version number of
this License, you may choose any version ever published by the Free Software
Foundation.
10. If you wish to incorporate parts of the Program into other free
programs whose distribution conditions are different, write to the author
to ask for permission. For software which is copyrighted by the Free
Software Foundation, write to the Free Software Foundation; we sometimes
make exceptions for this. Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.
NO WARRANTY
11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
REPAIR OR CORRECTION.
12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
Also add information on how to contact you by electronic and paper mail.
If the program is interactive, make it output a short notice like this
when it starts in an interactive mode:
Gnomovision version 69, Copyright (C) year name of author
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, the commands you use may
be called something other than `show w' and `show c'; they could even be
mouse-clicks or menu items--whatever suits your program.
You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the program, if
necessary. Here is a sample; alter the names:
Yoyodyne, Inc., hereby disclaims all copyright interest in the program
`Gnomovision' (which makes passes at compilers) written by James Hacker.
<signature of Ty Coon>, 1 April 1989
Ty Coon, President of Vice
This General Public License does not permit incorporating your program into
proprietary programs. If your program is a subroutine library, you may
consider it more useful to permit linking proprietary applications with the
library. If this is what you want to do, use the GNU Lesser General
Public License instead of this License.
    [x_out, info] = ckbs_t_general(g_fun, h_fun, ...
        max_itr, epsilon, x_in, z, qinv, rinv, params)
\[
\begin{array}{rcl}
S ( x_1 , \ldots , x_N ) & = & \sum_{k=1}^N S_k ( x_k , x_{k-1} ) \\
S_k ( x_k , x_{k-1} ) & = &
\rho_k^p\left([ z_k - h_k ( x_k ) ]^\R{T} * R_k^{-1} * [ z_k - h_k ( x_k
) ]\right)
\\
& + &
\rho_k^m\left([ x_k - g_k ( x_{k-1} ) ]^\R{T} * Q_k^{-1} * [ x_k - g_k (
x_{k-1} ) ]\right)
\\
\end{array}
\]
where \rho_k^p and \rho_k^m are either quadratic, Student's t, or component-wise mixtures of these functions, depending on user-specified indices. The matrices R_k and Q_k are symmetric positive definite, and x_0 is the constant zero. Note that the process model g_fun should provide an initial state estimate for k = 1, and Q_1 is the corresponding covariance. The routine ckbs_t_general solves the problem
\[
\begin{array}{rll}
{\rm minimize}
& S( x_1 , \ldots , x_N )
& {\rm w.r.t.} \; x_1 \in \B{R}^n , \ldots , x_N \in \B{R}^n
\end{array}
\]
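To make the objective concrete, here is a small Python sketch that evaluates S(x_1, ..., x_N) for given model callbacks. It is an illustration, not the ckbs implementation: the function names, the x_0 = 0 convention, and the simple loss choices rho(s) = s (quadratic) and rho(s) = log(1 + s/df) (a Student's t style loss, up to ckbs's exact scaling) are assumptions.

```python
import numpy as np

def quad_rho(s, df=None):
    # quadratic case: rho(s) = s
    return s

def student_t_rho(s, df):
    # Student's t style loss (the exact scaling in ckbs may differ)
    return np.log(1.0 + s / df)

def objective(x, z, g, h, Qinv, Rinv, rho_p, rho_m, df):
    # S(x_1, ..., x_N) = sum over k of
    #   rho_p([z_k - h_k(x_k)]' R_k^{-1} [z_k - h_k(x_k)])
    # + rho_m([x_k - g_k(x_{k-1})]' Q_k^{-1} [x_k - g_k(x_{k-1})])
    # with the convention that x_0 is the constant zero.
    S = 0.0
    x_prev = np.zeros_like(x[0])
    for k in range(len(z)):
        r_meas = z[k] - h(k, x[k])      # z_k - h_k(x_k)
        r_proc = x[k] - g(k, x_prev)    # x_k - g_k(x_{k-1})
        S += rho_p(r_meas @ Rinv[k] @ r_meas, df)
        S += rho_m(r_proc @ Qinv[k] @ r_proc, df)
        x_prev = x[k]
    return S
```

With identity models g(k, x) = x and h(k, x) = x and unit precision matrices, the quadratic version reduces to an ordinary sum of squared residuals.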
The ckbs_t_general argument g_fun is a function that supports both of the following syntaxes:

    [gk] = feval(g_fun, k, xk1)
    [gk, Gk] = feval(g_fun, k, xk1)

The g_fun argument k is an integer with 1 \leq k \leq N. The case k = 1 serves to specify the initial state estimate.
The g_fun argument xk1 is a column vector with length n. It specifies the state vector at time index k-1,
\[
xk1 = x_{k-1}
\]
In the case k = 1, the value of xk1 does not matter.
The g_fun result gk is a column vector with length n and
\[
gk = g_k ( x_{k-1} )
\]
In the case k = 1, the value of gk is the initial state estimate at time index k.
If the g_fun result Gk is present in the syntax, it is the n \times n matrix given by
\[
Gk = \partial_{k-1} g_k ( x_{k-1} )
\]
In the case k = 1, the value of Gk is the zero matrix; i.e., g_k does not depend on the value of x_{k-1}.
The ckbs_t_general argument h_fun is a function that supports both of the following syntaxes:

    [hk] = feval(h_fun, k, xk)
    [hk, Hk] = feval(h_fun, k, xk)

The h_fun argument k is an integer with 1 \leq k \leq N.
The h_fun argument xk is a column vector with length n. It specifies the state vector at time index k,
\[
xk = x_k
\]
The h_fun result hk is a column vector with length m and
\[
hk = h_k ( x_k )
\]
If the h_fun result Hk is present in the syntax, it is the m \times n matrix given by
\[
Hk = \partial_k h_k ( x_k )
\]
The ckbs_t_general argument max_itr specifies the maximum number of iterations of the algorithm to execute. It must be greater than or equal to zero. Note that if it is zero, the first row of the info return value will still be computed. This can be useful for deciding what is a good value for the argument epsilon.
The ckbs_t_general argument epsilon is a positive scalar. It specifies the convergence criterion; i.e.,
\[
\varepsilon = epsilon
\]
The ckbs_t_general argument x_in is a two-dimensional array with size n \times N. It specifies a sequence of state values; i.e., for k = 1 , \ldots , N
\[
x\_in (:, k) = x_k
\]
The closer the initial state sequence is to the solution, the faster, and more likely, ckbs_t_general will converge. The initial state sequence need not be feasible; i.e., it is not necessary that
\[
f_k ( x_k ) \leq 0
\]
for all k.
The ckbs_t_general argument z is a two-dimensional array with size m \times N. It specifies the sequence of measurement vectors; i.e., for k = 1 , \ldots , N
\[
z(:, k) = z_k
\]
The ckbs_t_general argument qinv is a three-dimensional array with size n \times n \times N. It specifies the inverse of the variance of the transition noise; i.e., for k = 1 , \ldots , N
\[
qinv(:,:, k) = Q_k^{-1}
\]
In the case k = 1, the value of Q_k is the variance of the initial state estimate (see g_fun).
The ckbs_t_general argument rinv is a three-dimensional array with size m \times m \times N. It specifies the inverse of the variance of the measurement noise; i.e., for k = 1 , \ldots , N
\[
rinv(:,:, k) = R_k^{-1}
\]
It is okay to signify a missing data value by setting the corresponding row and column of rinv to zero. This also makes it possible to interpolate the state vector x_k to points where there are no measurements.
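As an illustration, here is a small Python sketch (in Python rather than Matlab; the sizes and sigma are invented) of marking every other time point as missing by leaving the corresponding rinv slice at zero:

```python
import numpy as np

m, N = 1, 6
sigma = 0.5
rinv = np.zeros((m, m, N))        # rinv[:, :, k] holds R_k^{-1}
for k in range(N):
    if k % 2 == 0:                # measurement present at even k only
        rinv[:, :, k] = 1.0 / sigma**2
# Slices left at zero signal missing data; the smoother can then
# interpolate x_k at those time points from the process model alone.
```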
The ckbs_t_general argument params is a structure that specifies auxiliary parameters required by the smoother. The structure is passed to the objective, gradient, and Hessian functions; some parameter values are required for these to work correctly (see below). The level field sets the output level. If level is greater than 1, numerical derivatives are computed.
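As an illustration of such a numerical-derivative check, this Python sketch (a stand-in, not the ckbs code) compares an analytic Jacobian Gk against a central-difference approximation for a linear position-velocity transition model; the model, point, and step size are assumptions:

```python
import numpy as np

def finite_diff_jacobian(fun, x, step=1e-6):
    # Central-difference Jacobian of fun: R^n -> R^m evaluated at x.
    x = np.asarray(x, dtype=float)
    m = np.asarray(fun(x)).size
    J = np.zeros((m, x.size))
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = step
        J[:, j] = (np.asarray(fun(x + e)) - np.asarray(fun(x - e))) / (2.0 * step)
    return J

dt = 0.01
G_analytic = np.array([[1.0, dt], [0.0, 1.0]])   # position-velocity transition
g = lambda x: G_analytic @ x                     # g_k(x_{k-1})
J = finite_diff_jacobian(g, np.array([1.0, -2.0]))
```

For a linear model the central difference recovers the analytic Jacobian up to floating-point rounding, so the two matrices should agree closely.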
The dif_proc field is the Student's t degrees-of-freedom parameter used for any Student's t process component. The dif_meas field is the Student's t degrees-of-freedom parameter used for any Student's t measurement component.

The inds_proc_st field lists the indices of the state (process residual) that are to be modeled using Student's t. Setting this field to the empty matrix [] means the quadratic model is used for the entire state. The inds_meas_st field lists the indices of the measurement residual that are to be modeled using Student's t. Setting this field to the empty matrix [] means the quadratic model is used for the entire measurement.
The result x_out is a two-dimensional array with size n \times N containing an approximation for the optimal sequence ( x_1 , \ldots , x_N ); i.e., for k = 1 , \ldots , N
\[
x\_out(:, k) = x_k
\]
The result info is a matrix with each row corresponding to an iteration of the algorithm. Note that ckbs_t_general has satisfied the convergence condition if and only if

    all( info(end, 1) <= epsilon )

or

    all( info(end, 3) <= epsilon )
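A minimal Python sketch of this convergence test follows; the meaning of columns 1 and 3 is taken from the statement above, and the sample iteration numbers are invented:

```python
import numpy as np

# Hypothetical info matrix: one row per iteration of the algorithm.
info = np.array([
    [1.0e-1, 0.5, 2.0e-1],
    [5.0e-4, 0.1, 2.0e-3],
])
epsilon = 1.0e-3
# Converged if the last row satisfies either condition.
converged = (info[-1, 0] <= epsilon) or (info[-1, 2] <= epsilon)
```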
The file t_general_ok.m contains an example and test of ckbs_t_general. It returns true if ckbs_t_general passes the test (meaning the algorithm converges to a stationary point) and false otherwise.

    [ok] = t_general_ok(draw_plot, conVar, conFreq, noise)

If ok is true, the test passed; otherwise the test failed.
x1 (t)
| value of the function we are estimating |
x2 (t)
| derivative of the function we are estimating |
z1 (t)
| value of x1 (t) plus noise |
function [ok] = t_general_ok(draw_plot, conVar, conFreq, noise)
% --------------------------------------------------------
N = 100; % number of measurement time points
dt = 4*pi / N; % time between measurement points
%dt = 1;
gamma = 1; % transition covariance multiplier
sigma = 0.5; % standard deviation of measurement noise
if (strcmp(noise,'student'))
sigma = sqrt(2);
end
max_itr = 100; % maximum number of iterations
epsilon = 1e-3; % convergence criteria
h_min = 0; % minimum horizontal value in plots
h_max = 14; % maximum horizontal value in plots
v_min = -10.0; % minimum vertical value in plots
v_max = 10.0; % maximum vertical value in plots
% ---------------------------------------------------------
if nargin < 1
draw_plot = false;
end
% number of measurements per time point
m = 1;
% number of state vector components per time point
n = 2;
params.level = 1;
% simulate the true trajectory and measurement noise
t1 = (1 : N/2 - 2) * dt;
t2 = (N/2 - 1: N/2 + 1)*dt;
t3 = (N/2 +2 : N)*dt;
t = [t1, t2, t3];
% alpha = 8;
% beta = 4;
% gammaOne = 20;
jump = 10;
derivs = jump/(4*dt) * ones(1,3);
t2 = [-cos(t1), derivs, -cos(t3)];
t1 = [-jump/2-sin(t1), -jump/2-sin(N/2 - 2) + cumsum(derivs)*dt, jump/2-sin(t3)];
x1_true = t1(1:N);
x2_true = t2(1:N);
x_true = [ x1_true ; x2_true ];
%
params.direct_h_index = 1;
h_fun = @(k,x) direct_h(k,x,params);
params.pos_vel_g_dt = .01;
params.pos_vel_g_initial = x_true(:,1); % initialize to true values for now
g_fun = @(k,x) pos_vel_g(k,x,params);
df_meas = 4;
df_proc = 4;
params.df_proc = df_proc;
params.df_meas = df_meas;
params.inds_proc_st = 1:2;
params.inds_meas_st = [];
% measurement values and model
%exp_true = random('exponential', 1/sigma, 1, N);
%bin = 2*random('binomial', 1, 0.5, 1, N) - 1;
%exp_true = exp_true .* bin;
%z = x2_true + exp_true;
% Normal Noise
v_true = zeros(1,N);
%temp = zeros(1,N);
%bin = zeros(1,N);
randn('state', sum(100*clock))
rand('twister', sum(100*clock));
if (strcmp(noise, 'normal'))
v_true = sigma * randn(1, N);
end
%Laplace Noise
if (strcmp(noise,'laplace'))
temp = sigma*exprnd(1/sigma, 1, N);
bin = 2*(binornd(1, 0.5, 1, N)-0.5);
v_true = temp.*bin/sqrt(2);
end
if (strcmp(noise,'student'))
v_true = trnd(4, 1, N);
end
nI= binornd(1, conFreq, 1, N);
newNoise = sqrt(conVar)*randn(1,N);
z = x1_true + v_true + nI.*newNoise;
rk = sigma * sigma;
rinvk = 1 / rk;
rinv = zeros(m, m, N);
for k = 1 : N
if (mod(k, 2) == 0)
rinv(:, :, k) = rinvk;
end
end
% transition model
qinv = zeros(n, n, N);
qk = gamma * [ dt , dt^2/2 ; dt^2/2 , dt^3/3 ];
%qk = gamma * [ dt^3/3 , dt^2/2 ; dt^2/2 , dt ];
qinvk = inv(qk);
for k = 2 : N
qinv(:,:, k) = qinvk;
end
qinv(:,:,1) = 100*eye(n);
%
xZero = zeros(n, N);
% L1 Kalman-Bucy Smoother
w = 1;
[xOut, info] = ckbs_t_general(g_fun, h_fun, max_itr, epsilon, xZero, z, qinv, w*rinv, params);
ok = info(end, 3) < epsilon || info(1) < 1e-8;
% % --------------------------------------------------------------------------
%
if draw_plot
figure(1);
clf
hold on
plot(t', x_true(1,:)', '-k', 'Linewidth', 2, 'MarkerEdgeColor', 'k' );
meas = z(1,:);
meas(meas > v_max) = v_max;
meas(meas < v_min) = v_min;
plot(t', meas, 'bd', 'MarkerFaceColor', [.49 1 .63], 'MarkerSize', 6);
%zWrong = z.*nI;
%i = 0;
%for(i = 1:N)
% if (zWrong(1,i) ~=0)
% plot(t(i)', zWrong(1,i)', 'rs', 'MarkerFaceColor', [.49 1 .63], 'MarkerSize', 4);
% end
%end
%plot(t', xOut(2,:)', 'r-' );
%plot(t', xfilt(2,:)', '-b', 'Linewidth', 2, 'MarkerEdgeColor', 'k');
plot(t', xOut(1,:)', '-m', 'Linewidth', 2, 'MarkerEdgeColor', 'k' );
% plot(t', xOut2(2,:)', '-r', 'Linewidth', 2, 'MarkerEdgeColor', 'k');
% plot(t', xOut3(2,:)', '-y', 'Linewidth', 2, 'MarkerEdgeColor', 'k');
% plot(t', xOut4(2,:)', '-g', 'Linewidth', 2, 'MarkerEdgeColor', 'k');
% plot(t', xOut5(2,:)', '-b', 'Linewidth', 2, 'MarkerEdgeColor', 'k');
%plot(t', xsmooth(2,:)', '-m', 'Linewidth', 2, 'MarkerEdgeColor', 'k');
%plot(t', xfilt(2,:)', '-b', 'Linewidth', 2, 'MarkerEdgeColor', 'k');
% plot(t', - ones(N,1), 'b-');
% plot(t', ones(N,1), 'b-');
% xlabel('Time', 14);
% ylabel('Position', 'Fontsize', 14);
% set(gca,'FontSize',14);
axis([h_min, h_max, v_min, v_max]);
hold off
end
[ok] = t_general_noisy_jump(draw_plot, conVar, conFreq, proc_ind, meas_ind)
If
ok
is true, the test passed;
otherwise the test failed.
x1 (t)
| value of function we are estimating |
x2 (t)
| derivative of function we are estimating |
z1 (t)
value of
x1 (t)
plus noise
function [ok]= t_general_noisy_jump(draw_plot, conVar, conFreq, proc_ind, meas_ind)
noise = 'normal';
if(nargin < 5)
conFreq = 0;
conVar = 0;
proc_ind = 1:2;
meas_ind = [];
end
% --------------------------------------------------------
N = 100; % number of measurement time points
dt = 4*pi / N; % time between measurement points
%dt = 1;
gamma = 1; % transition covariance multiplier
sigma = 0.5; % standard deviation of measurement noise
if (strcmp(noise,'student'))
sigma = sqrt(2);
end
max_itr = 100; % maximum number of iterations
epsilon = 1e-3; % convergence criteria
h_min = 0; % minimum horizontal value in plots
h_max = 14; % maximum horizontal value in plots
v_min = -10.0; % minimum vertical value in plots
v_max = 10.0; % maximum vertical value in plots
% ---------------------------------------------------------
if nargin < 1
draw_plot = false;
end
% number of measurements per time point
m = 1;
% number of state vector components per time point
n = 2;
params.level = 1;
% simulate the true trajectory and measurement noise
t1 = (1 : N/2 - 2) * dt;
t2 = (N/2 - 1: N/2 + 1)*dt;
t3 = (N/2 +2 : N)*dt;
t = [t1, t2, t3];
% alpha = 8;
% beta = 4;
% gammaOne = 20;
jump = 10;
derivs = jump/(4*dt) * ones(1,3);
t2 = [-cos(t1), derivs, -cos(t3)];
t1 = [-jump/2-sin(t1), -jump/2-sin(N/2 - 2) + cumsum(derivs)*dt, jump/2-sin(t3)];
x1_true = t1(1:N);
x2_true = t2(1:N);
x_true = [ x1_true ; x2_true ];
%
params.direct_h_index = 1;
h_fun = @(k,x) direct_h(k,x,params);
params.pos_vel_g_dt = .01;
params.pos_vel_g_initial = x_true(:,1); % initialize to true values for now
g_fun = @(k,x) pos_vel_g(k,x,params);
df_meas = 4;
df_proc = 4;
params.df_proc = df_proc;
params.df_meas = df_meas;
params.inds_proc_st = proc_ind;
params.inds_meas_st = meas_ind;
% measurement values and model
%exp_true = random('exponential', 1/sigma, 1, N);
%bin = 2*random('binomial', 1, 0.5, 1, N) - 1;
%exp_true = exp_true .* bin;
%z = x2_true + exp_true;
% Normal Noise
v_true = zeros(1,N);
%temp = zeros(1,N);
%bin = zeros(1,N);
randn('state', sum(100*clock))
rand('twister', sum(100*clock));
if (strcmp(noise, 'normal'))
v_true = sigma * randn(1, N);
end
%Laplace Noise
if (strcmp(noise,'laplace'))
temp = sigma*exprnd(1/sigma, 1, N);
bin = 2*(binornd(1, 0.5, 1, N)-0.5);
v_true = temp.*bin/sqrt(2);
end
if (strcmp(noise,'student'))
v_true = trnd(4, 1, N);
end
nI= binornd(1, conFreq, 1, N);
newNoise = sqrt(conVar)*randn(1,N);
z = x1_true + v_true + nI.*newNoise;
rk = sigma * sigma;
rinvk = 1 / rk;
rinv = zeros(m, m, N);
for k = 1 : N
if (mod(k, 2) == 0)
rinv(:, :, k) = rinvk;
end
end
% transition model
qinv = zeros(n, n, N);
qk = gamma * [ dt , dt^2/2 ; dt^2/2 , dt^3/3 ];
%qk = gamma * [ dt^3/3 , dt^2/2 ; dt^2/2 , dt ];
qinvk = inv(qk);
for k = 2 : N
qinv(:,:, k) = qinvk;
end
qinv(:,:,1) = 100*eye(n);
%
xZero = zeros(n, N);
% L1 Kalman-Bucy Smoother
w = 1;
[xOut, info] = ckbs_t_general(g_fun, h_fun, max_itr, epsilon, xZero, z, qinv, w*rinv, params);
ok = info(end, 3) < epsilon || info(1) < 1e-8;
% % --------------------------------------------------------------------------
%
if draw_plot
figure(1);
clf
hold on
plot(t', x_true(1,:)', '-k', 'Linewidth', 2, 'MarkerEdgeColor', 'k' );
meas = z(1,:);
meas(meas > v_max) = v_max;
meas(meas < v_min) = v_min;
rvec = rinv(1,:);
meas(rvec==0) = v_max + 5; % take non-measurements out
plot(t', meas, 'bd', 'MarkerFaceColor', [.49 1 .63], 'MarkerSize', 6);
%zWrong = z.*nI;
%i = 0;
%for(i = 1:N)
% if (zWrong(1,i) ~=0)
% plot(t(i)', zWrong(1,i)', 'rs', 'MarkerFaceColor', [.49 1 .63], 'MarkerSize', 4);
% end
%end
%plot(t', xOut(2,:)', 'r-' );
%plot(t', xfilt(2,:)', '-b', 'Linewidth', 2, 'MarkerEdgeColor', 'k');
plot(t', xOut(1,:)', '-m', 'Linewidth', 2, 'MarkerEdgeColor', 'k' );
% plot(t', xOut2(2,:)', '-r', 'Linewidth', 2, 'MarkerEdgeColor', 'k');
% plot(t', xOut3(2,:)', '-y', 'Linewidth', 2, 'MarkerEdgeColor', 'k');
% plot(t', xOut4(2,:)', '-g', 'Linewidth', 2, 'MarkerEdgeColor', 'k');
% plot(t', xOut5(2,:)', '-b', 'Linewidth', 2, 'MarkerEdgeColor', 'k');
%plot(t', xsmooth(2,:)', '-m', 'Linewidth', 2, 'MarkerEdgeColor', 'k');
%plot(t', xfilt(2,:)', '-b', 'Linewidth', 2, 'MarkerEdgeColor', 'k');
% plot(t', - ones(N,1), 'b-');
% plot(t', ones(N,1), 'b-');
% xlabel('Time', 14);
% ylabel('Position', 'Fontsize', 14);
% set(gca,'FontSize',14);
axis([h_min, h_max, v_min, v_max]);
res_x = zeros(n, N);
xk = zeros(n, 1);
for k = 1 : N
xkm = xk;
xk = xOut(:, k);
gk = feval(g_fun, k, xkm);
res_x(:,k) = xk - gk;
end
% hold off
% figure(2);
% clf
% hold on
% plot(t', x_true(2,:)', '-k', 'Linewidth', 2, 'MarkerEdgeColor', 'k' );
% plot(t', xOut(2,:), '-m', 'Linewidth', 2, 'MarkerEdgeColor', 'k' );
% hold off
%
% figure(3);
% clf
% hold on
%
% plot(t', res_x(1,:)', '-m', 'Linewidth', 2, 'MarkerEdgeColor', 'k' );
%
% hold off
%
% figure(4)
% clf
% hold on
% plot(t', res_x(2,:)', '-b', 'Linewidth', 2, 'MarkerEdgeColor', 'k' );
% hold off
end
[x_out, u_out, info] = ckbs_nonlinear(
f_fun, g_fun, h_fun, ...
max_itr, epsilon, x_in, z, qinv, rinv, level)
\[
\begin{array}{rcl}
S ( x_1 , \ldots , x_N ) & = & \sum_{k=1}^N S_k ( x_k , x_{k-1} ) \\
S_k ( x_k , x_{k-1} ) & = &
\frac{1}{2}
[ z_k - h_k ( x_k ) ]^\R{T} * R_k^{-1} * [ z_k - h_k ( x_k ) ]
\\
& + &
\frac{1}{2}
[ x_k - g_k ( x_{k-1} ) ]^\R{T} * Q_k^{-1} * [ x_k - g_k ( x_{k-1} ) ]
\\
\end{array}
\]
where the matrices
R_k
and
Q_k
are
symmetric positive definite and
x_0
is the constant zero.
Note that
g_1 ( x_0 )
is the initial state estimate
and
Q_1
is the corresponding covariance.
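As a sketch, the objective above can be evaluated directly from the model callbacks; the function below is illustrative only (the names g_fun, h_fun, qinv, and rinv follow the argument descriptions in this section):

```matlab
% illustrative evaluation of the objective S(x_1, ..., x_N);
% x is n x N, z is m x N, qinv and rinv hold Q_k^{-1} and R_k^{-1}
function [S] = sumsq_obj(x, z, g_fun, h_fun, qinv, rinv)
	N   = size(x, 2);
	S   = 0;
	xk1 = zeros(size(x, 1), 1);                % x_0 is the constant zero
	for k = 1 : N
		xk    = x(:, k);
		res_z = z(:, k) - feval(h_fun, k, xk); % measurement residual
		res_x = xk - feval(g_fun, k, xk1);     % transition residual
		S     = S + res_z' * rinv(:,:, k) * res_z / 2 ...
		          + res_x' * qinv(:,:, k) * res_x / 2;
		xk1   = xk;
	end
end
```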
\[
\begin{array}{rll}
{\rm minimize}
& S( x_1 , \ldots , x_N )
& {\rm w.r.t.} \; x_1 \in \B{R}^n , \ldots , x_N \in \B{R}^n
\\
{\rm subject \; to}
& f_k(x_k) \leq 0
& {\rm for} \; k = 1 , \ldots , N
\end{array}
\]
( x_1 , \ldots , x_N )
is considered a solution
if there is a Lagrange multiplier sequence
( u_1 , \ldots , u_N )
such that the following conditions are satisfied for
i = 1 , \ldots , \ell
and
k = 1 , \ldots , N
:
\[
\begin{array}{rcl}
f_k ( x_k ) & \leq & \varepsilon \\
0 & \leq & u_k \\
| u_k^\R{T} * \partial_k f_k ( x_k ) + \partial_k S( x_1 , \ldots , x_N ) |
& \leq & \varepsilon \\
| f_k ( x_k )_i | * ( u_k )_i & \leq & \varepsilon
\end{array}
\]
Here the notation
\partial_k
denotes the partial derivative with respect to
x_k
and the notation
(u_k)_i
denotes the i-th component of
u_k
.
ckbs_nonlinear
argument
f_fun
is a function that supports both of the
following syntaxes
[fk] = feval(f_fun, k, xk)
[fk, Fk] = feval(f_fun, k, xk)
If the problem has no constraints, f_fun can return a constraint
that is always satisfied; e.g.,
function [fk, Fk] = f_fun(k, xk)
n = size(xk, 1);
fk = -1;
Fk = zeros(1, n);
return
end
f_fun
argument
k
is an integer with
1 \leq k \leq N
.
f_fun
argument
xk
is a column vector with
length
n
. It specifies the state vector at time index
k
\[
xk = x_k
\]
.
f_fun
result
fk
is a column vector with
length
\ell
and
\[
fk = f_k ( x_k )
\]
If the f_fun
result
Fk
is present in the syntax,
it is the
\ell \times n
matrix given by
\[
Fk = \partial_k f_k ( x_k )
\]
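For example, box constraints of the form lower <= xk <= upper can be put in the required form f_k ( x_k ) \leq 0; the callback below is a sketch (the example routine box_f.m plays this role, with the bounds passed through params):

```matlab
% illustrative box-constraint callback: lower <= xk <= upper
% rewritten as f_k(x_k) <= 0 with ell = 2*n components
function [fk, Fk] = box_constraint_f(k, xk, lower, upper)
	n  = size(xk, 1);
	fk = [ lower - xk ; xk - upper ];  % <= 0 exactly when bounds hold
	Fk = [ -eye(n) ; eye(n) ];         % Jacobian of fk w.r.t. xk
end
```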
ckbs_nonlinear
argument
g_fun
is a function that supports both of
the following syntaxes
[gk] = feval(g_fun, k, xk1)
[gk, Gk] = feval(g_fun, k, xk1)
g_fun
argument
k
is an integer with
1 \leq k \leq N
.
The case
k = 1
serves to specify the initial state estimate.
g_fun
argument
xk1
is a column vector with
length
n
. It specifies the state vector at time index
k
\[
xk1 = x_{k-1}
\]
.
In the case
k = 1
, the value of
xk1
does not matter.
g_fun
result
gk
is a column vector with
length
n
and
\[
gk = g_k ( x_{k-1} )
\]
In the case
k = 1
, the value of
gk
is the initial state
estimate at time index
k
.
If the g_fun
result
Gk
is present in the syntax,
it is the
n \times n
matrix given by
\[
Gk = \partial_{k-1} g_k ( x_{k-1} )
\]
In the case
k = 1
, the value of
Gk
is the zero matrix;
i.e.,
g_k
does not depend on the value of
x_{k-1}
.
ckbs_nonlinear
argument
h_fun
is a function that supports both of the
following syntaxes
[hk] = feval(h_fun, k, xk)
[hk, Hk] = feval(h_fun, k, xk)
h_fun
argument
k
is an integer with
1 \leq k \leq N
.
h_fun
argument
xk
is a column vector with
length
n
. It specifies the state vector at time index
k
\[
xk = x_k
\]
.
h_fun
result
hk
is a column vector with
length
m
and
\[
hk = h_k ( x_k )
\]
If the h_fun
result
Hk
is present in the syntax,
it is the
m \times n
matrix given by
\[
Hk = \partial_k h_k ( x_k )
\]
max_itr
specifies the maximum number of
iterations of the algorithm to execute. It must be greater than or
equal to zero. Note that if it is zero, the first row of the
4.r: info
return value will still be computed.
This can be useful for deciding what is a good value for the argument
4.j: epsilon
.
ckbs_nonlinear
argument
epsilon
is a positive scalar.
It specifies the convergence
criteria value; i.e.,
\[
\varepsilon = epsilon
\]
In some instances, ckbs_nonlinear
will return after printing
a warning without convergence; see 4.r: info
.
ckbs_nonlinear
argument
x_in
is a two dimensional array with size
n \times N
.
It specifies a sequence of state values; i.e.,
for
k = 1 , \ldots , N
\[
x\_in (:, k) = x_k
\]
The closer the initial state sequence is to the solution,
the faster, and the more likely it is, that ckbs_nonlinear
will converge.
The initial state sequence need not be feasible; i.e.
it is not necessary that
\[
f_k ( x_k ) \leq 0
\]
for all
k
.
ckbs_nonlinear
argument
z
is a two dimensional array
with size
m \times N
.
It specifies the sequence of measurement vectors; i.e.,
for
k = 1 , \ldots , N
\[
z(:, k) = z_k
\]
ckbs_nonlinear
argument
qinv
is a three dimensional array
with size
n \times n \times N
.
It specifies the inverse of the variance of the transition noise; i.e.,
for
k = 1 , \ldots , N
\[
qinv(:,:, k) = Q_k^{-1}
\]
In the case
k = 1
, the value of
Q_k
is the variance
of the initial state estimate (see 4.g: g_fun )
.
ckbs_nonlinear
argument
rinv
is a three dimensional array,
with size
m \times m \times N
.
It specifies the inverse of the variance of the measurement noise; i.e.,
for
k = 1 , \ldots , N
\[
rinv(:,:, k) = R_k^{-1}
\]
It is ok to signify a missing data value by setting the corresponding
row and column of
rinv
to zero. This also enables one
to interpolate the state vector
x_k
to points where
there are no measurements.
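The test code earlier in this section uses exactly this convention; a minimal sketch, assuming a measurement is available only at even time indices:

```matlab
% zero rinv blocks mark time points with no measurement
m     = 1;  N = 100;
sigma = 0.5;                       % measurement standard deviation
rinv  = zeros(m, m, N);
for k = 1 : N
	if mod(k, 2) == 0              % measurement present only at even k
		rinv(:, :, k) = 1 / (sigma * sigma);
	end
end
```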
level
specifies the level of tracing to do.
If
level == 0
, no tracing is done.
If
level >= 1
, the row index
q
and the corresponding row of
info
are printed as soon as they are computed.
If
level >= 2
, a check of the derivative calculations
is printed before the calculations start. In this case, control will
return to the keyboard so that you can print values, continue,
or abort the calculation.
x_out
is a two dimensional array with size
n \times N
.
It contains an approximation for the optimal state sequence
( x_1 , \ldots , x_N )
; i.e.,
for
k = 1 , \ldots , N
\[
x\_out(:, k) = x_k
\]
u_out
is a two dimensional array with size
\ell \times N
.
It contains the Lagrange multiplier sequence
corresponding to
x_out
; i.e.,
for
k = 1 , \ldots , N
\[
u\_out(:, k) = u_k
\]
The pair
x_out
,
u_out
satisfy the
4.e: first order conditions
.
info
is a matrix with each row corresponding
to an iteration of the algorithm.
Note that ckbs_nonlinear
has satisfied the convergence condition if
and only if
all( info(end, 1:3) <= epsilon )
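In code, this convergence test is a one-line check (assuming info and epsilon as above):

```matlab
% true if and only if ckbs_nonlinear satisfied its convergence condition
converged = all( info(end, 1:3) <= epsilon );
```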
We use
\{ x_k^q \}
to denote the state vector sequence at the
end of iteration
q-1
and
\{ u_k^q \}
for the dual vector sequence (
u_k^q \geq 0
).
max_f(q) = info(q, 1)
is a bound on the constraint functions
at the end of iteration
q-1
. To be specific
\[
max\_f(q) = \max_{i, k} [ f_k ( x_k^q )_i ]
\]
where the maximum is with respect to
i = 1 , \ldots , \ell
and
k = 1 , \ldots , N
.
Convergence requires this value to be less than or equal to
epsilon
.
max_grad(q) = info(q, 2)
is a bound on the partial of the
Lagrangian with respect to
x
at the end of iteration
q-1
. To be specific
\[
max\_grad(q) = \max_{j, k} \left[ \left|
(u_k^q)^\R{T} * \partial_k f_k(x_k^q) + \partial_k S(x_1^q, \ldots ,x_N^q)
\right| \right]_j
\]
where the maximum is with respect to
j = 1 , \ldots , n
and
k = 1 , \ldots , N
.
This value should converge to zero.
max_fu(q) = info(q, 3)
is a bound on
the complementarity conditions
at the end of iteration
q-1
. To be specific
\[
max\_fu(q) = \max_{i, k} [ | f_k ( x_k^q )_i | * ( u_k^q )_i ]
\]
where the maximum is with respect to
i = 1 , \ldots , \ell
and
k = 1 , \ldots , N
.
This value should converge to zero.
obj_cur(q) = info(q, 4)
is the value of the objective function
at the end of iteration
q-1
; i.e.,
\[
obj\_cur(q) = S( x_1^q , \ldots , x_N^q )
\]
(Note that
info(1, 4)
is the value of the objective
corresponding to
x_in
.)
lambda(q) = info(q, 5)
is the line search step size used during
iteration
q-1
of ckbs_nonlinear
.
If the problem is nearly affine (i.e., the affine approximation is accurate),
this will be one.
Much smaller step sizes indicate a highly non-affine problem
(with the exception that
info(1, 5)
is always zero).
lam_aff(q) = info(q, 6)
is the minimum line search step size
while solving the affine sub-problem that corresponds to
iteration
q-1
of ckbs_nonlinear
.
If the sub-problem solution is working well,
this will be one.
Much smaller step sizes indicate the sub-problem solution is not working well.
\[
S( x_1 , \ldots , x_N )
+ \alpha \sum_{k=1}^N \sum_{i=1}^\ell \max [ f_k ( x_k )_i , 0 ]
\]
is used to determine the line search step size.
The value
info(q, 7)
is the penalty parameter
\alpha
during iteration
q-1
of ckbs_nonlinear
.
ckbs_nonlinear
.
It returns true if ckbs_nonlinear
passes the tests
and false otherwise.
constraint = 'no_constraint'
to the routine
4.2: sine_wave_ok.m
runs an unconstrained example / test.
constraint = 'box_constraint'
to the routine
4.2: sine_wave_ok.m
runs an example / test
with upper and lower constraints.
constraint = 'sine_constraint'
to the routine
4.2: sine_wave_ok.m
runs an example / test with a nonlinear constraint.
ckbs_nonlinear
estimating the position of a
Van Der Pol oscillator (an oscillator with non-linear dynamics).
It returns true if ckbs_nonlinear
passes the tests
and false otherwise.
[ok] = get_started_ok(draw_plot)
example
directory,
start octave
(or matlab
), and then
execute the commands
test_path
get_started_ok(true)
draw_plot
is not present or false, no plot is drawn.
ok
is true, the test passed;
otherwise the test failed.
\B{R}^n
and the sequence of state vector values is simulated with
x_{j,k} = 1
for
j = 1 , \ldots , n
and
k = 1 , \ldots , N
.
\[
x_k = x_{k-1} + w_k
\]
where
w_k \sim \B{N} ( 0 , Q_k )
and the variance
Q_k = I_n
(and
I_n
is the
n \times n
identity matrix).
The routine 4.4.5: persist_g.m
calculates the mean for
x_k
given
x_{k-1}
.
Note that the initial state estimate is returned by this routine as
\hat{x} = g_1 (x)
and has
\hat{x}_{j,k} = 1
for
j = 1 , \ldots , n
.
Also note that the corresponding variance
Q_1 = I_n
.
\[
z_k = x_k + v_k
\]
where
v_k \sim \B{N} ( 0 , R_k )
and the variance
R_k = I_n
.
The routine 4.4.2: direct_h.m
calculates the mean for
z_k
,
given the value of
x_k
.
\[
0.5 \leq x_{k,j} \leq 1.5
\]
for
j = 1 , \ldots , n
and
k = 1 , \ldots , N
.
The routine 4.4.1: box_f.m
represents these constraints in the form
f_k ( x_k ) \leq 0
as expected by 4: ckbs_nonlinear
.
4.4.5: persist_g.m | ckbs_nonlinear: Example of Persistence Transition Function |
4.4.2: direct_h.m | ckbs_nonlinear: Example Direct Measurement Model |
4.4.1: box_f.m | ckbs_nonlinear: Example of Box Constraints |
function [ok] = get_started_ok(draw_plot)
if nargin < 1
draw_plot = false;
end
% --------------------------------------------------------------------------
max_itr = 20; % maximum number of iterations
epsilon = 1e-5; % convergence criteria
N = 40; % number of time points
n = 1; % number of state vector components per time point
sigma_r = 1; % standard deviation of measurement noise
sigma_q = 1; % standard deviation of transition noise
sigma_e = 1; % standard deviation of initial state estimate
initial = ones(n, 1); % estimate for initial state value
randn('seed', 4321)
% ---------------------------------------------------------------------------
% Level of tracing during optimization
if draw_plot
level = 1;
else
level = 0;
end
m = n; % number of measurements per time point
ell = 2 * n; % number of constraints
index = 1 : n; % index vector
% ---------------------------------------------------------------------------
% information used by box_f
params.box_f_lower = 0.5 * ones(n, 1);
params.box_f_upper = 1.5 * ones(n, 1);
params.box_f_index = index;
% ---------------------------------------------------------------------------
% global information used by persist_g
params.persist_g_initial = initial;
% ---------------------------------------------------------------------------
% global information used by direct_h
%global direct_h_info
params.direct_h_index = index;
h_fun = @(k,x) direct_h(k,x,params);
% ---------------------------------------------------------------------------
% simulated true trajectory
x_true = ones(n, N);
% initial vector during optimization process
x_in = 0.5 * ones(n, N);
% measurement variances
rinv = zeros(m, m, N);
qinv = zeros(n, n, N);
z = zeros(m, N);
for k = 1 : N
xk = x_true(:, k);
ek = randn(m, 1);
hk = h_fun(k, xk);
z(:, k) = hk + ek;
qinv(:,:, k) = eye(n) / (sigma_q * sigma_q);
rinv(:,:, k) = eye(n) / (sigma_r * sigma_r);
end
qinv(:,:,1) = eye(n) / (sigma_e * sigma_e);
% ----------------------------------------------------------------------
% call back functions
f_fun = @(k,xk) box_f(k,xk,params);
g_fun = @(k,xk) persist_g(k,xk,params);
% ----------------------------------------------------------------------
% call optimizer
[x_out, u_out, info] = ckbs_nonlinear( ...
f_fun, ...
g_fun, ...
h_fun, ...
max_itr, ...
epsilon, ...
x_in, ...
z, ...
qinv, ...
rinv, ...
level ...
);
% ----------------------------------------------------------------------
ok = size(info, 1) <= max_itr;
%
% Set up affine problem corresponding to x_out (for a change in x).
% Check constraint feasibility and complementarity.
f_out = zeros(ell, N);
g_out = zeros(n, N);
h_out = zeros(m, N);
df_out = zeros(ell, n, N);
dg_out = zeros(n, n, N);
dh_out = zeros(m, n, N);
xk1 = zeros(n, 1);
for k = 1 : N
xk = x_out(:, k);
uk = u_out(:, k);
[fk, Fk] = f_fun(k, xk);
[gk, Gk] = g_fun(k, xk1);
[hk, Hk] = h_fun(k, xk);
%
ok = ok & all( fk <= epsilon); % feasibility
ok = ok & all( abs(fk) .* uk <= epsilon ); % complementarity
%
f_out(:, k) = fk;
g_out(:, k) = gk - xk;
h_out(:, k) = hk;
df_out(:,:, k) = Fk;
dg_out(:,:, k) = Gk;
dh_out(:,:, k) = Hk;
xk1 = xk;
end
ok = ok & all( all( u_out >= 0 ) );
%
% Compute gradient of unconstrained affine problem at dx equal to zero
% and then check the gradient of the Lagrangian
dx_out = zeros(n, N);
grad = ckbs_sumsq_grad(dx_out, z, g_out, h_out, dg_out, dh_out, qinv, rinv);
for k = 1 : N
uk = u_out(:, k);
Fk = df_out(:,:, k);
dk = grad(:, k);
%
ok = ok & all( abs(Fk' * uk + dk) <= epsilon );
end
% ----------------------------------------------------------------------
if draw_plot
figure(1);
clf
hold on
time = 1 : N;
plot(time, x_true(1,:), 'b-');
plot(time, x_out(1,:), 'g-');
plot(time, z(1,:), 'ko');
plot(time, 0.5*ones(1,N), 'r-');
plot(time, 1.5*ones(1,N), 'r-');
title('Position: blue=truth, green=estimate, red=constraint, o=data');
axis( [ 0 , N , -1, 3 ] )
hold off
return
end
[ok] = sine_wave_ok(constraint, draw_plot)
value | meaning |
no_constraint | solve unconstrained problem |
box_constraint | use box constraints |
sine_constraint | use a sine wave constraint |
If
draw_plot
is false, no plot is drawn.
If this argument is true, a plot is drawn showing the results.
x1 (t) = 1
| first component of velocity |
x2 (t) = t
| first component of position |
x3 (t) = \cos(t)
| second component of velocity |
x4 (t) = \sin(t)
| second component of position |
\[
\begin{array}{rcl}
x_{1,k} & = & x_{1,k-1} + x_{2,k-1} \Delta t + w_{1,k}
\\
x_{2,k} & = & x_{2,k-1} + w_{2,k}
\\
x_{3,k} & = & x_{3,k-1} + x_{4,k-1} \Delta t + w_{3,k}
\\
x_{4,k} & = & x_{4,k-1} + w_{4,k}
\end{array}
\]
where
\Delta t = 2 \pi / N
is the time between measurements,
N = 50
is the number of measurement times,
w_k \sim \B{N}( 0 , Q_k )
,
and the transition variance is given by
\[
Q_k = \left( \begin{array}{cccc}
\Delta t & \Delta t^2 / 2 & 0 & 0 \\
\Delta t^2 / 2 & \Delta t^3 / 3 & 0 & 0 \\
0 & 0 & \Delta t & \Delta t^2 / 2 \\
0 & 0 & \Delta t^2 / 2 & \Delta t^3 / 3
\end{array} \right)
\]
The routine 4.4.6: pos_vel_g.m
calculates the mean for
x_k
given
x_{k-1}
.
\hat{x} = g_1 ( x_0 )
as the simulated state value at the
first measurement point; i.e.,
\[
\hat{x} = [ 1 , \Delta t , \cos ( \Delta t ) , \sin ( \Delta t ) ]^\R{T}
\]
The variance of the initial state estimate is
Q_1 = 10^4 I_4
where
I_4
is the four by four
identity matrix.
a = (0, -1.5)^\R{T}
and
b = ( 2 \pi , -1.5)
; i.e.,
\[
\begin{array}{rcl}
z_{1,k} & = & \sqrt{ ( x_{2,k} - a_1 )^2 + ( x_{4,k} - a_2 )^2 } + v_{1,k}
\\
z_{2,k} & = & \sqrt{ ( x_{2,k} - b_1 )^2 + ( x_{4,k} - b_2 )^2 } + v_{2,k}
\end{array}
\]
where
v_k \sim \B{N} ( 0 , R_k )
and
the measurement variance
R_k = \R{diag}( .25 , .25 )
.
The routine 4.4.3: distance_h.m
calculates the mean for
z_k
,
given the value of
x_k
.
\hat{k}
is the time index where
x_{4,k}
is maximal,
\[
\begin{array}{rcl}
z_{1,k} & = &
\sqrt{ ( x_{2,k} - a_1 )^2 + ( x_{4,k} + .5 - a_2 )^2 }
\\
z_{2,k} & = &
\sqrt{ ( x_{2,k} - b_1 )^2 + ( x_{4,k} + .5 - b_2 )^2 }
\end{array}
\]
This increases the probability that the upper constraint will be
active at the optimal solution.
constraint
argument:
constraint == 'no_constraint'
, the function 4.4.4: no_f.m
is used and the unconstrained problem is solved.
constraint == 'box_constraint'
, the function 4.4.1: box_f.m
is used to represent the following constraints:
for
k = 1 , \ldots , N
,
\[
-1 \leq x_{4,k} \leq +1
\]
constraint == 'sine_constraint'
, the function 4.4.7: sine_f.m
is used to represent the following constraints:
for
k = 1 , \ldots , N
,
\[
x_{4,k} \leq \sin( x_{2,k} ) + .1
\]
4.4.6: pos_vel_g.m | ckbs_nonlinear: Example Position and Velocity Transition Model |
4.4.3: distance_h.m | ckbs_nonlinear: Example of Distance Measurement Model |
4.4.4: no_f.m | ckbs_nonlinear: Example of No Constraint |
4.4.1: box_f.m | ckbs_nonlinear: Example of Box Constraints |
4.4.7: sine_f.m | ckbs_nonlinear: Example of Nonlinear Constraint |
function [ok] = sine_wave_ok(constraint, draw_plot)
% --------------------------------------------------------
% Simulation problem parameters
max_itr = 100; % maximum number of iterations
epsilon = 1e-5; % convergence criteria
N = 50; % number of measurement time points
dt = 2 * pi / N; % time between measurement points
sigma_r = .5; % standard deviation of measurement noise
sigma_q = 1; % square root of multiplier for transition variance
sigma_e = 100; % standard deviation of initial state estimate
delta_box = 0.; % distance from truth to box constraint
delta_sine = .1; % distance from truth to sine constraint
h_min = 0; % minimum horizontal value in plots
h_max = 7; % maximum horizontal value in plots
v_min = -1.5; % minimum vertical value in plots
v_max = +1.5; % maximum vertical value in plots
a = [ 0 ; -1.5]; % position that range one is measured from
b = [ 2*pi ; -1.5]; % position that range two is measured from
randn('seed', 1234); % Random number generator seed
% ---------------------------------------------------------
% level of tracing during the optimization
if draw_plot
level = 1;
else
level = 0;
end
% ---------------------------------------------------------
% simulate the true trajectory and measurement noise
t = (1 : N) * dt; % time
x_true = [ ones(1, N) ; t ; cos(t) ; sin(t) ];
initial = x_true(:,1);
% ---------------------------------------------------------
% information used by box_f
params.box_f_lower = - 1 - delta_box;
params.box_f_upper = + 1 + delta_box;
params.box_f_index = 4;
% ---------------------------------------------------------
% information used by sine_f
params.sine_f_index = [ 2 ; 4 ];
params.sine_f_offset = [ 0 ; delta_sine ];
% ---------------------------------------------------------
% information used by pos_vel_g
params.pos_vel_g_dt = dt;
params.pos_vel_g_initial = initial;
g_fun = @(k,x) pos_vel_g(k,x,params);
% ---------------------------------------------------------
% information used by distance_h
params.distance_h_index = [ 2 ; 4 ];
params.distance_h_position = [a , b ];
h_fun = @(k,x) distance_h(k,x, params);
% ---------------------------------------------------------
% problem dimensions
m = 2; % number of measurements per time point
n = 4; % number of state vector components per time point
ell = 0; % number of constraints (reset to proper value below)
% ---------------------------------------------------------
% simulate the measurements
[c,k_max] = max(x_true(4,:)); % index where second position component is max
rinv = zeros(m, m, N); % inverse covariance of measurement noise
z = zeros(m, N); % measurement values
randn('state',sum(100*clock)); % random input
for k = 1 : N
x_k = x_true(:, k);
if k == k_max
% special case to increase chance that upper constraint is active
x_k(2) = x_k(2) + .5;
h_k = h_fun(k, x_k);
z(:, k) = h_k;
else
% simulation according to measurement model
h_k = h_fun(k, x_k);
z(:, k) = h_k + sigma_r * randn(2, 1);
end
rinv(:,:, k) = eye(m) / (sigma_r * sigma_r);
end
%fid = fopen('z4.dat', 'wt');
%fprintf(fid, '%6.6f %6.6f\n', z);
z = load('z3.dat')'; % use saved measurement data for reproducibility
% ---------------------------------------------------------
% inverse transition variance
qk = sigma_q * sigma_q * [ ...
dt , dt^2 / 2 , 0 , 0
dt^2 / 2 , dt^3 / 3 , 0 , 0
0 , 0 , dt , dt^2 / 2
0 , 0 , dt^2 / 2 , dt^3 / 3
];
qinvk = inv(qk);
qinv = zeros(n, n, N);
for k = 2 : N
qinv(:,:, k) = qinvk;
end
%
% inverse initial estimate covariance
qinv(:,:,1) = eye(n) / (sigma_e * sigma_e);
% ----------------------------------------------------------
% initialize x vector at position and velocity zero
x_in = zeros(n, N);
% ----------------------------------------------------------------------
%
is_no_f = false;
is_box_f = false;
is_sine_f = false;
switch constraint
case {'no_constraint'}
ell = 1;
f_fun = @no_f;
is_no_f = true;
case {'box_constraint'}
ell = 2;
f_fun = @(k,xk) box_f(k,xk,params);
is_box_f = true;
case {'sine_constraint'}
ell = 1;
f_fun = @(k,xk) sine_f(k,xk,params);
is_sine_f = true;
otherwise
constraint = constraint
error('sine_wave_ok: invalid value for constraint')
end
% ----------------------------------------------------------------------
[x_out, u_out, info] = ckbs_nonlinear( ...
f_fun, ...
g_fun, ...
h_fun, ...
max_itr, ...
epsilon, ...
x_in, ...
z, ...
qinv, ...
rinv, ...
level ...
);
% ----------------------------------------------------------------------
ok = true;
ok = ok & (size(info,1) <= max_itr);
%
% Set up affine problem (for change in x) corresponding to x_out.
% Check constraint feasibility and complementarity.
f_out = zeros(ell, N);
g_out = zeros(n, N);
h_out = zeros(m, N);
df_out = zeros(ell, n, N);
dg_out = zeros(n, n, N);
dh_out = zeros(m, n, N);
xk1 = zeros(n, 1);
for k = 1 : N
xk = x_out(:, k);
uk = u_out(:, k);
[fk, Fk] = f_fun(k, xk);
[gk, Gk] = g_fun(k, xk1);
[hk, Hk] = h_fun(k, xk);
%
ok = ok & all( fk <= epsilon ); % feasibility
ok = ok & all( abs(fk) .* uk <= epsilon ); % complementarity
%
f_out(:, k) = fk;
g_out(:, k) = gk - xk;
h_out(:, k) = hk;
df_out(:,:, k) = Fk;
dg_out(:,:, k) = Gk;
dh_out(:,:, k) = Hk;
%
xk1 = xk;
end
ok = ok & all( all( u_out >= 0 ) );
%
% Compute gradient of unconstrained affine problem at zero
% and check gradient of Lagrangian
dx_out = zeros(n, N);
d_out = ckbs_sumsq_grad(dx_out, z, g_out, h_out, dg_out, dh_out, qinv, rinv);
for k = 1 : N
uk = u_out(:, k);
Fk = df_out(:,:, k);
dk = d_out(:, k);
%
ok = ok & (max ( abs( Fk' * uk + dk ) ) <= 1e2*epsilon);
end
%
if draw_plot
figure(1)
clf
hold on
plot( x_true(2,:) , x_true(4,:) , 'b-' );
plot( x_out(2,:) , x_out(4,:) , 'g-' );
if is_no_f
title('blue=truth, green=estimate, no constraint')
end
if is_box_f
title('blue=truth, green=estimate, red=linear constraint')
plot( t , + ones(1, N) + delta_box , 'r-');
plot( t , - ones(1, N) - delta_box , 'r-');
end
if is_sine_f
title('blue=truth, green=estimate, red=nonlinear constraint')
plot( x_out(2,:) , sin( x_out(2,:) ) + delta_sine , 'r-');
end
axis( [ h_min, h_max, v_min, v_max] );
return
end
[ok] = vanderpol_ok(draw_plot)
If
draw_plot
is not present or false, no plot is drawn.
If
ok
is true, the test passed;
otherwise the test failed.
We use
x1 (t)
and
x2 (t)
to denote the oscillator position and velocity as functions of time.
The deterministic ordinary differential equation for the
Van der Pol oscillator is
\[
\begin{array}{rcl}
x1 '(t) & = & x2 (t)
\\
x2 '(t) & = & \mu [ 1 - x1(t)^2 ] x2 (t) - x1(t)
\end{array}
\]
The simulated state vector values are the solution to this ODE
with nonlinear parameter
\mu = 2
and initial conditions
x1 (0) = 0
,
x2 (0) = -5
.
The routine 4.3.1: vanderpol_sim
calculates these values.
The time between measurement points is
\Delta t = .1
. In other words,
given
x_{k-1}
, the value of the state at time
t_{k-1}
,
we model
x_k
, the state value at time
t_k = t_{k-1} + \Delta t
,
by
\[
\begin{array}{rcl}
x_{1,k} & = & x_{1,k-1} + x_{2,k-1} \Delta t + w_{1,k}
\\
x_{2,k} & = & x_{2,k-1} +
[ \mu ( 1 - x_{1,k-1}^2 ) x_{2,k-1} - x_{1,k-1} ] \Delta t
+ w_{2,k}
\end{array}
\]
where
w_k \sim \B{N}( 0 , Q_k )
and the variance
Q_k = \R{diag} ( 0.01, 0.01 )
.
The routine 4.3.2: vanderpol_g.m
calculates the mean for
x_k
given
x_{k-1}
.
Note that the initial state estimate is calculated by this routine as
\hat{x} = g_1 ( x_0 )
with
\hat{x}_1 = 0
,
\hat{x}_2 = -5
,
and corresponding variance
Q_1 = \R{diag} ( 100. , 100. )
.
\[
\begin{array}{rcl}
z_{1,k} & = & x_{1,k} + v_{1,k}
\end{array}
\]
where
v_k \sim \B{N}( 0 , R_k )
and
the measurement variance
R_k = 1
.
The routine 4.4.2: direct_h.m
calculates the mean for
z_k
,
given the value of
x_k
; i.e.
x_{1,k}
.
\[
f_k ( x_k ) = - 1
\]
The routine 4.4.4: no_f.m
is used for these calculations.
4.3.1: vanderpol_sim | Van der Pol Oscillator Simulation (No Noise) |
4.3.2: vanderpol_g.m | ckbs_nonlinear: Vanderpol Transition Model Mean Example |
4.4.2: direct_h.m | ckbs_nonlinear: Example Direct Measurement Model |
4.4.4: no_f.m | ckbs_nonlinear: Example of No Constraint |
function [ok] = vanderpol_ok(draw_plot)
if nargin < 1
draw_plot = false;
end
% ---------------------------------------------------------------
% Simulation problem parameters
max_itr = 20; % Maximum number of optimizer iterations
epsilon = 1e-4; % Convergence criteria
N = 41; % Number of measurement points (N > 1)
T = 4.; % Total time
ell = 1; % Number of constraints (never active)
n = 2; % Number of components in the state vector (n = 2)
m = 1; % Number of measurements at each time point (m <= n)
xi = [ 0. , -5.]'; % True initial value for the state at time zero
x_in = zeros(n, N); % Initial sequence where optimization begins
initial = [ 0. , -5.]'; % Estimate of initial state (at time index one)
mu = 2.0; % ODE nonlinearity parameter
sigma_r = 1.; % Standard deviation of measurement noise
sigma_q = .1; % Standard deviation of the transition noise
sigma_e = 10; % Standard deviation of initial state estimate
randn('seed', 4321); % Random number generator seed
% -------------------------------------------------------------------
% Level of tracing during optimization
if draw_plot
level = 1;
else
level = 0;
end
% spacing between time points
dt = T / (N - 1);
% -------------------------------------------------------------------
% information used by vanderpol_g
params.vanderpol_g_initial = initial;
params.vanderpol_g_dt = dt;
params.vanderpol_g_mu = mu;
% -------------------------------------------------------------------
% information used by direct_h
params.direct_h_index = 1;
% ---------------------------------------------------------------
% Rest of the information required by ckbs_nonlinear
%
% Step size for fourth order Runge-Kutta method in vanderpol_sim
% is the same as time between measurement points
step = dt;
% Simulate the true values for the state vector
x_true = vanderpol_sim(mu, xi, N, step);
time = linspace(0., T, N);
% Simulate the measurement values
z = x_true(1:m,:) + sigma_r * randn(m, N);
% Inverse covariance of the measurement and transition noise
rinv = zeros(m, m, N);
qinv = zeros(n, n, N);
for k = 1 : N
rinv(:, :, k) = eye(m, m) / sigma_r^2;
qinv(:, :, k) = eye(n, n) / sigma_q^2;
end
qinv(:, :, 1) = eye(n, n) / sigma_e^2;
% ---------------------------------------------------------------
% call back functions
f_fun = @ no_f;
g_fun = @ (k,x) vanderpol_g(k,x,params);
h_fun = @(k,x) direct_h(k,x,params);
% ---------------------------------------------------------------
% call the optimizer
[ x_out , u_out, info ] = ckbs_nonlinear( ...
f_fun , ...
g_fun , ...
h_fun , ...
max_itr , ...
epsilon , ...
x_in , ...
z , ...
qinv , ...
rinv , ...
level ...
);
% ----------------------------------------------------------------------
% Check that x_out is optimal
% (this is an unconstrained case, so there is no need to check f).
ok = size(info, 1) <= max_itr;
g_out = zeros(n, N);
h_out = zeros(m, N);
dg_out = zeros(n, n, N);
dh_out = zeros(m, n, N);
xk = zeros(n, 1);
for k = 1 : N
xkm = xk;
%
xk = x_out(:, k);
uk = u_out(:, k);
[gk, Gk] = g_fun(k, xkm);
[hk, Hk] = h_fun(k, xk);
%
g_out(:, k) = gk - xk;
h_out(:, k) = hk;
dg_out(:,:, k) = Gk;
dh_out(:,:, k) = Hk;
end
dx = zeros(n, N);
grad = ckbs_sumsq_grad(dx, z, g_out, h_out, dg_out, dh_out, qinv, rinv);
ok   = ok & ( max( max( abs(grad) ) ) < epsilon );
% ----------------------------------------------------------------------
if draw_plot
figure(1);
clf
hold on
plot(time, x_true(1,:), 'b-');
plot(time, x_out(1,:), 'g-');
plot(time, z(1,:), 'ko');
title('Position: blue=truth, green=estimate, o=data');
hold off
return
end
end
[x_out] = vanderpol_sim(mu, xi, n_out, step)
We use
x_1 (t)
and
x_2 (t)
to denote the oscillator position and velocity as functions of time.
The Van der Pol oscillator with no noise satisfies the ordinary
differential equation
\[
\begin{array}{rcl}
x_1 '(t) & = & x_2 (t)
\\
x_2 '(t) & = & \mu [ 1 - x_1(t)^2 ] x_2 (t) - x_1(t)
\end{array}
\]
The argument
mu
specifies the value of
\mu
in the differential equation above.
The argument
xi
specifies the initial value of the solution
x(t) \in \B{R}^2
.
To be specific,
x_1 (0)
is equal to
xi(1)
and
x_2(0)
is equal to
xi(2)
.
The result
x_out
is a two dimensional array with
n_out
columns that
contains the approximate solution to the ODE.
To be specific, for
k = 1 , ... , n_out
,
x_out(i,k)
is an approximation for
x_i [ (k-1) \Delta t ]
.
The argument
step
specifies the step size
\Delta t
used
to approximate the solution of the ODE.
vanderpol_sim
.
It returns true if the test passes, and false otherwise.
function [ok] = vanderpol_sim_ok()
mu = 1.;
xi = [ 2 ; 0 ];
n_out = 21;
step = .1;
x_out = vanderpol_sim(mu, xi, n_out, step);
t_sol = [ 0.0 , 0.5 , 1.0 , 1.5 , 2.0 ];
x1_sol = [ 2.0000 , 1.8377 , 1.5081 , 1.0409, 0.3233 ];
ok = 1;
for k = 1 : 5
ok = ok & abs( x_out(1, (k-1)*5 + 1) - x1_sol(k) ) < 1e-4;
end
return
end
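The classical fourth order Runge-Kutta scheme that vanderpol_sim implements can be sketched as follows. This is a minimal NumPy translation for illustration only; the function and variable names mirror the MATLAB routine but are not part of the package:

```python
import numpy as np

def vanderpol_rhs(x, mu):
    # Right hand side of the Van der Pol ODE:
    # x1'(t) = x2(t),  x2'(t) = mu*(1 - x1^2)*x2 - x1
    return np.array([x[1], mu * (1.0 - x[0] ** 2) * x[1] - x[0]])

def vanderpol_sim(mu, xi, n_out, step):
    # Classical 4th order Runge-Kutta; column k holds x((k-1)*step)
    x_out = np.zeros((2, n_out))
    x_out[:, 0] = xi
    for k in range(1, n_out):
        x = x_out[:, k - 1]
        k1 = vanderpol_rhs(x, mu)
        k2 = vanderpol_rhs(x + 0.5 * step * k1, mu)
        k3 = vanderpol_rhs(x + 0.5 * step * k2, mu)
        k4 = vanderpol_rhs(x + step * k3, mu)
        x_out[:, k] = x + (step / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return x_out
```

With mu = 1, xi = (2, 0), and step = .1, this sketch reproduces the tabulated x1_sol values used by the test above.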
[gk] = vanderpol_g(k, xk1, params)
[gk, Gk] = vanderpol_g(k, xk1, params)
initial = params.vanderpol_g_initial
dt = params.vanderpol_g_dt
mu = params.vanderpol_g_mu
The result
gk
is the mean of the state at time index
k
given that
xk1
is its value at time index
k-1
.
This mean is Euler's discrete approximation for the solution of the
Van Der Pol oscillator ODE; i.e.,
\[
\begin{array}{rcl}
x_1 '(t) & = & x_2 (t)
\\
x_2 '(t) & = & \mu [ 1 - x_1(t)^2 ] x_2 (t) - x_1(t)
\end{array}
\]
where the initial state is
xk1
and the time step is
dt
.
The argument
dt
specifies the time step
(\Delta t
below).
The argument
mu
specifies the value of
\mu
in the Van Der Pol ODE.
The argument
xk1
specifies the state value at time index
k-1
.
If
k == 1
,
the return value
gk
is a two element column vector
equal to
initial
.
Otherwise,
gk
is the two element column vector given by
\[
\begin{array}{rcl}
g_{1,k} & = & x_{1,k-1} + x_{2,k-1} \Delta t
\\
g_{2,k} & = & x_{2,k-1}
+ [ \mu ( 1 - x_{1,k-1}^2 ) x_{2,k-1} - x_{1,k-1} ] \Delta t
\end{array}
\]
where
( g_{1,k} , g_{2,k} )^\R{T}
is
gk
and
( x_{1,k-1} , x_{2,k-1} )^\R{T}
is
xk1
.
Gk
is an
n x n
matrix equal to the
Jacobian of
gk
w.r.t
xk1
.
function [gk, Gk] = vanderpol_g(k, xk1, params)
initial = params.vanderpol_g_initial;
dt = params.vanderpol_g_dt;
mu = params.vanderpol_g_mu;
n = 2;
if (size(xk1,1) ~= n) | (size(xk1,2) ~= 1)
size_xk1_1 = size(xk1, 1)
size_xk1_2 = size(xk1, 2)
error('vanderpol_g: xk1 not a column vector with two elements')
end
if (size(initial,1) ~= n) | (size(initial,2) ~= 1)
size_initial_1 = size(initial, 1)
size_initial_2 = size(initial, 2)
error('vanderpol_g: initial not a column vector with two elements')
end
if (size(dt,1)*size(dt,2) ~= 1) | (size(mu,1)*size(mu,2) ~= 1)
size_dt_1 = size(dt, 1)
size_dt_2 = size(dt, 2)
size_mu_1 = size(mu, 1)
size_mu_2 = size(mu, 2)
error('vanderpol_g: dt or mu is not a scalar')
end
if k == 1
gk = initial;
Gk = zeros(n, n);
else
gk = zeros(n, 1);
gk(1) = xk1(1) + xk1(2) * dt;
gk(2) = xk1(2) + (mu * (1 - xk1(1)*xk1(1)) * xk1(2) - xk1(1)) * dt;
%
Gk = zeros(n, n);
Gk(1,1) = 1;
Gk(1,2) = dt;
Gk(2,1) = (mu * (- 2 * xk1(1)) * xk1(2) - 1) * dt;
Gk(2,2) = 1 + mu * (1 - xk1(1)*xk1(1)) * dt;
end
return
end
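The analytic Jacobian Gk computed above can be checked against central finite differences. Here is a minimal NumPy sketch of the k > 1 branch (illustrative only; function names and the test point are assumptions, not part of the package):

```python
import numpy as np

def vanderpol_g(xk1, dt, mu):
    # Euler step mean and its analytic Jacobian (k > 1 branch)
    x1, x2 = xk1
    gk = np.array([x1 + x2 * dt,
                   x2 + (mu * (1.0 - x1 ** 2) * x2 - x1) * dt])
    Gk = np.array([[1.0, dt],
                   [(-2.0 * mu * x1 * x2 - 1.0) * dt,
                    1.0 + mu * (1.0 - x1 ** 2) * dt]])
    return gk, Gk

def jacobian_fd(fun, x, h=1e-6):
    # Central finite-difference Jacobian of fun at x
    n = x.size
    cols = []
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        cols.append((fun(x + e) - fun(x - e)) / (2.0 * h))
    return np.column_stack(cols)
```

Agreement between the analytic and finite-difference Jacobians is a quick sanity check when writing a new g_fun for ckbs_nonlinear.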
box_f.m: 4.4.1 | ckbs_nonlinear: Example of Box Constraints |
direct_h.m: 4.4.2 | ckbs_nonlinear: Example Direct Measurement Model |
distance_h.m: 4.4.3 | ckbs_nonlinear: Example of Distance Measurement Model |
no_f.m: 4.4.4 | ckbs_nonlinear: Example of No Constraint |
persist_g.m: 4.4.5 | ckbs_nonlinear: Example of Persistence Transition Function |
pos_vel_g.m: 4.4.6 | ckbs_nonlinear: Example Position and Velocity Transition Model |
sine_f.m: 4.4.7 | ckbs_nonlinear: Example of Nonlinear Constraint |
[fk] = box_f(k, xk, params)
[fk, Fk] = box_f(k, xk, params)
lower = params.box_f_lower
upper = params.box_f_upper
index = params.box_f_index
ell = 2 * size(index, 1)
This routine implements box constraints on the state vector at time index
k
.
To be specific, for
p = 1 , ... , ell / 2
,
lower(p) <= xk( index(p) ) <= upper(p)
The argument
xk
is a column vector of length
n
specifying a value for
the state vector at the current time index.
The argument
index
is a column vector with
ell / 2
elements
specifying the state vector indices for which
there is a box constraint.
Each such index must be between one and
n
.
The argument
lower
is a column vector with
ell / 2
elements
specifying the lower limits for the box constraints.
The argument
upper
is a column vector with
ell / 2
elements
specifying the upper limits for the box constraints.
fk
is a column vector of length
ell
such that the condition
all( fk <= 0 )
is equivalent to
the constraints in the 4.4.1.c: purpose
above.
Fk
is an
ell x n
matrix equal to the
Jacobian of
fk
w.r.t
xk
.
function [fk, Fk] = box_f(k, xk, params)
index = params.box_f_index;
lower = params.box_f_lower;
upper = params.box_f_upper;
ell = 2 * size(lower, 1);
n = size(xk, 1);
%
if (size(lower, 2) ~= 1) | (size(upper, 2) ~= 1) | ...
(size(xk, 2) ~= 1)
size_lower_2 = size(lower, 2)
size_upper_2 = size(upper, 2)
size_index_2 = size(index, 2)
size_xk_2 = size(xk, 2)
error('box_f: either lower, upper, index or xk not a column vector')
end
if (2 * size(upper, 1) ~= ell) | (2 * size(index, 1) ~= ell)
size_lower_1 = size(lower, 1)
size_upper_1 = size(upper, 1)
size_index_1 = size(index, 1)
error('box_f: lower, upper, or index has a different size')
end
if (max(index) > n) | (min(index) < 1)
max_index = max(index)
min_index = min(index)
error('box_f: max(index) > size(xk, 1) or min(index) < 1')
end
%
fk = zeros(ell, 1);
Fk = zeros(ell, n);
for p = 1 : (ell / 2)
j = index(p);
fk(2 * p - 1, 1) = xk(j) - upper(p);
Fk(2 * p - 1, j) = +1.;
%
fk(2 * p, 1) = lower(p) - xk(j);
Fk(2 * p, j) = -1.;
end
return
end
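The encoding used by box_f, where each two-sided bound lower(p) <= xk(j) <= upper(p) becomes the pair of inequalities xk(j) - upper(p) <= 0 and lower(p) - xk(j) <= 0, can be sketched in NumPy as follows (illustrative; indices are 0-based here, unlike the MATLAB routine):

```python
import numpy as np

def box_f(xk, index, lower, upper):
    # Rewrite lower[p] <= xk[j] <= upper[p] as two rows of f(x) <= 0,
    # matching the stacking used by box_f.m (0-based indices)
    ell, n = 2 * len(index), len(xk)
    fk = np.zeros(ell)
    Fk = np.zeros((ell, n))
    for p, j in enumerate(index):
        fk[2 * p] = xk[j] - upper[p]       # xk[j] <= upper[p]
        Fk[2 * p, j] = +1.0
        fk[2 * p + 1] = lower[p] - xk[j]   # lower[p] <= xk[j]
        Fk[2 * p + 1, j] = -1.0
    return fk, Fk
```

A state value inside the box yields all(fk <= 0), and a violated bound yields a positive component of fk.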
[hk] = direct_h(k, xk, params)
[hk, Hk] = direct_h(k, xk, params)
index = params.direct_h_index
m = size(index, 1)
n = size(xk, 1)
For
i = 1 , ... , m
,
the mean of the i-th measurement given
xk
(the state at time index
k
) is
xk( index(i) )
The argument
xk
is a column vector of length
n
specifying a value for
the state vector at the current time index.
The argument
index
specifies which components of
xk
correspond to the direct measurements.
Each element of
index
must be between one and
n
.
hk
is a column vector of length
m
with i-th element equal to the mean of the
i-th measurement given
xk
.
Hk
is a
m x n
matrix equal to the
Jacobian of
hk
w.r.t
xk
.
function [hk, Hk] = direct_h(k, xk, params)
index = params.direct_h_index;
m = size(index, 1);
n = size(xk, 1);
if (size(xk, 2)~=1) | (size(index,2)~=1)
size_xk_2 = size(xk, 2)
size_index_2 = size(index, 2)
error('direct_h: xk or index is not a column vector')
end
if (max(index) > n) | (min(index) < 1)
max_index = max(index)
min_index = min(index)
error('direct_h: max(index) > size(xk, 1) or min(index) < 1')
end
hk = xk(index);
Hk = zeros(m, n);
for i = 1 : m
Hk( i, index(i) ) = 1;
end
return
end
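Since direct_h simply selects components of the state, its Jacobian is a 0-1 selection matrix and hk = Hk * xk holds exactly. A minimal NumPy sketch (illustrative; 0-based indices):

```python
import numpy as np

def direct_h(xk, index):
    # Direct measurement mean and Jacobian: each row of Hk is a unit
    # vector picking out one state component, so hk = Hk @ xk exactly
    m, n = len(index), len(xk)
    hk = xk[index]
    Hk = np.zeros((m, n))
    Hk[np.arange(m), index] = 1.0
    return hk, Hk
```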
[hk] = distance_h(k, xk, params)
[hk, Hk] = distance_h(k, xk, params)
index = params.distance_h_index
position = params.distance_h_position
n = size(xk, 1)
m = size(position, 2)
For
i = 1 , ... , m
,
the mean of the i-th measurement given
xk
(the state at time index
k
) is
the distance from
xk
to
position(:,i)
; i.e.
norm( xk(index) - position(:, i) )
The argument
index
specifies which components
of
xk
correspond to the current moving position to which the distance
measurements are made.
It must have the same number of rows as the matrix
position
.
The argument
position
is a matrix with
m
columns.
Each column specifies a fixed location from which a
distance measurement is made.
The argument
xk
is a column vector of length
n
specifying a value for
the state vector at the current time index.
hk
is a column vector of length
m
with i-th element equal to the mean of the
distance from
position(:,i)
to
xk(index)
.
Hk
is a
m x n
matrix equal to the
Jacobian of
hk
w.r.t
xk
.
(In the special case where one of the distances is zero,
the corresponding row of
Hk
is returned as zero.)
function [hk, Hk] = distance_h(k, xk, params)
index = params.distance_h_index;
position = params.distance_h_position;
n = size(xk, 1);
m = size(position, 2);
if (size(xk, 2)~=1) | (size(index,2)~=1)
size_xk_2 = size(xk, 2)
size_index_2 = size(index, 2)
error('distance_h: xk or index is not a column vector')
end
if size(position,1) ~= size(index,1)
size_position_1 = size(position, 1)
size_index_1 = size(index, 1)
error('distance_h: position and index have different number of rows')
end
hk = zeros(m, 1);
Hk = zeros(m, n);
p_x = xk(index);
for i = 1 : m
p_i = position(:, i);
hk(i) = norm( p_x - p_i );
if hk(i) ~= 0
Hk(i, index) = (p_x - p_i) / hk(i);
end
end
return
end
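The Jacobian rows computed by distance_h follow from the gradient of the Euclidean norm: the derivative of ||p - q|| with respect to p is (p - q) / ||p - q||. A NumPy sketch (illustrative; 0-based indices, names are assumptions):

```python
import numpy as np

def distance_h(xk, index, position):
    # Distance from the moving point xk[index] to each fixed position,
    # with Jacobian rows (p_x - p_i) / ||p_x - p_i||
    m, n = position.shape[1], len(xk)
    hk = np.zeros(m)
    Hk = np.zeros((m, n))
    p_x = xk[index]
    for i in range(m):
        diff = p_x - position[:, i]
        hk[i] = np.linalg.norm(diff)
        if hk[i] != 0.0:          # zero distance: leave the row zero
            Hk[i, index] = diff / hk[i]
    return hk, Hk
```

For example, a point at (3, 4) measured from the origin gives a distance of 5 with gradient (0.6, 0.8).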
[fk] = no_f(k, xk)
[fk, Fk] = no_f(k, xk)
\[
f_k ( x_k ) \equiv -1
\]
so that the condition
f_k ( x_k ) \leq 0
is satisfied for all
x_k
.
xk
specifies the current state vector
(only used to determine the size of the state vector at each time point).
fk
is a scalar equal to minus one.
Fk
is a row vector equal to the Jacobian of
fk
w.r.t
xk
; i.e. zero.
function [fk, Fk] = no_f(k, xk)
% no constraint case: f(x) = -1 for all x
n = size(xk, 1);
fk = -1;
Fk = zeros(1, n);
return
end
[gk] = persist_g(k, xk1, params)
[gk, Gk] = persist_g(k, xk1, params)
initial = params.persist_g_initial
n = size(xk1, 1)
The mean of the state at time index
k
given the state at time index
k-1
is its value at time index
k-1
.
(This corresponds to a random walk model.)
The argument
initial
is a column vector of length
n
specifying the initial estimate
for the state vector at time index one.
The argument
xk1
specifies the state value at time index
k-1
.
k == 1
,
the return value
gk
is equal to
initial
.
Otherwise,
gk
is equal to
xk1
.
Gk
is an
n x n
matrix equal to the
Jacobian of
gk
w.r.t
xk1
; i.e., the identity matrix.
function [gk, Gk] = persist_g(k, xk1, params)
initial = params.persist_g_initial;
n = size(xk1, 1);
if k == 1
gk = initial;
Gk = zeros(n, n);
else
gk = xk1;
Gk = eye(n);
end
return
end
[gk] = pos_vel_g(k, xk1, params)
[gk, Gk] = pos_vel_g(k, xk1, params)
initial = params.pos_vel_g_initial
dt = params.pos_vel_g_dt
n = size(xk1, 1)
v(j) = xk1(2 * j - 1)
p(j) = xk1(2 * j)
j = 1 , ... , n / 2
,
p(j)
is a position estimate at time index
k-1
, and
v(j)
is the corresponding velocity estimate at time index
k-1
.
The mean for the corresponding position at time index
k
,
given
xk1
, is
p(j) + v(j) * dt
.
The mean for the corresponding velocity at time index
k
,
given
xk1
, is
v(j)
.
The argument
initial
is a column vector of length
n
specifying the initial estimate
for the state vector at time index one.
The argument
xk1
specifies the state value at time index
k-1
.
k == 1
,
the return value
gk
is equal to
initial
.
Otherwise,
gk
is a column vector of length
n
equal to the mean for the state at time index
k
given the
value for the state at time index
k-1
; i.e.,
xk1
.
Gk
is an
n x n
matrix equal to the
Jacobian of
gk
w.r.t
xk1
.
function [gk, Gk] = pos_vel_g(k, xk1, params)
dt = params.pos_vel_g_dt;
initial = params.pos_vel_g_initial;
n = size(xk1, 1);
%
if (size(xk1, 2)~=1) | (size(initial, 1)~=n) | (size(initial, 2)~=1)
size_xk1_1 = size(xk1, 1)
size_initial_1 = size(initial, 1)
size_initial_2 = size(initial, 2)
error('pos_vel_g: initial or xk1 not column vectors with same size')
end
%
Gk = zeros(n, n);
if k == 1
gk = initial;
return;
end
% gk(2*j-1) = xk1(2*j-1)
Gk = eye(n);
for j2 = 2 : 2 : n
% gk(2*j) = xk1(2*j) + xk1(2*j-1) * dt
Gk(j2, j2-1) = dt;
end
gk = Gk * xk1;
return
end
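Because the constant-velocity model above is linear for k > 1, the mean is gk = Gk * xk1 where Gk is the identity plus dt in each (position, velocity) entry. A NumPy sketch of that branch (illustrative; 0-based indices, so velocities sit at even offsets and positions at odd offsets):

```python
import numpy as np

def pos_vel_g(xk1, dt):
    # Constant-velocity transition for the k > 1 case: each velocity
    # persists and each position integrates its velocity, gk = Gk @ xk1
    n = len(xk1)
    Gk = np.eye(n)
    for j2 in range(1, n, 2):   # position components (MATLAB indices 2, 4, ...)
        Gk[j2, j2 - 1] = dt     # position += velocity * dt
    return Gk @ xk1, Gk
```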
[fk] = sine_f(k, xk, params)
[fk, Fk] = sine_f(k, xk, params)
index = params.sine_f_index
offset = params.sine_f_offset
xk( index(2) ) <= sin( xk( index(1) + offset(1) ) ) + offset(2)
The argument
xk
is a column vector of length
n
specifying a value for
the state vector at the current time index.
Each element of
index
must be between one and
n
.
fk
is a scalar equal to
xk( index(2) ) - sin( xk( index(1) + offset(1) ) ) - offset(2)
so that the condition
fk <= 0
is equivalent to
the constraints in the purpose above.
Fk
is a row vector with
n
elements
equal to the derivative of
fk
w.r.t
xk
.
function [fk, Fk] = sine_f(k, xk, params)
n = size(xk, 1);
index = params.sine_f_index;
offset = params.sine_f_offset;
if (size(index, 1) ~= 2) | (size(index, 2) ~= 1)
size_index_1 = size(index, 1)
size_index_2 = size(index, 2)
error('sine_f: index is not a column vector with two elements')
end
if (size(offset, 1) ~= 2) | (size(offset, 2) ~= 1)
size_offset_1 = size(offset, 1)
size_offset_2 = size(offset, 2)
error('sine_f: offset is not a column vector with two elements')
end
if (max(index) > n) | (min(index) < 1)
max_index = max(index)
min_index = min(index)
error('sine_f: max(index) > size(xk, 1) or min(index) < 1')
end
fk = xk( index(2) ) - sin( xk( index(1) + offset(1) ) ) - offset(2);
Fk = zeros(1, n);
Fk(index(2)) = 1;
Fk(index(1)) = Fk(index(1)) - cos( xk( index(1) + offset(1) ) );
return
end
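With a zero first offset (offset(1) = 0), the sine constraint and its derivative row reduce to the following NumPy sketch (illustrative; indices are 0-based and the test point is an assumption). A central finite difference confirms the analytic row:

```python
import numpy as np

def sine_f(xk, i1, i2, b):
    # Nonlinear constraint xk[i2] <= sin(xk[i1]) + b written as fk <= 0,
    # with derivative row Fk (zero index offset assumed)
    n = len(xk)
    fk = xk[i2] - np.sin(xk[i1]) - b
    Fk = np.zeros(n)
    Fk[i2] = 1.0
    Fk[i1] -= np.cos(xk[i1])
    return fk, Fk
```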
[xOut, rOut, sOut, pPlusOut, pMinusOut, info] = ckbs_L1_nonlinear(
f_fun, g_fun, h_fun, ...
max_itr, epsilon, x_in, z, qinv, rinv, level)
The state sequence
x
is defined by
x = x_1, \ldots, x_N
,
with each
x_i \in \B{R}^n
.
The robust L1 Kalman-Bucy smoother objective is defined by
\[
\begin{array}{rcl}
S ( x ) & = & S^Q ( x ) + S^R ( x )
\\
S^Q ( x ) & = & \frac{1}{2} \sum_{k=1}^N
( x_k - g_k ( x_{k-1} ) )^\R{T} * Q_k^{-1} * ( x_k - g_k ( x_{k-1} ) )
\\
S^R( x ) & = & \sqrt{2} \sum_{k=1}^N \|R_k^{-1/2}( z_k - h_k ( x_k ) )\|_1
\end{array}
\]
where the matrices
R_k
and
Q_k
are
symmetric positive definite and
x_0
is the constant zero.
Note that
g_1
is the initial state estimate
and
Q_1
is the corresponding covariance.
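To make the objective concrete, here is a minimal NumPy evaluation of S(x) = S^Q(x) + S^R(x) for an illustrative scalar random-walk model (g_k(x) = x with g_1 = 0, h_k(x) = x, and Q_k = R_k = 1); the model and numbers are assumptions for illustration, not part of the package:

```python
import math
import numpy as np

def l1_objective(x, z):
    # S^Q: quadratic process term with g_k(x_{k-1}) = x_{k-1}, x_0 = 0
    resid_q = np.diff(np.concatenate(([0.0], x)))
    s_q = 0.5 * np.sum(resid_q ** 2)
    # S^R: robust L1 measurement term with h_k(x_k) = x_k, R_k = 1
    s_r = math.sqrt(2.0) * np.sum(np.abs(z - x))
    return s_q + s_r
```

For x = (0, 0) and z = (1, 2), the process term vanishes and the objective is sqrt(2) * (1 + 2); note the measurement residuals enter through an absolute value rather than a square, which is what de-weights outliers.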
\[
{\rm minimize}\;
S( x ) \;
{\rm w.r.t.} \; x
\]
Let
d = d_1, \ldots, d_N
with each
d_i \in \B{R}^n
.
Define
\[
\begin{array}{rcl}
\tilde{S} ( x ; d) & = & \tilde{S}^Q ( x ; d ) + \tilde{S}^R ( x
; d )
\\
\tilde{S}^Q ( x ; d ) & = & \frac{1}{2} \sum_{k=1}^N
( x_k - g_k ( x_{k-1} ) - \partial_k g_k ( x_{k-1} ) d_k ) ^\R{T} * Q_k^{-1} * ( x_k - g_k (
x_{k-1} ) - \partial_k g_k ( x_{k-1} ) d_k )
\\
\tilde{S}^R( x ; d ) & = & \sqrt{2} \sum_{k=1}^N
\|R_k^{-1/2}( z_k - h_k ( x_k ) - \partial_k h_k ( x_k) d_k )\|_1
\end{array}
\]
A state sequence
x
is considered a
solution if
\[
S( x ) - \min_d\; \tilde{S} ( x ; d ) < \varepsilon \;.
\]
This follows directly from the convex composite structure of
S
.
ckbs_L1_nonlinear
argument
g_fun
is a function that supports both of
the following syntaxes
[gk] = feval(g_fun, k, xk1)
[gk, Gk] = feval(g_fun, k, xk1)
g_fun
argument
k
is an integer with
1 \leq k \leq N
.
The case
k = 1
serves to specify the initial state estimate.
g_fun
argument
xk1
is a column vector with
length
n
. It specifies the state vector at time index
k-1
; i.e.,
\[
xk1 = x_{k-1}
\]
In the case
k = 1
, the value of
xk1
does not matter.
g_fun
result
gk
is a column vector with
length
n
and
\[
gk = g_k ( x_{k-1} )
\]
In the case
k = 1
, the value of
gk
is the initial state
estimate at time index
k
.
g_fun
result
Gk
is present in the syntax,
it is the
n \times n
matrix given by
\[
Gk = \partial_{k-1} g_k ( x_{k-1} )
\]
In the case
k = 1
, the value of
Gk
is the zero matrix;
i.e.,
g_k
does not depend on the value of
x_{k-1}
.
ckbs_L1_nonlinear
argument
h_fun
is a function that supports both of the
following syntaxes
[hk] = feval(h_fun, k, xk)
[hk, Hk] = feval(h_fun, k, xk)
h_fun
argument
k
is an integer with
1 \leq k \leq N
.
h_fun
argument
xk
is a column vector with
length
n
. It specifies the state vector at time index
k
\[
xk = x_k
\]
.
h_fun
result
hk
is a column vector with
length
m
and
\[
hk = h_k ( x_k )
\]
h_fun
result
Hk
is present in the syntax,
it is the
m \times n
matrix given by
\[
Hk = \partial_k h_k ( x_k )
\]
max_itr
specifies the maximum number of
iterations of the algorithm to execute. It must be greater than or
equal to zero. Note that if it is zero, the first row of the
5.u: info
return value will still be computed.
This can be useful for deciding what is a good value for the argument
5.j: epsilon
.
ckbs_L1_nonlinear
argument
epsilon
is a positive scalar.
It specifies the convergence
criteria value; i.e.,
\[
\varepsilon = epsilon
\]
ckbs_L1_nonlinear
argument
x_in
is a two dimensional array with size
n \times N
.
It specifies a sequence of state values; i.e.,
for
k = 1 , \ldots , N
\[
x\_in (:, k) = x_k
\]
The closer the initial state sequence is to the solution,
the faster, and the more likely it is, that ckbs_L1_nonlinear
will converge.
ckbs_L1_nonlinear
argument
z
is a two dimensional array
with size
m \times N
.
It specifies the sequence of measurement vectors; i.e.,
for
k = 1 , \ldots , N
\[
z(:, k) = z_k
\]
ckbs_L1_nonlinear
argument
qinv
is a three dimensional array
with size
n \times n \times N
.
It specifies the inverse of the variance of the transition noise; i.e.,
for
k = 1 , \ldots , N
\[
qinv(:,:, k) = Q_k^{-1}
\]
In the case
k = 1
, the value of
Q_k
is the variance
of the initial state estimate (see 5.g: g_fun
).
ckbs_L1_nonlinear
argument
rinv
is a three dimensional array,
with size
m \times m \times N
.
It specifies the inverse of the variance of the measurement noise; i.e.,
for
k = 1 , \ldots , N
\[
rinv(:,:, k) = R_k^{-1}
\]
It is ok to signify a missing data value by setting the corresponding
row and column of
rinv
to zero. This also makes it possible
to interpolate the state vector
x_k
at points where
there are no measurements.
level
specifies the level of tracing to do.
If
level == 0
, no tracing is done.
If
level >= 1
, the row index
q
and the corresponding row of
info
are printed as soon as they are computed.
If
level >= 2
, a check of the derivative calculations
is printed before the calculations start. In this case, control will
return to the keyboard so that you can print values, continue,
or abort the calculation.
x_out
is a two dimensional array with size
n \times N
.
It contains an approximation for the optimal sequence
( x_1 , \ldots , x_N )
; i.e.,
for
k = 1 , \ldots , N
\[
x\_out(:, k) = x_k
\]
rOut
contains the structure
rOut
reported by 8: ckbs_L1_affine
for the last affine
subproblem.
sOut
contains the structure
sOut
reported by 8: ckbs_L1_affine
for the last affine
subproblem.
pPlusOut
contains the structure
pPlusOut
reported by 8: ckbs_L1_affine
for the last affine
subproblem.
pMinusOut
contains the structure
pMinusOut
reported by 8: ckbs_L1_affine
for the last affine
subproblem.
info
is a matrix with each row corresponding
to a solution of 8: ckbs_L1_affine
on an affine subproblem.
There are five numbers output per iteration:
the infinity norm of the Kuhn-Tucker conditions,
the one-norm of the Kuhn-Tucker conditions, the affine improvement,
\mu
, and the number of line search iterations.
See 8: ckbs_L1_affine
for the definitions of these quantities.
Note that ckbs_L1_affine
has satisfied the convergence
condition if the affine improvement is close to zero, that is,
info(end, 3) <= epsilon
5.v: Example
The example below uses ckbs_L1_nonlinear
to estimate the position of a
Van Der Pol oscillator (an oscillator with nonlinear dynamics).
The code runs both 4: ckbs_nonlinear
and 5: ckbs_L1_nonlinear
on the example, in cases where there are
outliers in the data, and can be used to make a plot.
It returns true if ckbs_L1_nonlinear
converged on the problem,
and false otherwise.
[ok] = L1_nonlinear_ok(draw_plot)
If
draw_plot
is not present or false, no plot is drawn.
If
ok
is true, the test passed,
otherwise the test failed.
We use
x1 (t)
and
x2 (t)
to denote the oscillator position and velocity as functions of time.
The deterministic ordinary differential equation for the
Van der Pol oscillator is
\[
\begin{array}{rcl}
x1 '(t) & = & x2 (t)
\\
x2 '(t) & = & \mu [ 1 - x1(t)^2 ] x2 (t) - x1(t)
\end{array}
\]
The state transition model uses Euler's approximation of this ODE with
time step
\Delta t = .1
, together with Gaussian noise.
Given
x_{k-1}
, the value of the state at time
t_{k-1}
,
we obtain
x_k
, the state value at time
t_k = t_{k-1} + \Delta t
,
by
\[
\begin{array}{rcl}
x_{1,k} & = & x_{1,k-1} + x_{2,k-1} \Delta t + w_{1,k}
\\
x_{2,k} & = & x_{2,k-1} +
[ \mu ( 1 - x_{1,k-1}^2 ) x_{2,k-1} - x_{1,k-1} ] \Delta t
+ w_{2,k}
\end{array}
\]
where
w_k \sim \B{N}( 0 , Q_k )
and the variance
Q_k = \R{diag} ( 0.01, 0.01 )
.
The routine 4.3.2: vanderpol_g.m
calculates the mean for
x_k
given
x_{k-1}
.
The initial state estimate is
[0, -.5]
, and its variance
estimate is
Q_1 = \R{diag} ( .1 , .1 )
.
\[
\begin{array}{rcl}
z_{1,k} & = & x_{1,k} + v_{1,k}
\end{array}
\]
where
v_k
is a mixture of nominal Gaussian noise
and large outliers, so
v_k \sim 0.8 \B{N}( 0 , R_k ) + 0.2 \B{N}( 0 , 100 )
and
the measurement variance is a scalar
R_k = 1
.
The routine 4.4.2: direct_h.m
calculates the mean for
z_k
,
given the value of
x_k
; i.e.
x_{1,k}
.
function [ok]= L1_nonlinear_ok(draw_plot)
ok = 1;
conVar = 100;
conFreq = .2;
numComp = 1;
noise = 'normal';
if nargin < 1
draw_plot = false;
end
% ---------------------------------------------------------------
% Simulation problem parameters
max_itr = 100; % Maximum number of optimizer iterations
epsilon = 1e-6; % Convergence criteria
N = 41; % Number of measurement points (N > 1)
T = 4.; % Total time
ell = 1; % Number of constraints (never active)
n = 2; % Number of components in the state vector (n = 2)
m = 1; % Number of measurements at each time point (m <= n)
xi = [ 0. , -.5]'; % True initial value for the state at time zero
x_in = zeros(n, N); % Initial sequence where optimization begins
x_est = [ 0 , -.5]'; % Estimate of initial state (at time index one)
mu = 2.0; % ODE nonlinearity parameter
sigma_r = 1.; % Standard deviation of measurement noise
sigma_q = .1; % Standard deviation of the transition noise
sigma_qa = .1; % 'Actual' noise
% Step size for fourth order Runge-Kutta method in vanderpol_sim
% is the same as time between measurement points
step = T / (N-1);
sigma_e = sqrt(.1); % Standard deviation of initial state estimate
rand('seed', 4321); % Random number generator seed
% Level of tracing during optimization
if draw_plot
level = 1;
else
level = 0;
end
if (strcmp(noise,'student'))
sigma_r = sqrt(2);
end
% Plotting parameters
h_min = 0; % minimum horizontal value in plots
h_max = T; % maximum horizontal value in plots
v_min = -5.0; % minimum vertical value in plots
v_max = 5.0; % maximum vertical value in plots
%________________
f_fun = @ no_f;
% -------------------------------------------------------------------
% information used by vanderpol_g
params.vanderpol_g_initial = xi;
params.vanderpol_g_dt = step;
params.vanderpol_g_mu = mu;
g_fun = @(k, xk) vanderpol_g(k, xk, params);
% -------------------------------------------------------------------
% information used by direct_h
params.direct_h_index = 1;
h_fun = @(k,x) direct_h(k,x,params);
% ---------------------------------------------------------------
% ---------------------------------------------------------------
% Rest of the information required by ckbs_nonlinear
%
% Simulate the true values for the state vector
x_vdp = vanderpol_sim(mu, xi, N, step);
time = linspace(0., T, N);
x_true = zeros(n, N);
x_true(:,1) = xi;% + .1*sigma_e*randn(n, 1);
for k = 2:N
x_true(:,k) = g_fun(k, x_true(:,k-1)) + sigma_qa*randn(n,1);
end
% Inverse covariance of the measurement and transition noise
rinv = zeros(m, m, N);
qinv = zeros(n, n, N);
for k = 1 : N
rinv(:, :, k) = eye(m, m) / sigma_r^2;
qinv(:, :, k) = eye(n, n) / sigma_q^2;
end
qinv(:, :, 1) = eye(n, n) / sigma_e^2;
% call back functions
v_true = zeros(m,N);
temp = zeros(m,N);
bin = zeros(n,N);
iter = 0;
while(iter <= numComp)
iter = iter + 1;
if(mod(iter, 100) == 0)
fprintf('iter conVar conFreq\n');
fprintf('%d %1.4f %1.4f\n', iter, conVar, conFreq);
end
randn('state', sum(100*clock));
rand('twister', sum(100*clock));
switch(noise)
case{'normal'}
v_true = sigma_r * randn(m, N);
case{'laplace'}
temp = sigma_r*exprnd(1/sigma_r, m, N);
bin = 2*(binornd(1, 0.5, m, N)-0.5);
v_true = temp.*bin/sqrt(2);
case{'student'}
v_true = trnd(4,m,N);
end
% Create contaminating distribution
nums = rand(1, N);
biNums = nums < conFreq;
% nI= ones(m,1)*binornd(1, conFreq, 1, N); % contaminate entire measurement, not just componentwise
nI = ones(m,1)*biNums;
newNoise = sqrt(conVar)*randn(m,N); % Contaminating noise
% Simulate the measurement values
measurement = x_true(1,:); % measurement is the first component of the state
z = measurement + v_true + nI.*newNoise;
% ---------------------------------------------------------------
% call the optimizer
%addpath('../src');
if(draw_plot)
ckbs_level = 0;
[ x_out_Gauss , u_out, info ] = ckbs_nonlinear( ...
f_fun , ...
g_fun , ...
h_fun , ...
max_itr , ...
epsilon , ...
x_in , ...
z , ...
qinv , ...
rinv , ...
ckbs_level ...
);
end
[ x_out_L1 , rOut, sOut, pPlusOut, pMinusOut, info ] = ckbs_L1_nonlinear( ...
f_fun , ...
g_fun , ...
h_fun , ...
max_itr , ...
epsilon , ...
x_in , ...
z , ...
qinv , ...
rinv , ...
level ...
);
ok = ok && info(end, 3) < epsilon;
end
% % ----------------------------------------------------------------------
if draw_plot
figure(1);
clf
subplot(2, 1,1)
hold on
plot(time, x_vdp(1,:), 'r-.', 'Linewidth', 2, 'MarkerEdgeColor', 'k');
plot(time, x_true(1,:), 'k-', 'Linewidth', 3.5, 'MarkerEdgeColor', 'k');
plot(time, x_out_Gauss(1,:), 'b--', 'Linewidth', 3.5, 'MarkerEdgeColor', 'k');
plot(time, x_out_L1(1,:), 'm-.', 'Linewidth', 3.5, 'MarkerEdgeColor', 'k');
meas = z(1,:);
meas(meas > v_max) = v_max;
meas(meas < v_min) = v_min;
plot(time, meas, 'bo', 'MarkerFaceColor', [.49 1 .63], 'MarkerSize', 6.2);
axis([h_min, h_max, v_min, v_max]);
hleg = legend('vdp solution', 'stochastic realization', 'Gaussian Smoother', 'L1 Smoother', 'Data', 'Location', 'NorthWest');
xlabel('Time (s)', 'FontSize', 16);
ylabel('X-component', 'FontSize', 16);
set(hleg, 'Box', 'off');
set(gca,'FontSize',16);
set(hleg, 'FontSize', 14);
title('20% of measurement errors are N(0, 100)');
hold off
subplot(2,1, 2)
hold on
plot(time, x_vdp(2,:), 'r-.', 'Linewidth', 2, 'MarkerEdgeColor', 'k');
plot(time, x_true(2,:), 'k-', 'Linewidth', 3.5, 'MarkerEdgeColor', 'k');
plot(time, x_out_Gauss(2,:), 'b--', 'Linewidth', 3.5, 'MarkerEdgeColor', 'k');
plot(time, x_out_L1(2,:), 'm-.', 'Linewidth', 3.5, 'MarkerEdgeColor', 'k');
axis([h_min, h_max, v_min, v_max]);
hleg = legend('vdp solution', 'stochastic realization', 'Gaussian Smoother', 'L1 Smoother', 'Location', 'NorthWest');
xlabel('Time(s)', 'FontSize', 16);
ylabel('Y-component', 'FontSize', 16);
set(hleg, 'Box', 'off');
set(gca,'FontSize',16);
set(hleg, 'FontSize', 14);
hold off
return
end
end
[xOut, uOut, info] = ckbs_affine(max_itr, epsilon,
z, b, g, h, db, dg, dh, qinv, rinv)
\[
\begin{array}{rcl}
S ( x_1 , \ldots , x_N ) & = & \sum_{k=1}^N S_k ( x_k , x_{k-1} ) \\
S_k ( x_k , x_{k-1} ) & = &
\frac{1}{2}
( z_k - h_k - H_k * x_k )^\R{T} * R_k^{-1} * ( z_k - h_k - H_k * x_k )
\\
& + &
\frac{1}{2}
( x_k - g_k - G_k * x_{k-1} )^\R{T} * Q_k^{-1} * ( x_k - g_k - G_k * x_{k-1} )
\\
\end{array}
\]
where the matrices
R_k
and
Q_k
are
symmetric positive definite and
x_0
is the constant zero.
Note that
g_1
is the initial state estimate
and
Q_1
is the corresponding covariance.
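As a concrete reference, the objective above can be evaluated directly in the scalar case n = m = 1. The following pure-Python sketch is illustrative only (ckbs itself is Matlab/Octave, and the name affine_sumsq is invented here):

```python
def affine_sumsq(x, z, g, h, G, H, qinv, rinv):
    # Sum of squares S(x_1, ..., x_N) for scalar state and measurement.
    # Lists are indexed by k = 0 .. N-1 (k = 0 is time point 1) and
    # x_0, the state before the first time point, is the constant zero.
    S = 0.0
    x_prev = 0.0
    for k in range(len(x)):
        res_z = z[k] - h[k] - H[k] * x[k]     # measurement residual
        res_x = x[k] - g[k] - G[k] * x_prev   # process residual
        S += 0.5 * rinv[k] * res_z ** 2       # (1/2) res' * Rinv * res
        S += 0.5 * qinv[k] * res_x ** 2       # (1/2) res' * Qinv * res
        x_prev = x[k]
    return S

# two time points; g(1) acts as the initial state estimate
S = affine_sumsq(
    x=[1.0, 2.0], z=[1.5, 1.5], g=[1.0, 0.0], h=[0.0, 0.0],
    G=[0.0, 1.0], H=[1.0, 1.0], qinv=[1.0, 1.0], rinv=[1.0, 1.0],
)
```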
\[
\begin{array}{rll}
{\rm minimize}
& S( x_1 , \ldots , x_N )
& {\rm w.r.t.} \; x_1 \in \B{R}^n , \ldots , x_N \in \B{R}^n
\\
{\rm subject \; to}
& b_k + B_k * x_k \leq 0
& {\rm for} \; k = 1 , \ldots , N
\end{array}
\]
( x_1 , \ldots , x_N )
is considered a solution
if there is a Lagrange multiplier sequence
( u_1 , \ldots , u_N )
such that the following conditions are satisfied.
\[
\begin{array}{rcl}
b_k + B_k * x_k & \leq & \varepsilon \\
0 & \leq & u_k \\
| ( B_k^T * u_k + d_k )_j | & \leq & \varepsilon \\
| (u_k)_i * ( b_k + B_k * x_k)_i | & \leq & \varepsilon
\end{array}
\]
for
j = 1 , \ldots , n
,
i = 1 , \ldots , \ell
, and
k = 1 , \ldots , N
.
Here
d_k
is the partial derivative of
S ( x_1 , \ldots , x_N )
with respect to
x_k
and
(u_k)_i
denotes the i-th component of
u_k
.
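The four conditions can be checked mechanically at each time point. Here is an illustrative scalar Python sketch of such a check (kkt_ok is an invented name; ckbs performs the analogous check blockwise in its tests):

```python
def kkt_ok(b, B, x, u, d, eps):
    # epsilon-relaxed first order conditions at one time point, for a
    # scalar state x, multiplier u, objective gradient d, and
    # constraint b + B * x <= 0
    feasible   = b + B * x <= eps             # primal feasibility
    dual_feas  = u >= 0.0                     # dual feasibility
    stationary = abs(B * u + d) <= eps        # stationarity of Lagrangian
    comp_slack = abs(u * (b + B * x)) <= eps  # complementary slackness
    return feasible and dual_feas and stationary and comp_slack

# active constraint at x = 1 with a matching multiplier
ok = kkt_ok(b=-1.0, B=1.0, x=1.0, u=2.0, d=-2.0, eps=1e-8)
```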
max_itr
specifies the maximum number of
iterations of the algorithm to execute. It must be greater than or
equal to zero. Note that if it is zero, the first row of the
6.s: info
return value will still be computed.
This can be useful for deciding what is a good value for the argument
6.g: epsilon
.
epsilon
specifies the convergence
criteria value; i.e.,
\[
\varepsilon = epsilon
\]
z
is a two dimensional array,
for
k = 1 , \ldots , N
\[
z_k = z(:, k)
\]
and
z
has size
m \times N
.
b
is a two dimensional array,
for
k = 1 , \ldots , N
\[
b_k = b(:, k)
\]
and
b
has size
\ell \times N
.
If
\ell = 0
, the problem is not constrained; i.e.,
it is the affine Kalman-Bucy smoother problem.
g
is a two dimensional array,
for
k = 1 , \ldots , N
\[
g_k = g(:, k)
\]
and
g
has size
n \times N
.
The value
g_1
serves as the initial state estimate.
h
is a two dimensional array,
for
k = 1 , \ldots , N
\[
h_k = h(:, k)
\]
and
h
has size
m \times N
.
db
is a three dimensional array,
for
k = 1 , \ldots , N
\[
B_k = db(:,:,k)
\]
and
db
has size
\ell \times n \times N
.
dg
is a three dimensional array,
for
k = 1 , \ldots , N
\[
G_k = dg(:,:,k)
\]
and
dg
has size
n \times n \times N
.
The initial state estimate
g_1
does not depend on the value of
x_0
, hence
G_1
should be zero.
dh
is a three dimensional array,
for
k = 1 , \ldots , N
\[
H_k = dh(:,:,k)
\]
and
dh
has size
m \times n \times N
.
qinv
is a three dimensional array,
for
k = 1 , \ldots , N
\[
Q_k^{-1} = qinv(:,:, k)
\]
and
qinv
has size
n \times n \times N
.
The value of
Q_k
is the variance of the initial state
estimate
g_1
.
rinv
is a three dimensional array,
for
k = 1 , \ldots , N
\[
R_k^{-1} = rinv(:,:, k)
\]
and
rinv
has size
m \times m \times N
.
It is ok to signify a missing data value by setting the corresponding
row and column of
rinv
to zero. This also makes it possible
to interpolate the state vector
x_k
to points where
there are no measurements.
xOut
contains the optimal sequence
( x_1 , \ldots , x_N )
.
For
k = 1 , \ldots , N
\[
x_k = xOut(:, k)
\]
and
xOut
is a two dimensional array with size
n \times N
.
uOut
contains the Lagrange multiplier sequence
corresponding to
xOut
.
For
k = 1 , \ldots , N
\[
u_k = uOut(:, k)
\]
and
uOut
is a two dimensional array with size
\ell \times N
.
The pair
xOut
,
uOut
satisfy the
6.e: first order conditions
.
info
is a matrix with each row corresponding
to an iteration of the algorithm.
Note that ckbs_affine
has satisfied the convergence condition if
and only if
all( info(end, 1:3) <= epsilon )
We use
x_k^q
(
u_k^q
)
to denote the state vector (dual vector)
for time point
k
and at the end of iteration
q-1
.
We use
d_k^q
for the partial derivative of
S ( x_1^q , \ldots , x_N^q )
with respect to
x_k
.
info(q, 1)
is
a bound on the constraint functions
at the end of iteration
q-1
. To be specific
\[
info(q, 1) = \max_{i,k} ( b_k + B_k * x_k^q )_i
\]
info(q, 2)
is
a bound on the partial derivative of the Lagrangian with respect to
the state vector sequence
at the end of iteration
q-1
:
\[
info(q, 2) = \max \left[ | ( B_k^\R{T} * u_k^q + d_k^q )_j | \right]
\]
where the maximum is with respect to
j = 1 , \ldots , n
and
k = 1 , \ldots , N
.
info(q, 3)
is
a bound on the complementarity conditions
at the end of iteration
q-1
:
\[
info(q, 3) = \max \left[ | (u_k^q)_i * ( b_k + B_k * x_k^q )_i | \right]
\]
where the maximum is with respect to
i = 1 , \ldots , \ell
and
k = 1 , \ldots , N
.
info(q, 4)
is the line search step size used during
iteration
q-1
.
Small step sizes indicate problems with the interior point method
used to solve the affine problem
(with the exception that
info(1, 4)
is always zero).
ckbs_affine
.
It returns true if ckbs_affine
passes the test
and false otherwise.
[ok] = affine_ok_box(draw_plot)
nonlinear_ok_box.out
file is written for use with
the program nonlinear_ok_box.r
.
If
ok
is true, the test passed;
otherwise the test failed.
x1 (t)
| derivative of function we are estimating |
x2 (t)
| value of function we are estimating |
z1 (t)
value of
x2 (t)
plus noise
\[
\begin{array}{c}
-1 \leq x1 (t) \leq +1
\\
-1 \leq x2 (t) \leq +1
\end{array}
\]
.
function [ok] = affine_ok_box(draw_plot)
% --------------------------------------------------------
% You can change these parameters
N = 50; % number of measurement time points
dt = 2*pi / N; % time between measurement points
gamma = 1; % transition covariance multiplier
sigma = .5; % standard deviation of measurement noise
max_itr = 30; % maximum number of iterations
epsilon = 1e-5; % convergence criteria
h_min = 0; % minimum horizontal value in plots
h_max = 7; % maximum horizontal value in plots
v_min = -2.0; % minimum vertical value in plots
v_max = +2.0; % maximum vertical value in plots
% ---------------------------------------------------------
ok = true;
if nargin < 1
draw_plot = false;
end
% Define the problem
rand('seed', 1234);
%
% number of constraints per time point
ell = 4;
%
% number of measurements per time point
m = 1;
%
% number of state vector components per time point
n = 2;
%
% simulate the true trajectory and measurement noise
t = (1 : N) * dt;
x1_true = - cos(t);
x2_true = - sin(t);
x_true = [ x1_true ; x2_true ];
%
% measurement values and model
v_true = sigma * randn(1, N);
z = x2_true + v_true;
rk = sigma * sigma;
rinvk = 1 / rk;
rinv = zeros(m, m, N);
h = zeros(m, N);
dh = zeros(m, n, N);
for k = 1 : N
rinv(:, :, k) = rinvk;
h(:, k) = 0;
dh(:,:, k) = [ 0 , 1 ];
end
%
% transition model
g = zeros(n, N);
dg = zeros(n, n, N);
qinv = zeros(n, n, N);
qk = gamma * [ dt , dt^2/2 ; dt^2/2 , dt^3/3 ];
qinvk = inv(qk);
for k = 2 : N
g(:, k) = 0;
dg(:,:, k) = [ 1 , 0 ; dt , 1 ];
qinv(:,:, k) = qinvk;
end
%
% initial state estimate
g(:, 1) = x_true(:, 1);
qinv(:,:, 1) = 100 * eye(2);
%
% constraints
b = zeros(ell, N);
db = zeros(ell, n, N);
for k = 1 : N
%
% b(:, k) + db(:,:, k) * x(:, k) <= 0
b(:, k) = [ -1 ; -1 ; -1 ; -1 ];
db(:,:, k) = [ -1 , 0 ; 1 , 0 ; 0 , -1 ; 0 , 1 ];
end
%
% -------------------------------------------------------------------------
[xOut, uOut, info] = ...
ckbs_affine(max_itr, epsilon, z, b, g, h, db, dg, dh, qinv, rinv);
% --------------------------------------------------------------------------
ok = ok & all( info(end, 1:3) <= epsilon);
d = ckbs_sumsq_grad(xOut, z, g, h, dg, dh, qinv, rinv);
for k = 1 : N
xk = xOut(:, k);
uk = uOut(:, k);
bk = b(:, k);
Bk = db(:,:, k);
dk = d(:, k);
sk = - bk - Bk * xk;
%
ok = ok & (min(uk) >= 0.);
ok = ok & (max (bk + Bk * xk) <= epsilon);
ok = ok & (max ( abs( Bk' * uk + dk ) ) <= epsilon);
ok = ok & (max ( uk .* sk ) <= epsilon );
end
if draw_plot
figure(1);
clf
hold on
plot(t', x_true(2,:)', 'r-' );
plot(t', z(1,:)', 'ko' );
plot(t', xOut(2,:)', 'b-' );
plot(t', - ones(N,1), 'b-');
plot(t', ones(N,1), 'b-');
axis([h_min, h_max, v_min, v_max]);
title('Constrained');
hold off
%
% constrained estimate
x_con = xOut;
end
%
% Unconstrained Case
b = zeros(0, N);
db = zeros(0, n, N);
% -------------------------------------------------------------------------
[xOut, uOut, info] = ...
ckbs_affine(max_itr, epsilon, z, b, g, h, db, dg, dh, qinv, rinv);
% --------------------------------------------------------------------------
ok = ok & all( info(end, 1:3) <= epsilon);
d = ckbs_sumsq_grad(xOut, z, g, h, dg, dh, qinv, rinv);
for k = 1 : N
xk = xOut(:, k);
dk = d(:, k);
%
ok = ok & (max(abs(dk)) <= epsilon);
end
if draw_plot
figure(2);
clf
hold on
plot(t', x_true(2,:)', 'r-' );
plot(t', z(1,:)', 'ko' );
plot(t', xOut(2,:)', 'b-' );
plot(t', - ones(N,1), 'b-');
plot(t', ones(N,1), 'b-');
axis([h_min, h_max, v_min, v_max]);
title('Unconstrained');
hold off
%
% unconstrained estimate
x_free = xOut;
%
% write out constrained and unconstrained results
[fid, msg] = fopen('affine_ok_box.out', 'wt');
if size(msg, 2) > 0
disp(['affine_ok: ', msg]);
end
% 123456789012345678901234'
heading = ' t';
heading = [ heading , ' x2_true x2_con x2_free' ];
heading = [ heading , ' z1\n' ];
fprintf(fid, heading);
for k = 1 : N
fprintf(fid,'%8.3f%8.3f%8.3f%8.3f%8.3f\n', ...
t(k), ...
x_true(2,k), x_con(2,k), x_free(2,k), ...
z(1,k) ...
);
end
fclose(fid);
end
return
end
[xOut, info] = ckbs_affine_singular(...
z, g, h, ...
dg, dh, q, r)
Q
and
R
may be singular.
\[
\begin{array}{rcl}
G^{-1} (w - Q G^{-T}H^T\Phi^{-1}(HG^{-1}w-v))
\end{array}
\]
where
\[
\Phi = HG^{-1}QG^{-T}H^T + R ,
\]
and the matrices
R_k
and
Q_k
are
symmetric positive semidefinite.
Note that
g_1
is the initial state estimate
and
Q_1
is the corresponding covariance.
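In the scalar case with G = 1 the closed-form estimate above reduces to a one-line computation. The sketch below is illustrative (singular_smoother_scalar is an invented name) and shows that with R = 0 the estimate reproduces the measurement exactly:

```python
def singular_smoother_scalar(w, v, H, Q, R):
    # Scalar (G = 1) instance of the estimate
    #   G^{-1} ( w - Q G^{-T} H^T Phi^{-1} ( H G^{-1} w - v ) )
    # with Phi = H G^{-1} Q G^{-T} H^T + R; Phi must be nonzero even
    # though Q and R individually may be singular.
    phi = H * Q * H + R
    return w - Q * H * (H * w - v) / phi

# with a singular measurement covariance (R = 0) the estimate
# interpolates the measurement: H * x equals v exactly
x = singular_smoother_scalar(w=2.0, v=3.0, H=1.0, Q=4.0, R=0.0)
```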
z
is a two dimensional array,
for
k = 1 , \ldots , N
\[
z_k = z(:, k)
\]
and
z
has size
m \times N
.
g
is a two dimensional array,
for
k = 1 , \ldots , N
\[
g_k = g(:, k)
\]
and
g
has size
n \times N
.
The value
g_1
serves as the initial state estimate.
h
is a two dimensional array,
for
k = 1 , \ldots , N
\[
h_k = h(:, k)
\]
and
h
has size
m \times N
.
dg
is a three dimensional array,
for
k = 1 , \ldots , N
\[
G_k = dg(:,:,k)
\]
and
dg
has size
n \times n \times N
.
The initial state estimate
g_1
does not depend on the value of
x_0
, hence
G_1
should be zero.
dh
is a three dimensional array,
for
k = 1 , \ldots , N
\[
H_k = dh(:,:,k)
\]
and
dh
has size
m \times n \times N
.
q
is a three dimensional array,
for
k = 1 , \ldots , N
\[
Q_k = q(:,:, k)
\]
and
q
has size
n \times n \times N
.
The value of
Q_k
is the variance of the initial state
estimate
g_1
.
r
is a three dimensional array,
for
k = 1 , \ldots , N
\[
R_k = r(:,:, k)
\]
and
r
has size
m \times m \times N
.
It is ok to signify a missing data value by setting the corresponding
row and column of
r
to infinity. This also makes it possible
to interpolate the state vector
x_k
to points where
there are no measurements.
xOut
contains the optimal sequence
( x_1 , \ldots , x_N )
.
For
k = 1 , \ldots , N
\[
x_k = xOut(:, k)
\]
and
xOut
is a two dimensional array with size
n \times N
.
7.m: Example
ckbs_affine_singular
.
It returns true if ckbs_affine_singular
passes the test
and false otherwise.
[ok] = affine_singular_ok(draw_plot)
If
ok
is true, the test passed;
otherwise the test failed.
x1 (t)
| derivative of function we are estimating |
x2 (t)
| value of function we are estimating |
z1 (t)
value of
x2 (t)
plus noise
function [ok] = affine_singular_ok(draw_plot)
% --------------------------------------------------------
ok = true;
% You can change these parameters
conVar = 100; % variance of contaminating distribution
conFreq = .3; % frequency of outliers
noise = 'normal'; % type of noise
N = 1000; % number of measurement time points
dt = 10*pi / N; % time between measurement points
%dt = 1;
gamma = 1; % transition covariance multiplier
sigma = 1; % standard deviation of measurement noise
sigma_alg = 1;
if (strcmp(noise,'student'))
sigma = sqrt(2);
end
max_itr = 30; % maximum number of iterations
epsilon = 1e-4; % convergence criteria
h_min = 0; % minimum horizontal value in plots
h_max = 14; % maximum horizontal value in plots
v_min = -5.0; % minimum vertical value in plots
v_max = 5.0; % maximum vertical value in plots
% ---------------------------------------------------------
if nargin < 1
draw_plot = false;
end
% Define the problem
%rand('seed', 12);
%
% number of constraints per time point
% ell = 0;
%
% number of measurements per time point
m = 1;
%
% number of state vector components per time point
n = 2;
%
% simulate the true trajectory and measurement noise
t = (1 : N) * dt;
x1_true = - cos(t);
x2_true = - sin(t);
x_true = [ x1_true ; x2_true ];
% measurement values and model
%exp_true = random('exponential', 1/sigma, 1, N);
%bin = 2*random('binomial', 1, 0.5, 1, N) - 1;
%exp_true = exp_true .* bin;
%z = x2_true + exp_true;
% Normal Noise
v_true = zeros(m,N);
temp = zeros(m,N);
bin = zeros(m,N);
randn('state', sum(100*clock));
rand('twister', sum(100*clock));
if (strcmp(noise, 'normal'))
v_true = sigma * randn(m, N);
end
%Laplace Noise
if (strcmp(noise,'laplace'))
temp = sigma*exprnd(1/sigma, 1,100);
bin = 2*(binornd(1, 0.5, 1, 100)-0.5);
v_true = temp.*bin/sqrt(2);
end
if (strcmp(noise,'student'))
v_true = trnd(4,1,100);
end
z = [x2_true] + v_true;
rk = sigma_alg * sigma_alg * eye(m);
rinvk = inv(rk);
rinv = zeros(m, m, N);
r = zeros(m, m, N);
h = zeros(m, N);
% Covariance between process and measurement?
dh = zeros(m, n, N);
for k = 1 : N
rinv(:, :, k) = rinvk;
r(:,:,k) = rk;
h(:, k) = 0;
dh(:,:, k) = [ 0 , 1 ];
end
%
% transition model
g = zeros(n, N);
dg = zeros(n, n, N);
qinv = zeros(n, n, N);
q = zeros(n, n, N);
qk = gamma * [ dt , 0 ; 0 , 0 ]; % singular transition covariance
for k = 2 : N
g(:, k) = 0;
dg(:,:, k) = [ 1 , 0 ; dt , 1 ];
q(:,:,k) = qk;
end
%
% initial state estimate
g(:, 1) = x_true(:, 1);
qinv(:,:, 1) = 100 * eye(2);
q(:,:,1) = inv(qinv(:,:,1));
%
% constraints
b = zeros(0, N);
db = zeros(0, n, N);
% -------------------------------------------------------------------------
% -------------------------------------------------------------------------
% L1 Kalman-Bucy Smoother
% Linear unconstrained Kalman-Bucy Smoother
[xOut2, info] = ckbs_affine_singular(z, g, h, dg, dh, q, r);
%qk = gamma * [ dt , dt^2/2 ; dt^2/2 , dt^3/3 ];
%for k = 2:N
% q(:,:,k) = qk;
%end
%[xOut3, info] = ckbs_affine_singular(z, g, h, dg, dh, q, r);
iter = info(5);
info(5) = 0;
iter
ok = (ok) && (max(info) < epsilon);
% % --------------------------------------------------------------------------
%
if draw_plot
qk = gamma * [ dt , dt^2/2 ; dt^2/2 , dt^3/3 ];
qinvk = inv(qk);
for k = 2 : N
qinv(:,:, k) = qinvk;
end
[xOut3, uOut, info] = ckbs_affine(max_itr, epsilon, z, b, g, h, db, dg, dh, qinv, rinv);
figure(1);
clf
hold on
plot(t', x_true(2,:)', '-k', 'Linewidth', 2, 'MarkerEdgeColor', 'k' );
meas = z(1,:);
meas(meas > v_max) = v_max;
meas(meas < v_min) = v_min;
plot(t', meas, 'bd', 'MarkerFaceColor', [.49 1 .63], 'MarkerSize', 6);
plot(t', xOut2(2,:)', '-r', 'Linewidth', 2, 'MarkerEdgeColor', 'k');
% plot(t', xOut3(2,:)', '-b', 'Linewidth', 2, 'MarkerEdgeColor', 'k');
axis([h_min, h_max, v_min, v_max]);
title('Position Plot. Black = truth, Blue = meas, Red = Singular Smoother.');
hold off
figure(2);
clf
hold on
plot(t', x_true(1,:)', '-k', 'Linewidth', 2, 'MarkerEdgeColor', 'k' );
plot(t', xOut2(1,:)', '-r', 'Linewidth', 2, 'MarkerEdgeColor', 'k');
plot(t', xOut3(1,:)', '-b', 'Linewidth', 2, 'MarkerEdgeColor', 'k');
axis([h_min, h_max, v_min, v_max]);
title('Derivative plot. Black = truth, Blue = SDE smoother, Red = Singular Smoother.');
%
% constrained estimate
end
% [norm(xOut2(2,:) - x_true(2,:)) norm(xOut3(2,:) - x_true(2,:))]
end
[xOut, rOut, sOut, pPlusOut, pMinusOut, info] = ckbs_L1_affine(...
max_itr, epsilon, z, g, h, ...
dg, dh, qinv, rinv)
\[
\begin{array}{rcl}
S ( x_1 , \ldots , x_N ) & = & \sum_{k=1}^N S_k ( x_k , x_{k-1} ) \\
S_k ( x_k , x_{k-1} ) & = &
\frac{1}{2}
\| R_k^{-1/2} ( z_k - h_k - H_k * x_k ) \|_1
\\
& + &
\frac{1}{2}
( x_k - g_k - G_k * x_{k-1} )^\R{T} * Q_k^{-1} * ( x_k - g_k - G_k * x_{k-1} )
\\
\end{array}
\]
where the matrices
R_k
and
Q_k
are
symmetric positive definite and
x_0
is the constant zero.
Note that
g_1
is the initial state estimate
and
Q_1
is the corresponding covariance.
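The reason for replacing the quadratic measurement term with an L1 norm is robustness to outliers: the penalty grows linearly rather than quadratically in the residual. A scalar Python sketch with invented helper names:

```python
def l1_term(res, rinv):
    # (1/2) * | rinv^(1/2) * res |, the L1 measurement term (scalar case)
    return 0.5 * abs(rinv ** 0.5 * res)

def gauss_term(res, rinv):
    # (1/2) * res * rinv * res, the Gaussian measurement term (scalar case)
    return 0.5 * rinv * res * res

# an outlier residual is penalized linearly by the L1 term,
# quadratically by the Gaussian term
l1_val = l1_term(10.0, 1.0)
gauss_val = gauss_term(10.0, 1.0)
```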
\[
\begin{array}{ll}
{\rm minimize}
& \frac{1}{2} y^\R{T} H(x) y + d(x)^\R{T} y
+ \B{\sqrt{2}}^\R{T} (p^+ + p^-)
\;{\rm w.r.t} \; y \in \B{R}^{nN}\; , \; p^+ \in \B{R}_+^{M} \; , \; p^- \in \B{R}_+^{M}
\\
{\rm subject \; to} & b(x) + B(x) y - p^+ + p^- = 0
\end{array}
\]
We denote the Lagrange multiplier associated with the equality
constraint by
q
, the Lagrange multiplier
associated with the non-negativity constraint on
p^+
by
r
, and the Lagrange multiplier associated with the
non-negativity of
p^-
by
s
. We also use
\B{1}
to denote the vector of length
mN
with all its components
equal to one, and
\B{\sqrt{2}}
to denote the vector of
length
mN
with all its components equal to
\sqrt{2}
.
The Lagrangian corresponding to the L1 Kalman-Bucy problem is
given by
\[
L(p^+, p^-, y, q, r, s) =
\frac{1}{2} y^\R{T} H y + d^\R{T} y + \B{\sqrt{2}}^T(p^+ + p^-)
+ q^\R{T} (b + B y - p^+ + p^-) - r^\R{T} p^+ - s^\R{T} p^- \;.
\]
The partial gradients of the Lagrangian are given by
\[
\nabla L :=
\begin{array}{rcl}
\nabla_{p^+} L(p^+, p^-, y, q, r, s) & = & \B{\sqrt{2}} - q - r \\
\nabla_{p^-} L(p^+, p^-, y, q, r, s) & = & \B{\sqrt{2}} + q - s \\
\nabla_y L(p^+, p^-, y, q, r, s) & = & H y + d + B^\R{T} q \\
\nabla_q L(p^+, p^-, y, q, r, s) & = & b + B y - p^+ + p^- \\
\end{array}
\]
At optimality, all of these gradients will be zero
(or smaller than a tolerance
\varepsilon
), and
complementarity conditions hold, i.e.
\[ r_i p_i^+ = s_i p_i^- = 0 \; \forall \; i \;.
\]
(or, again, each pairwise product is smaller than
\varepsilon
).
max_itr
specifies the maximum number of
iterations of the algorithm to execute. It must be greater than or
equal to zero.
epsilon
specifies the convergence
criteria value; i.e.,
\[
\|\nabla L\|_1 \leq epsilon\;.
\]
z
is a two dimensional array,
for
k = 1 , \ldots , N
\[
z_k = z(:, k)
\]
and
z
has size
m \times N
.
g
is a two dimensional array,
for
k = 1 , \ldots , N
\[
g_k = g(:, k)
\]
and
g
has size
n \times N
.
The value
g_1
serves as the initial state estimate.
h
is a two dimensional array,
for
k = 1 , \ldots , N
\[
h_k = h(:, k)
\]
and
h
has size
m \times N
.
dg
is a three dimensional array,
for
k = 1 , \ldots , N
\[
G_k = dg(:,:,k)
\]
and
dg
has size
n \times n \times N
.
The initial state estimate
g_1
does not depend on the value of
x_0
, hence
G_1
should be zero.
dh
is a three dimensional array,
for
k = 1 , \ldots , N
\[
H_k = dh(:,:,k)
\]
and
dh
has size
m \times n \times N
.
qinv
is a three dimensional array,
for
k = 1 , \ldots , N
\[
Q_k^{-1} = qinv(:,:, k)
\]
and
qinv
has size
n \times n \times N
.
The value of
Q_k
is the variance of the initial state
estimate
g_1
.
rinv
is a three dimensional array,
for
k = 1 , \ldots , N
\[
R_k^{-1} = rinv(:,:, k)
\]
and
rinv
has size
m \times m \times N
.
It is ok to signify a missing data value by setting the corresponding
row and column of
rinv
to zero. This also makes it possible
to interpolate the state vector
x_k
to points where
there are no measurements.
xOut
contains the optimal sequence
( x_1 , \ldots , x_N )
.
For
k = 1 , \ldots , N
\[
x_k = xOut(:, k)
\]
and
xOut
is a two dimensional array with size
n \times N
.
rOut
contains the Lagrange multiplier sequence
corresponding to the non-negativity constraint on
p^+
.
For
k = 1 , \ldots , N
\[
r_k = rOut(:, k)
\]
and
rOut
is a two dimensional array with size
m \times N
.
sOut
contains the Lagrange multiplier sequence
corresponding to the non-negativity constraint on
p^-
.
For
k = 1 , \ldots , N
\[
s_k = sOut(:, k)
\]
and
sOut
is a two dimensional array with size
m \times N
.
pPlusOut
contains the positive parts
of the measurement residual
b(x) + B(x) y
.
For
k = 1 , \ldots , N
\[
p_k = pPlusOut(:, k)
\]
and
pPlusOut
is a two dimensional array with size
m \times N
.
pMinusOut
contains the negative parts
of the measurement residual
b(x) + B(x) y
.
For
k = 1 , \ldots , N
\[
p_k = pMinusOut(:, k)
\]
and
pMinusOut
is a two dimensional array with size
m \times N
.
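The relation between the residual and the pair (p^+, p^-) is the usual positive/negative split: the residual equals p^+ - p^- with both parts non-negative. A scalar Python sketch (split_residual is an invented name):

```python
def split_residual(c):
    # split a scalar residual c = b + B*y into non-negative parts with
    # c = p_plus - p_minus; at most one part is nonzero
    p_plus = max(c, 0.0)
    p_minus = max(-c, 0.0)
    return p_plus, p_minus

p_plus, p_minus = split_residual(-1.5)
```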
info
is a matrix with each row corresponding
to an iteration of the algorithm.
There are five numbers output per iteration:
the infinity norm of the Kuhn-Tucker conditions,
the one-norm of the Kuhn-Tucker conditions, the duality gap,
\mu
, and the number of line search iterations.
Note that ckbs_L1_affine
has satisfied the convergence condition if
info(end, 2) <= epsilon
8.u: Example
ckbs_L1_affine
.
It returns true if ckbs_L1_affine
passes the test
and false otherwise.
[ok] = L1_affine_ok(draw_plot)
If
ok
is true, the test passed;
otherwise the test failed.
x1 (t)
| derivative of function we are estimating |
x2 (t)
| value of function we are estimating |
z1 (t)
value of
x2 (t)
plus noise
function [ok] = L1_affine_ok(draw_plot)
% --------------------------------------------------------
ok = true;
% You can change these parameters
conVar = 100; % variance of contaminating distribution
conFreq = .3; % frequency of outliers
noise = 'normal'; % type of noise
N = 30; % number of measurement time points
dt = 4*pi / N; % time between measurement points
%dt = 1;
gamma = 1; % transition covariance multiplier
sigma = 0.5; % standard deviation of measurement noise
if (strcmp(noise,'student'))
sigma = sqrt(2);
end
max_itr = 30; % maximum number of iterations
epsilon = 1e-5; % convergence criteria
h_min = 0; % minimum horizontal value in plots
h_max = 14; % maximum horizontal value in plots
v_min = -5.0; % minimum vertical value in plots
v_max = 5.0; % maximum vertical value in plots
% ---------------------------------------------------------
if nargin < 1
draw_plot = false;
end
% Define the problem
%rand('seed', 12);
%
% number of constraints per time point
% ell = 0;
%
% number of measurements per time point
m = 1;
%
% number of state vector components per time point
n = 2;
%
% simulate the true trajectory and measurement noise
t = (1 : N) * dt;
x1_true = - cos(t);
x2_true = - sin(t);
x_true = [ x1_true ; x2_true ];
% measurement values and model
%exp_true = random('exponential', 1/sigma, 1, N);
%bin = 2*random('binomial', 1, 0.5, 1, N) - 1;
%exp_true = exp_true .* bin;
%z = x2_true + exp_true;
% Normal Noise
v_true = zeros(1,N);
temp = zeros(1,N);
bin = zeros(1,N);
randn('state', sum(100*clock));
rand('twister', sum(100*clock));
if (strcmp(noise, 'normal'))
v_true = sigma * randn(1, N);
end
%Laplace Noise
if (strcmp(noise,'laplace'))
temp = sigma*exprnd(1/sigma, 1, N);
bin = 2*(binornd(1, 0.5, 1, N)-0.5);
v_true = temp.*bin/sqrt(2);
end
if (strcmp(noise,'student'))
v_true = trnd(4, 1, N);
end
temp = rand(1, N);
nI = temp < conFreq;
newNoise = sqrt(conVar)*randn(1,N);
z = x2_true + v_true + nI.*newNoise;
rk = sigma * sigma;
rinvk = 1 / rk;
rinv = zeros(m, m, N);
h = zeros(m, N);
% Covariance between process and measurement?
dh = zeros(m, n, N);
for k = 1 : N
rinv(:, :, k) = rinvk;
h(:, k) = 0;
dh(:,:, k) = [ 0 , 1 ];
end
%
% transition model
g = zeros(n, N);
dg = zeros(n, n, N);
qinv = zeros(n, n, N);
qk = gamma * [ dt , dt^2/2 ; dt^2/2 , dt^3/3 ];
qinvk = inv(qk);
for k = 2 : N
g(:, k) = 0;
dg(:,:, k) = [ 1 , 0 ; dt , 1 ];
qinv(:,:, k) = qinvk;
end
%
% initial state estimate
g(:, 1) = x_true(:, 1);
qinv(:,:, 1) = 100 * eye(2);
%
% constraints
b = zeros(0, N);
db = zeros(0, n, N);
% -------------------------------------------------------------------------
% Linear unconstrained Kalman-Bucy Smoother
[xOut, uOut, info] = ckbs_affine(max_itr, epsilon, z, b, g, h, db, dg, dh, qinv, rinv);
% -------------------------------------------------------------------------
% L1 Kalman-Bucy Smoother
[xOut2, rOut, sOut, pPlusOut, pMinusOut, info] = ckbs_L1_affine(max_itr, epsilon, z, g, h, dg, dh, qinv, rinv);
ok = (ok) && (info(end, 2) < epsilon);
% % --------------------------------------------------------------------------
%
if draw_plot
figure(1);
clf
hold on
plot(t', x_true(2,:)', '-k', 'Linewidth', 2, 'MarkerEdgeColor', 'k' );
meas = z(1,:);
meas(meas > v_max) = v_max;
meas(meas < v_min) = v_min;
plot(t', meas, 'bd', 'MarkerFaceColor', [.49 1 .63], 'MarkerSize', 6);
plot(t', xOut(2,:)', '-m', 'Linewidth', 2, 'MarkerEdgeColor', 'k' );
plot(t', xOut2(2,:)', '-r', 'Linewidth', 2, 'MarkerEdgeColor', 'k');
axis([h_min, h_max, v_min, v_max]);
title('Gaussian and L1 smoothers. Black=truth, Blue = meas, Magenta = Gaussian, Red = L1.');
hold off
%
% constrained estimate
end
end
ckbs_t_obj: 9.1 | Student's t Sum of Squares Objective |
ckbs_t_grad: 9.2 | Student's t Gradient |
ckbs_t_hess: 9.3 | Student's t Hessian |
ckbs_diag_solve: 9.4 | Block Diagonal Algorithm |
ckbs_bidiag_solve: 9.5 | Block Bidiagonal Algorithm |
ckbs_bidiag_solve_t: 9.6 | Block Bidiagonal Algorithm |
ckbs_blkbidiag_symm_mul: 9.7 | Packed Block Bidiagonal Matrix Symmetric Multiply |
ckbs_blkdiag_mul: 9.8 | Packed Block Diagonal Matrix Times a Vector or Matrix |
ckbs_blkdiag_mul_t: 9.9 | Transpose of Packed Block Diagonal Matrix Times a Vector or Matrix |
ckbs_blkbidiag_mul_t: 9.10 | Packed Lower Block Bidiagonal Matrix Transpose Times a Vector |
ckbs_blkbidiag_mul: 9.11 | Packed Lower Block Bidiagonal Matrix Times a Vector |
ckbs_blktridiag_mul: 9.12 | Packed Block Tridiagonal Matrix Times a Vector |
ckbs_sumsq_obj: 9.13 | Affine Residual Sum of Squares Objective |
ckbs_L2L1_obj: 9.14 | Affine Least Squares with Robust L1 Objective |
ckbs_sumsq_grad: 9.15 | Affine Residual Sum of Squares Gradient |
ckbs_process_grad: 9.16 | Affine Residual Process Sum of Squares Gradient |
ckbs_sumsq_hes: 9.17 | Affine Residual Sum of Squares Hessian |
ckbs_process_hes: 9.18 | Affine Process Residual Sum of Squares Hessian |
ckbs_tridiag_solve: 9.19 | Symmetric Block Tridiagonal Algorithm |
ckbs_tridiag_solve_b: 9.20 | Symmetric Block Tridiagonal Algorithm (Backward version) |
ckbs_tridiag_solve_pcg: 9.21 | Symmetric Block Tridiagonal Algorithm (Conjugate Gradient version) |
ckbs_newton_step: 9.22 | Affine Constrained Kalman Bucy Smoother Newton Step |
ckbs_newton_step_L1: 9.23 | Affine Robust L1 Kalman Bucy Smoother Newton Step |
ckbs_kuhn_tucker: 9.24 | Compute Residual in Kuhn-Tucker Conditions |
ckbs_kuhn_tucker_L1: 9.25 | Compute Residual in Kuhn-Tucker Conditions for Robust L1 |
[obj] = ckbs_t_obj(
x, z, g_fun, h_fun, qinv, rinv, params)
\[
\begin{array}{rcl}
S ( x_1 , \ldots , x_N ) & = & \sum_{k=1}^N S_k ( x_k , x_{k-1} ) \\
S_k ( x_k , x_{k-1} ) & = &
\frac{1}{2}
( z_k - h_k - H_k * x_k )^\R{T} * R_k^{-1} * ( z_k - h_k - H_k * x_k )
\\
& + &
\frac{1}{2}
( x_k - g_k - G_k * x_{k-1} )^\R{T} * Q_k^{-1} * ( x_k - g_k - G_k * x_{k-1} )
\\
\end{array}
\]
where the matrices
R_k
and
Q_k
are
symmetric positive definite and
x_0
is the constant zero.
x
is a two dimensional array,
for
k = 1 , \ldots , N
\[
x_k = x(:, k)
\]
and
x
has size
n \times N
.
z
is a two dimensional array,
for
k = 1 , \ldots , N
\[
z_k = z(:, k)
\]
and
z
has size
m \times N
.
g_fun
is a function handle to the process model.
h_fun
is a function handle to the measurement
model.
qinv
is a three dimensional array,
for
k = 1 , \ldots , N
\[
Q_k^{-1} = qinv(:,:, k)
\]
and
qinv
has size
n \times n \times N
.
rinv
is a three dimensional array,
for
k = 1 , \ldots , N
\[
R_k^{-1} = rinv(:,:, k)
\]
and
rinv
has size
m \times m \times N
.
obj
is a scalar given by
\[
obj = S ( x_1 , \ldots , x_N )
\]
ckbs_t_obj
.
It returns true if ckbs_t_obj
passes the test
and false otherwise.
function [ok] = t_obj_ok()
ok = true;
% --------------------------------------------------------
% You can change these parameters
m = 1; % number of measurements per time point
n = 2; % number of state vector components per time point
N = 3; % number of time points
% ---------------------------------------------------------
% Define the problem
rand('seed', 123);
x = rand(n, N);
z = rand(m, N);
qinv = zeros(n, n, N);
rinv = zeros(m, m, N);
for k = 1 : N
tmp = rand(m, m);
rinv(:, :, k) = (tmp + tmp') / 2 + 2 * eye(m);
tmp = rand(n, n);
qinv(:, :, k) = (tmp + tmp') / 2 + 2 * eye(n);
end
params.direct_h_index = 1;
h_fun = @(k,x) direct_h(k,x,params);
params.pos_vel_g_dt = .05;
params.pos_vel_g_initial = zeros(n,1);
g_fun = @(k,x) pos_vel_g(k,x,params);
df_meas = 4;
df_proc = 4;
params.df_proc = df_proc;
params.df_meas = df_meas;
params.inds_proc_st = [];
params.inds_meas_st = 1;
% ---------------------------------------------------------
% Compute the Objective using ckbs_t_obj
obj = ckbs_t_obj(x, z, g_fun, h_fun, qinv, rinv, params);
% ---------------------------------------------------------
res_z = zeros(m, N);
for k = 1 : N
xk = x(:, k);
res_z(:,k) = z(:,k) - feval(h_fun, k, xk);
end
res_x = zeros(n, N);
xk = zeros(n, 1);
for k = 1 : N
xkm = xk;
xk = x(:, k);
gk = feval(g_fun, k, xkm);
res_x(:,k) = xk - gk;
end
obj2 = 0;
for k = 1 : N
qinv_k = qinv(:, :, k);
rinv_k = rinv(:, :, k);
res_x_k = res_x(:,k);
res_z_k = res_z(:,k);
obj2 = obj2 + 0.5*(df_meas)*log(1 + (1/df_meas)*res_z_k' * rinv_k * res_z_k) + ...
0.5*res_x_k' * qinv_k * res_x_k;
end
ok = ok & ( abs(obj - obj2) < 1e-10 );
return
end
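The check above computes each Student's t measurement term as (df/2) * log(1 + (1/df) * res' * Rinv * res). The following scalar Python sketch compares this term with the Gaussian quadratic term and shows why it is robust: it grows much more slowly for outlier residuals (helper names are invented here):

```python
import math

def t_term(res, rinv, df):
    # (df/2) * log(1 + (1/df) * res * rinv * res), scalar case
    return 0.5 * df * math.log(1.0 + (1.0 / df) * rinv * res * res)

def gauss_term(res, rinv):
    # (1/2) * res * rinv * res, scalar case
    return 0.5 * rinv * res * res

# for an outlier residual of 10 standard deviations the t term is
# far smaller than the Gaussian term
t_val = t_term(10.0, 1.0, 4.0)
g_val = gauss_term(10.0, 1.0)
```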
[grad] = ckbs_t_grad(
x, z, g_fun, h_fun, qinv, rinv, params)
\[
\begin{array}{rcl}
S ( x_1 , \ldots , x_N ) & = & \sum_{k=1}^N S_k ( x_k , x_{k-1} ) \\
S_k ( x_k , x_{k-1} ) & = &
\frac{1}{2}
( z_k - h_k - H_k * x_k )^\R{T} * R_k^{-1} * ( z_k - h_k - H_k * x_k )
\\
& + &
\frac{1}{2}
( x_k - g_k - G_k * x_{k-1} )^\R{T} * Q_k^{-1} * ( x_k - g_k - G_k * x_{k-1} )
\end{array}
\]
where the matrices
R_k
and
Q_k
are
symmetric positive definite and
x_0
is the constant zero.
Defining
Q_{N+1}
to be the
n \times n
identity
matrix and
G_{N+1}
to be zero, we have
\[
\begin{array}{rcl}
\nabla_k S_k^{(1)} ( x_k , x_{k-1} )
& = & H_k^\R{T} * R_k^{-1} * ( h_k + H_k * x_k - z_k )
+ Q_k^{-1} * ( x_k - g_k - G_k * x_{k-1} )
\\
\nabla_k S_{k+1}^{(1)} ( x_{k+1} , x_k )
& = & G_{k+1}^\R{T} * Q_{k+1}^{-1} * ( g_{k+1} + G_{k+1} * x_k - x_{k+1} )
\end{array}
\]
It follows that the gradient of the
affine Kalman-Bucy smoother residual sum of squares is
\[
\begin{array}{rcl}
\nabla S ( x_1 , \ldots , x_N )
& = &
\left( \begin{array}{c}
d_1 \\ \vdots \\ d_N
\end{array} \right)
\\
d_k & = & \nabla_k S_k^{(1)} ( x_k , x_{k-1} )
+ \nabla_k S_{k+1}^{(1)} ( x_{k+1} , x_k )
\end{array}
\]
where
S_{N+1} ( x_{N+1} , x_N )
is defined as
identically zero.
x
is a two dimensional array,
for
k = 1 , \ldots , N
\[
x_k = x(:, k)
\]
and
x
has size
n \times N
.
z
is a two dimensional array,
for
k = 1 , \ldots , N
\[
z_k = z(:, k)
\]
and
z
has size
m \times N
.
g_fun
is a function handle for the
process model.
h_fun
is a function handle for the
measurement model.
qinv
is a three dimensional array,
for
k = 1 , \ldots , N
\[
Q_k^{-1} = qinv(:,:,k)
\]
and
qinv
has size
n \times n \times N
.
rinv
is a three dimensional array,
for
k = 1 , \ldots , N
\[
R_k^{-1} = rinv(:,:,k)
\]
and
rinv
has size
m \times m \times N
.
params
is a structure containing the
requisite parameters.
grad
is a two dimensional array,
for
k = 1 , \ldots , N
\[
d_k = grad(:, k)
\]
and
grad
has size
n \times N
.
ckbs_t_grad
.
It returns true if ckbs_t_grad
passes the test
and false otherwise.
function [ok] = t_grad_ok()
ok = true;
% --------------------------------------------------------
% You can change these parameters
m = 1; % number of measurements per time point
n = 2; % number of state vector components per time point
N = 1; % number of time points
% ---------------------------------------------------------
% Define the problem
rand('seed', 123);
x = rand(n, N);
z = rand(m, N);
qinv = zeros(n, n, N);
rinv = zeros(m, m, N);
for k = 1 : N
tmp = rand(m, m);
rinv(:, :, k) = (tmp + tmp') / 2 + 2 * eye(m);
tmp = rand(n, n);
qinv(:, :, k) = (tmp + tmp') / 2 + 2 * eye(n);
end
params.direct_h_index = 1;
h_fun = @(k,x) direct_h(k,x,params);
params.pos_vel_g_dt = .01;
params.pos_vel_g_initial = zeros(n,1);
g_fun = @(k,x) pos_vel_g(k,x,params);
df_meas = 4;
df_proc = 4;
params.df_proc = df_proc;
params.df_meas = df_meas;
params.inds_proc_st = 1;
params.inds_meas_st = [];
% ---------------------------------------------------------
% Compute the gradient using ckbs_t_grad
g1 = ckbs_t_grad(x, z, g_fun, h_fun, qinv, rinv, params);
f1 = ckbs_t_obj(x, z, g_fun, h_fun, qinv, rinv, params);
eps = 1e-5;
diff = eps*randn(size(x));
x2 = x + diff;
g2 = ckbs_t_grad(x2, z, g_fun, h_fun, qinv, rinv, params);
f2 = ckbs_t_obj(x2, z, g_fun, h_fun, qinv, rinv, params);
temp = 0.5 *(g1(:) + g2(:))'*(diff(:))/(f2 - f1);
y = abs(1 - temp);
ok = ok & ( y < 1e-8 );
return
end
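The check above rests on the trapezoid identity S(x2) - S(x1) ~ (1/2) (grad S(x1) + grad S(x2))^T (x2 - x1), which holds exactly when S is quadratic and to high order for small steps otherwise. A minimal NumPy sketch of the same check on a hypothetical quadratic objective (illustrative only, not part of ckbs):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
Q = A @ A.T + n * np.eye(n)        # symmetric positive definite
b = rng.standard_normal(n)

def S(x):                           # quadratic objective
    return 0.5 * x @ Q @ x - b @ x

def grad(x):                        # exact gradient of S
    return Q @ x - b

x1 = rng.standard_normal(n)
d = 1e-5 * rng.standard_normal(n)
x2 = x1 + d

# average-gradient (trapezoid) identity, exact for a quadratic
temp = 0.5 * (grad(x1) + grad(x2)) @ d / (S(x2) - S(x1))
assert abs(1 - temp) < 1e-6
```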
[D, A] = ckbs_t_hess(
x, z, g_fun, h_fun, qinv, rinv, params)
\[
\begin{array}{rcl}
S ( x_1 , \ldots , x_N ) & = & \sum_{k=1}^N S_k ( x_k , x_{k-1} ) \\
S_k ( x_k , x_{k-1} ) & = &
\frac{1}{2}
( z_k - h_k - H_k * x_k )^\R{T} * R_k^{-1} * ( z_k - h_k - H_k * x_k )
\\
& + &
\frac{1}{2}
( x_k - g_k - G_k * x_{k-1} )^\R{T} * Q_k^{-1} * ( x_k - g_k - G_k * x_{k-1} )
\end{array}
\]
where the matrices
R_k
and
Q_k
are
symmetric positive definite and
x_0
is the constant zero.
Defining
Q_{N+1}
to be the
n \times n
identity
matrix and
G_{N+1}
to be zero, we have
\[
\begin{array}{rcl}
\nabla_k S_k^{(1)} ( x_k , x_{k-1} )
& = & H_k^\R{T} * R_k^{-1} * ( h_k + H_k * x_k - z_k )
+ Q_k^{-1} * ( x_k - g_k - G_k * x_{k-1} )
\\
\nabla_k S_{k+1}^{(1)} ( x_{k+1} , x_k )
& = & G_{k+1}^\R{T} * Q_{k+1}^{-1} * ( g_{k+1} + G_{k+1} * x_k - x_{k+1} )
\end{array}
\]
It follows that the gradient of the
affine Kalman-Bucy smoother residual sum of squares is
\[
\begin{array}{rcl}
\nabla S ( x_1 , \ldots , x_N )
& = &
\left( \begin{array}{c}
d_1 \\ \vdots \\ d_N
\end{array} \right)
\\
d_k & = & \nabla_k S_k^{(1)} ( x_k , x_{k-1} )
+ \nabla_k S_{k+1}^{(1)} ( x_{k+1} , x_k )
\end{array}
\]
where
S_{N+1} ( x_{N+1} , x_N )
is defined as
identically zero.
x
is a two dimensional array,
for
k = 1 , \ldots , N
\[
x_k = x(:, k)
\]
and
x
has size
n \times N
.
z
is a two dimensional array,
for
k = 1 , \ldots , N
\[
z_k = z(:, k)
\]
and
z
has size
m \times N
.
g_fun
is a function handle for the
process model.
h_fun
is a function handle for the
measurement model.
qinv
is a three dimensional array,
for
k = 1 , \ldots , N
\[
Q_k^{-1} = qinv(:,:,k)
\]
and
qinv
has size
n \times n \times N
.
rinv
is a three dimensional array,
for
k = 1 , \ldots , N
\[
R_k^{-1} = rinv(:,:,k)
\]
and
rinv
has size
m \times m \times N
.
params
is a structure containing the
requisite parameters.
D
is a three dimensional array,
for
k = 1 , \ldots , N
\[
D_k = D(:, :, k)
\]
and
D
has size
n \times n \times N
.
A
is a three dimensional array,
for
k = 1 , \ldots , N
\[
A_k = A(:,:,k)
\]
and
A
has size
n \times n \times N
.
ckbs_t_hess
.
It returns true if ckbs_t_hess
passes the test
and false otherwise.
function [ok] = t_hess_ok()
ok = true;
% --------------------------------------------------------
% You can change these parameters
m = 1; % number of measurements per time point
n = 2; % number of state vector components per time point
N = 1; % number of time points
% ---------------------------------------------------------
% Define the problem
rand('seed', 123);
x = rand(n, N);
z = rand(m, N);
qinv = zeros(n, n, N);
rinv = zeros(m, m, N);
for k = 1 : N
tmp = rand(m, m);
rinv(:, :, k) = (tmp + tmp') / 2 + 2 * eye(m);
tmp = rand(n, n);
qinv(:, :, k) = (tmp + tmp') / 2 + 2 * eye(n);
end
params.direct_h_index = 1;
h_fun = @(k,x) direct_h(k,x,params);
params.pos_vel_g_dt = .05;
params.pos_vel_g_initial = zeros(n,1);
g_fun = @(k,x) pos_vel_g(k,x,params);
df_meas = 4;
df_proc = 4;
params.df_proc = df_proc;
params.df_meas = df_meas;
params.inds_proc_st = 2;
params.inds_meas_st = 1;
% ---------------------------------------------------------
% Compute the Hessian using ckbs_t_hess
[D, A] = ckbs_t_hess(x, z, g_fun, h_fun, qinv, rinv, params);
% ---------------------------------------------------------
H = ckbs_blktridiag_mul(D, A, eye(n*N));
%
% Use finite differences to check the Hessian at the same x and z
% (the Student's t Hessian depends on x, so x must not be redrawn here)
%
step = 1e-5;
for k = 1 : N
for i = 1 : n
% Check second partial w.r.t x(i, k)
xm = x;
xm(i, k) = xm(i, k) - step;
gradm = ckbs_t_grad(xm, z, g_fun, h_fun, qinv, rinv, params);
%
xp = x;
xp(i, k) = xp(i, k) + step;
gradp = ckbs_t_grad(xp, z, g_fun, h_fun, qinv, rinv, params);
%
check = (gradp - gradm) / ( 2 * step );
for k1 = 1 : N
for i1 = 1 : n
value = H(i + (k-1)*n, i1 + (k1-1)*n);
diff = check(i1, k1) - value;
ok = ok & ( abs(diff) < 1e-10 );
end
end
end
end
return
end
[e, lambda] = ckbs_diag_solve(b, r)
solves the following block diagonal linear system for
e
:
\[
A * e = r
\]
where the diagonal matrix
A
is defined by
\[
A =
\left( \begin{array}{ccccc}
b_1 & 0 & 0 & \cdots \\
0 & b_2 & 0 & \\
\vdots & & \ddots & \\
0 & \cdots & 0 & b_N
\end{array} \right)
\]
b
is a three dimensional array,
for
k = 1 , \ldots , N
\[
b_k = b(:,:,k)
\]
and
b
has size
n \times n \times N
.
r
is an
(n * N) \times m
matrix.
e
is an
(n * N) \times m
matrix.
lambda
is a scalar equal to the
logarithm of the determinant of
A
.
ckbs_diag_solve
.
It returns true if ckbs_diag_solve
passes the test
and false otherwise.
function [ok] = diag_solve_ok()
ok = true;
m = 2;
n = 3;
N = 4;
% each diagonal block b_k is symmetric positive definite, built from a random a_k
rand('seed', 123);
a = rand(n, n, N);
r = rand(n * N, m);
for k = 1 : N
ak = a(:, :, k);
b(:, :, k) = 2 * eye(n) + ak * ak';
end
% ----------------------------------------
[e, lambda] = ckbs_diag_solve(b, r);
% ----------------------------------------
check = ckbs_blkdiag_mul(b, e);
ok = ok & max(max(abs(check - r))) < 1e-10;
return
end
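The block-diagonal solve and its log-determinant can be sketched in NumPy as follows (a minimal illustration mirroring the interface above, not the ckbs implementation; the blocks are assumed positive definite so the determinant is positive):

```python
import numpy as np

def diag_solve(b, r):
    """Solve A e = r where A = blkdiag(b[0], ..., b[N-1]), and return
    lambda = log det A (blocks assumed symmetric positive definite)."""
    N, n, _ = b.shape
    e = np.empty_like(r, dtype=float)
    logdet = 0.0
    for k in range(N):
        rows = slice(k * n, (k + 1) * n)
        e[rows] = np.linalg.solve(b[k], r[rows])
        logdet += np.linalg.slogdet(b[k])[1]
    return e, logdet

# verify against the assembled dense matrix
rng = np.random.default_rng(123)
n, m, N = 3, 2, 4
a = rng.standard_normal((N, n, n))
b = 2 * np.eye(n) + a @ a.transpose(0, 2, 1)   # SPD diagonal blocks
r = rng.standard_normal((n * N, m))
e, logdet = diag_solve(b, r)
A = np.zeros((n * N, n * N))
for k in range(N):
    A[k * n:(k + 1) * n, k * n:(k + 1) * n] = b[k]
assert np.allclose(A @ e, r)
assert abs(logdet - np.linalg.slogdet(A)[1]) < 1e-8
```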
[e, lambda] = ckbs_bidiag_solve(b, c, r)
solves the following block bidiagonal linear system for
e
:
\[
A * e = r
\]
where the bidiagonal matrix
A
is defined by
\[
A =
\left( \begin{array}{ccccc}
b_1 & 0 & 0 & \cdots \\
c_2 & b_2 & 0 & \\
\vdots & & \ddots & \\
0 & \cdots & c_N & b_N
\end{array} \right)
\]
b
is a three dimensional array,
for
k = 1 , \ldots , N
\[
b_k = b(:,:,k)
\]
and
b
has size
n \times n \times N
.
c
is a three dimensional array,
for
k = 2 , \ldots , N
\[
c_k = c(:,:,k)
\]
and
c
has size
n \times n \times N
.
r
is an
(n * N) \times m
matrix.
e
is an
(n * N) \times m
matrix.
lambda
is a scalar equal to the
logarithm of the determinant of
A
.
ckbs_bidiag_solve
.
It returns true if ckbs_bidiag_solve
passes the test
and false otherwise.
function [ok] = bidiag_solve_ok()
ok = true;
m = 2;
n = 3;
N = 4;
% each diagonal block b_k is symmetric positive definite, built from a random a_k
rand('seed', 123);
a = rand(n, n, N);
r = rand(n * N, m);
c = zeros(n, n, N);
for k = 1 : N
ak = a(:, :, k);
b(:, :, k) = 2 * eye(n) + ak * ak';
c(:, :, k) = ak';
end
% ----------------------------------------
[e, lambda] = ckbs_bidiag_solve(b, c, r);
% ----------------------------------------
check = ckbs_blkbidiag_mul(b, c, e);
ok = ok & max(max(abs(check - r))) < 1e-10;
return
end
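Because A is lower block bidiagonal, the system can be solved by block forward substitution; a minimal NumPy sketch (illustrative, not the ckbs implementation):

```python
import numpy as np

def bidiag_solve(b, c, r):
    """Solve A e = r by block forward substitution, where A is lower
    block bidiagonal with diagonal blocks b[k] and subdiagonal blocks
    c[k] (c[0] is unused, matching the k = 2, ..., N convention)."""
    N, n, _ = b.shape
    m = r.shape[1]
    e = np.zeros((n * N, m))
    for k in range(N):
        rows = slice(k * n, (k + 1) * n)
        rhs = r[rows].copy()
        if k > 0:
            rhs -= c[k] @ e[(k - 1) * n:k * n]
        e[rows] = np.linalg.solve(b[k], rhs)
    return e

# verify against the assembled dense matrix
rng = np.random.default_rng(123)
n, m, N = 3, 2, 4
a = rng.standard_normal((N, n, n))
b = 2 * np.eye(n) + a @ a.transpose(0, 2, 1)   # SPD diagonal blocks
c = a.transpose(0, 2, 1)
r = rng.standard_normal((n * N, m))
e = bidiag_solve(b, c, r)
A = np.zeros((n * N, n * N))
for k in range(N):
    A[k * n:(k + 1) * n, k * n:(k + 1) * n] = b[k]
    if k > 0:
        A[k * n:(k + 1) * n, (k - 1) * n:k * n] = c[k]
assert np.allclose(A @ e, r)
```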
[e, lambda] = ckbs_bidiag_solve_t(b, c, r)
solves the following transposed block bidiagonal linear system for
e
:
\[
A^T * e = r
\]
where the bidiagonal matrix
A
is defined by
\[
A =
\left( \begin{array}{ccccc}
b_1 & 0 & 0 & \cdots \\
c_2 & b_2 & 0 & \\
\vdots & & \ddots & \\
0 & \cdots & c_N & b_N
\end{array} \right)
\]
b
is a three dimensional array,
for
k = 1 , \ldots , N
\[
b_k = b(:,:,k)
\]
and
b
has size
n \times n \times N
.
c
is a three dimensional array,
for
k = 2 , \ldots , N
\[
c_k = c(:,:,k)
\]
and
c
has size
n \times n \times N
.
r
is an
(n * N) \times m
matrix.
e
is an
(n * N) \times m
matrix.
lambda
is a scalar equal to the
logarithm of the determinant of
A
.
ckbs_bidiag_solve_t
.
It returns true if ckbs_bidiag_solve_t
passes the test
and false otherwise.
function [ok] = bidiag_solve_t_ok()
ok = true;
m = 2;
n = 3;
N = 4;
% each diagonal block b_k is symmetric positive definite, built from a random a_k
rand('seed', 123);
a = rand(n, n, N);
r = rand(n * N, m);
c = zeros(n, n, N);
for k = 1 : N
ak = a(:, :, k);
b(:, :, k) = 2 * eye(n) + ak * ak';
c(:, :, k) = ak';
end
% ----------------------------------------
[e, lambda] = ckbs_bidiag_solve_t(b, c, r);
% ----------------------------------------
check = ckbs_blkbidiag_mul_t(b, c, e);
ok = ok & max(max(abs(check - r))) < 1e-10;
return
end
[r, s] = ckbs_blkbidiag_symm_mul(Bdia, Boffdia, Ddia)
\[
W = B * D * B^T
\]
The actual output consists of a packed representation of the
diagonal and off-diagonal blocks of the matrix
W
.
Bdia
is an
m \times n \times N
array.
For
k = 1 , \ldots , N
we define
B_k \in \B{R}^{m \times n}
by
\[
B_k = Bdia(:, :, k)
\]
Boffdia
is an
m \times n \times N
array.
For
k = 2 , \ldots , N
we define
C_k \in \B{R}^{m \times n}
by
\[
C_k = Boffdia(:, :, k)
\]
Ddia
is an
n \times n \times N
array.
For
k = 1 , \ldots , N
we define
D_k \in \B{R}^{n \times n}
by
\[
D_k = Ddia(:, :, k)
\]
B
is defined by
\[
B
=
\left( \begin{array}{cccc}
B_1 & 0 & 0 & \\
C_2^T & B_2 & \ddots & 0 \\
0 & \ddots & \ddots & 0 \\
& 0 & C_N^T & B_N
\end{array} \right)
\]
D
is defined by
\[
D
=
\left( \begin{array}{cccc}
D_1 & 0 & 0 & \\
0 & D_2 & \ddots & 0 \\
0 & \ddots & \ddots & 0 \\
& 0 & 0 & D_N
\end{array} \right)
\]
r
is an
n \times n \times N
array.
For
k = 1 , \ldots , N
we define
r_k \in \B{R}^{n \times n}
by
\[
r_k = r(:, :, k)
\]
s
is an
m \times n \times N
array.
For
k = 2 , \ldots , N
we define
s_k \in \B{R}^{m \times n}
by
\[
s_k = s(:, :, k)
\]
ckbs_blkbidiag_symm_mul
.
It returns true if ckbs_blkbidiag_symm_mul
passes the test
and false otherwise.
function [ok] = blkbidiag_symm_mul_ok()
ok = true;
% -------------------------------------------------------------
% You can change these parameters
m = 2;
n = 3;
N = 4;
% -------------------------------------------------------------
% Define the problem
rand('seed', 123);
Hdia = rand(n, n, N);
Hlow = rand(n, n, N);
Ddia = zeros(n, n, N);
for k = 1:N
dk = rand(n, n);
Ddia(:,:,k) = dk'*dk;
end
Hfull = ckbs_blkbidiag_mul(Hdia, Hlow, eye(n*N));
Dfull = ckbs_blkdiag_mul(Ddia, eye(n*N));
HDHfull = Hfull * Dfull * Hfull';
[r, s] = ckbs_blkbidiag_symm_mul(Hdia, Hlow, Ddia);
HDH = ckbs_blktridiag_mul(r, s, eye(n*N));
check = HDH - HDHfull;
ok = ok & ( max(max(abs(check))) < 1e-10 );
return
end
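Expanding W = B D B^T block by block gives r_k = B_k D_k B_k^T + C_k^T D_{k-1} C_k and s_k = C_k^T D_{k-1} B_{k-1}^T. A minimal NumPy sketch, under the convention that C_k^T sits on the subdiagonal of B (illustrative, not the ckbs implementation):

```python
import numpy as np

def blkbidiag_symm_mul(Bdia, Boffdia, Ddia):
    """Packed diagonal blocks r and subdiagonal blocks s of W = B D B^T,
    where B is lower block bidiagonal (B_k on the diagonal, C_k^T on the
    subdiagonal) and D is block diagonal; Boffdia[0] is unused."""
    N, n, _ = Bdia.shape
    r = np.zeros((N, n, n))
    s = np.zeros((N, n, n))
    for k in range(N):
        r[k] = Bdia[k] @ Ddia[k] @ Bdia[k].T
        if k > 0:
            Ck = Boffdia[k]
            r[k] += Ck.T @ Ddia[k - 1] @ Ck
            s[k] = Ck.T @ Ddia[k - 1] @ Bdia[k - 1].T
    return r, s

# verify against the dense product
rng = np.random.default_rng(123)
n, N = 3, 4
Bdia = rng.standard_normal((N, n, n))
Boffdia = rng.standard_normal((N, n, n))
d = rng.standard_normal((N, n, n))
Ddia = d @ d.transpose(0, 2, 1) + np.eye(n)
r, s = blkbidiag_symm_mul(Bdia, Boffdia, Ddia)
B = np.zeros((n * N, n * N))
D = np.zeros((n * N, n * N))
W = np.zeros((n * N, n * N))
for k in range(N):
    i = slice(k * n, (k + 1) * n)
    B[i, i] = Bdia[k]
    D[i, i] = Ddia[k]
    W[i, i] = r[k]
    if k > 0:
        j = slice((k - 1) * n, k * n)
        B[i, j] = Boffdia[k].T
        W[i, j] = s[k]
        W[j, i] = s[k].T
assert np.allclose(W, B @ D @ B.T)
```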
[w] = ckbs_blkdiag_mul(Bdia, v)
\[
W = B * V
\]
Bdia
is an
m \times n \times N
array.
For
k = 1 , \ldots , N
we define
B_k \in \B{R}^{m \times n}
by
\[
B_k = Bdia(:, :, k)
\]
B
is defined by
\[
B
=
\left( \begin{array}{cccc}
B_1 & 0 & 0 & \\
0 & B_2 & 0 & 0 \\
0 & 0 & \ddots & 0 \\
& 0 & 0 & B_N
\end{array} \right)
\]
V
is a matrix of size
n * N \times p
, with no restrictions on
p
.
W
is a matrix of size
m * N \times p
.
ckbs_blkdiag_mul
.
It returns true if ckbs_blkdiag_mul
passes the test
and false otherwise.
function [ok] = blkdiag_mul_ok()
ok = true;
% -------------------------------------------------------------
% You can change these parameters
m = 2;
n = 3;
N = 2;
p = 3; % number of columns in v
% -------------------------------------------------------------
% Define the problem
rand('seed', 123);
v = rand(n * N, p);
Bdiag = zeros(m, n, N);
B = zeros(m * N , n * N);
blk_m = 1 : m;
blk_n = 1 : n;
for k = 1 : N
Bdiag(:, :, k) = rand(m, n);
B(blk_m, blk_n) = Bdiag(:, :, k);
blk_m = blk_m + m;
blk_n = blk_n + n;
end
% -------------------------------------
w = ckbs_blkdiag_mul(Bdiag, v);
% -------------------------------------
check = B * v;
ok = ok & ( max(max(abs(w - check))) < 1e-10 );
return
end
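Because B is block diagonal, B * v can be computed one block at a time without assembling B; a NumPy sketch using a batched matrix product (illustrative, not the ckbs implementation):

```python
import numpy as np

def blkdiag_mul(Bdia, v):
    """Compute B @ v without assembling the block diagonal matrix B.
    Bdia: (N, m, n) diagonal blocks, v: (n*N, p)  ->  (m*N, p)."""
    N, m, n = Bdia.shape
    p = v.shape[1]
    vb = v.reshape(N, n, p)        # split the rows of v into blocks
    wb = Bdia @ vb                 # batched (m, n) @ (n, p) products
    return wb.reshape(m * N, p)

# verify against the assembled dense matrix
rng = np.random.default_rng(123)
m, n, N, p = 2, 3, 2, 3
Bdia = rng.standard_normal((N, m, n))
v = rng.standard_normal((n * N, p))
w = blkdiag_mul(Bdia, v)
B = np.zeros((m * N, n * N))
for k in range(N):
    B[k * m:(k + 1) * m, k * n:(k + 1) * n] = Bdia[k]
assert np.allclose(w, B @ v)
```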
[w] = ckbs_blkdiag_mul_t(Bdia, v)
\[
W = B^\R{T} * V
\]
Bdia
is an
m \times n \times N
array.
For
k = 1 , \ldots , N
we define
B_k \in \B{R}^{m \times n}
by
\[
B_k = Bdia(:, :, k)
\]
B
is defined by
\[
B
=
\left( \begin{array}{cccc}
B_1 & 0 & 0 & \\
0 & B_2 & 0 & 0 \\
0 & 0 & \ddots & 0 \\
& 0 & 0 & B_N
\end{array} \right)
\]
V
is a matrix of size
m * N \times p
, with no restrictions on
p
.
W
is a matrix of size
n * N \times p
.
ckbs_blkdiag_mul_t
.
It returns true if ckbs_blkdiag_mul_t
passes the test
and false otherwise.
function [ok] = blkdiag_mul_t_ok()
ok = true;
% -------------------------------------------------------------
% You can change these parameters
m = 2;
n = 3;
N = 2;
p = 5;
% -------------------------------------------------------------
% Define the problem
rand('seed', 123);
v = rand(m * N, p);
Bdiag = zeros(m, n, N);
B = zeros(m * N , n * N);
blk_m = 1 : m;
blk_n = 1 : n;
for k = 1 : N
Bdiag(:, :, k) = rand(m, n);
B(blk_m, blk_n) = Bdiag(:, :, k);
blk_m = blk_m + m;
blk_n = blk_n + n;
end
% -------------------------------------
w = ckbs_blkdiag_mul_t(Bdiag, v);
% -------------------------------------
check = B' * v;
ok = ok & ( max(max(abs(w - check))) < 1e-10 );
return
end
[w] = ckbs_blkbidiag_mul_t(Bdia, Boffdia, v)
\[
W = B^T * V
\]
Bdia
is an
m \times n \times N
array.
For
k = 1 , \ldots , N
we define
B_k \in \B{R}^{m \times n}
by
\[
B_k = Bdia(:, :, k)
\]
Boffdia
is an
m \times n \times N
array.
For
k = 2 , \ldots , N
we define
C_k \in \B{R}^{m \times n}
by
\[
C_k = Boffdia(:, :, k)
\]
B
is defined by
\[
B
=
\left( \begin{array}{cccc}
B_1 & 0 & 0 & \\
C_2^T & B_2 & \ddots & 0 \\
0 & \ddots & \ddots & 0 \\
& 0 & C_N^T & B_N
\end{array} \right)
\]
V
is a matrix of size
n * N \times p
, with no restriction on
p
.
W
is a matrix of size
m * N \times p
.
ckbs_blkbidiag_mul_t
.
It returns true if ckbs_blkbidiag_mul_t
passes the test
and false otherwise.
function [ok] = blkbidiag_mul_t_ok()
ok = true;
% -------------------------------------------------------------
% You can change these parameters
n = 3;
N = 4;
% -------------------------------------------------------------
% Define the problem
rand('seed', 123);
Hdia = rand(n, n, N);
Hlow = rand(n, n, N);
Hcheck = zeros(n*N);
blk_n = 1 : n;
for k = 1 : N
Hcheck(blk_n, blk_n) = Hdia(:,:, k)';
blk_n = blk_n + n;
end
blk_n = 1:n;
for k = 2 : N
blk_minus = blk_n;
blk_n = blk_n + n;
Hcheck(blk_minus, blk_n) = Hlow(:,:, k)';
end
% -------------------------------------
H = ckbs_blkbidiag_mul_t(Hdia, Hlow, eye(n*N));
% -------------------------------------
check = H - Hcheck;
ok = ok & ( max(max(abs(check))) < 1e-10 );
return
end
[w] = ckbs_blkbidiag_mul(Bdia, Boffdia, v)
\[
W = B * V
\]
Bdia
is an
m \times n \times N
array.
For
k = 1 , \ldots , N
we define
B_k \in \B{R}^{m \times n}
by
\[
B_k = Bdia(:, :, k)
\]
Boffdia
is an
m \times n \times N
array.
For
k = 2 , \ldots , N
we define
C_k \in \B{R}^{m \times n}
by
\[
C_k = Boffdia(:, :, k)
\]
B
is defined by
\[
B
=
\left( \begin{array}{cccc}
B_1 & 0 & 0 & \\
C_2^T & B_2 & \ddots & 0 \\
0 & \ddots & \ddots & 0 \\
& 0 & C_N^T & B_N
\end{array} \right)
\]
V
is a matrix of size
n * N \times p
, with no restriction on
p
.
W
is a matrix of size
m * N \times p
.
ckbs_blkbidiag_mul
.
It returns true if ckbs_blkbidiag_mul
passes the test
and false otherwise.
function [ok] = blkbidiag_mul_ok()
ok = true;
% -------------------------------------------------------------
% You can change these parameters
n = 3;
N = 4;
% -------------------------------------------------------------
% Define the problem
rand('seed', 123);
Hdia = rand(n, n, N);
Hlow = rand(n, n, N);
Hcheck = zeros(n*N);
blk_n = 1 : n;
for k = 1 : N
Hcheck(blk_n, blk_n) = Hdia(:,:, k);
blk_n = blk_n + n;
end
blk_n = 1:n;
for k = 2 : N
blk_minus = blk_n;
blk_n = blk_n + n;
Hcheck(blk_n, blk_minus) = Hlow(:,:, k);
end
% -------------------------------------
H = ckbs_blkbidiag_mul(Hdia, Hlow, eye(n*N));
% -------------------------------------
check = H - Hcheck;
ok = ok & ( max(max(abs(check))) < 1e-10 );
return
end
[w] = ckbs_blktridiag_mul(Bdia, Boffdia, v)
\[
W = B * V
\]
Bdia
is an
m \times n \times N
array.
For
k = 1 , \ldots , N
we define
B_k \in \B{R}^{m \times n}
by
\[
B_k = Bdia(:, :, k)
\]
Boffdia
is an
m \times n \times N
array.
For
k = 2 , \ldots , N
we define
C_k \in \B{R}^{m \times n}
by
\[
C_k = Boffdia(:, :, k)
\]
B
is defined by
\[
B
=
\left( \begin{array}{cccc}
B_1 & C_2 & 0 & \\
C_2^T & B_2 & \ddots & 0 \\
0 & \ddots & \ddots & C_N \\
& 0 & C_N^T & B_N
\end{array} \right)
\]
V
is a matrix of size
n * N \times p
, with no restriction on
p
.
W
is a matrix of size
m * N \times p
.
ckbs_blktridiag_mul
.
It returns true if ckbs_blktridiag_mul
passes the test
and false otherwise.
function [ok] = blktridiag_mul_ok()
ok = true;
% -------------------------------------------------------------
% You can change these parameters
n = 3;
N = 4;
% -------------------------------------------------------------
% Define the problem
rand('seed', 123);
Hdia = rand(n, n, N);
Hlow = rand(n, n, N);
Hcheck = zeros(n*N);
blk_n = 1 : n;
for k = 1 : N
Hcheck(blk_n, blk_n) = Hdia(:,:, k);
blk_n = blk_n + n;
end
blk_n = 1:n;
for k = 2 : N
blk_minus = blk_n;
blk_n = blk_n + n;
Hcheck(blk_n, blk_minus) = Hlow(:,:, k);
Hcheck(blk_minus, blk_n) = Hlow(:,:, k)';
end
% -------------------------------------
H = ckbs_blktridiag_mul(Hdia, Hlow, eye(n*N));
% -------------------------------------
check = H - Hcheck;
ok = ok & ( max(max(abs(check))) < 1e-10 );
return
end
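A block tridiagonal product can likewise be formed without assembling the full matrix. This NumPy sketch follows the convention used in the Hcheck construction above (Boffdia[k] on the subdiagonal, its transpose on the superdiagonal); it is illustrative, not the ckbs implementation:

```python
import numpy as np

def blktridiag_mul(Bdia, Boffdia, v):
    """Compute B @ v for symmetric block tridiagonal B with Bdia[k] on
    the diagonal, Boffdia[k] on the subdiagonal, and Boffdia[k].T on
    the superdiagonal; Boffdia[0] is unused."""
    N, n, _ = Bdia.shape
    p = v.shape[1]
    vb = v.reshape(N, n, p)
    wb = Bdia @ vb                          # diagonal contribution
    for k in range(1, N):
        wb[k] += Boffdia[k] @ vb[k - 1]     # subdiagonal contribution
        wb[k - 1] += Boffdia[k].T @ vb[k]   # superdiagonal contribution
    return wb.reshape(N * n, p)

# verify against the assembled dense matrix
rng = np.random.default_rng(123)
n, N, p = 3, 4, 3
Bdia = rng.standard_normal((N, n, n))
Boffdia = rng.standard_normal((N, n, n))
v = rng.standard_normal((n * N, p))
w = blktridiag_mul(Bdia, Boffdia, v)
H = np.zeros((n * N, n * N))
for k in range(N):
    i = slice(k * n, (k + 1) * n)
    H[i, i] = Bdia[k]
    if k > 0:
        j = slice((k - 1) * n, k * n)
        H[i, j] = Boffdia[k]
        H[j, i] = Boffdia[k].T
assert np.allclose(w, H @ v)
```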
[obj] = ckbs_sumsq_obj(
x, z, g, h, dg, dh, qinv, rinv)
\[
\begin{array}{rcl}
S ( x_1 , \ldots , x_N ) & = & \sum_{k=1}^N S_k ( x_k , x_{k-1} ) \\
S_k ( x_k , x_{k-1} ) & = &
\frac{1}{2}
( z_k - h_k - H_k * x_k )^\R{T} * R_k^{-1} * ( z_k - h_k - H_k * x_k )
\\
& + &
\frac{1}{2}
( x_k - g_k - G_k * x_{k-1} )^\R{T} * Q_k^{-1} * ( x_k - g_k - G_k * x_{k-1} )
\\
\end{array}
\]
where the matrices
R_k
and
Q_k
are
symmetric positive definite and
x_0
is the constant zero.
x
is a two dimensional array,
for
k = 1 , \ldots , N
\[
x_k = x(:, k)
\]
and
x
has size
n \times N
.
z
is a two dimensional array,
for
k = 1 , \ldots , N
\[
z_k = z(:, k)
\]
and
z
has size
m \times N
.
g
is a two dimensional array,
for
k = 1 , \ldots , N
\[
g_k = g(:, k)
\]
and
g
has size
n \times N
.
h
is a two dimensional array,
for
k = 1 , \ldots , N
\[
h_k = h(:, k)
\]
and
h
has size
m \times N
.
dg
is a three dimensional array,
for
k = 1 , \ldots , N
\[
G_k = dg(:,:,k)
\]
and
dg
has size
n \times n \times N
.
dh
is a three dimensional array,
for
k = 1 , \ldots , N
\[
H_k = dh(:,:,k)
\]
and
dh
has size
m \times n \times N
.
qinv
is a three dimensional array,
for
k = 1 , \ldots , N
\[
Q_k^{-1} = qinv(:,:, k)
\]
and
qinv
has size
n \times n \times N
.
rinv
is a three dimensional array,
for
k = 1 , \ldots , N
\[
R_k^{-1} = rinv(:,:, k)
\]
and
rinv
has size
m \times m \times N
.
obj
is a scalar given by
\[
obj = S ( x_1 , \ldots , x_N )
\]
ckbs_sumsq_obj
.
It returns true if ckbs_sumsq_obj
passes the test
and false otherwise.
function [ok] = sumsq_obj_ok()
ok = true;
% --------------------------------------------------------
% You can change these parameters
m = 1; % number of measurements per time point
n = 2; % number of state vector components per time point
N = 3; % number of time points
% ---------------------------------------------------------
% Define the problem
rand('seed', 123);
x = rand(n, N);
z = rand(m, N);
h = rand(m, N);
g = rand(n, N);
dh = zeros(m, n, N);
dg = zeros(n, n, N);
qinv = zeros(n, n, N);
rinv = zeros(m, m, N);
for k = 1 : N
dh(:, :, k) = rand(m, n);
dg(:, :, k) = rand(n, n);
tmp = rand(m, m);
rinv(:, :, k) = (tmp + tmp') / 2 + 2 * eye(m);
tmp = rand(n, n);
qinv(:, :, k) = (tmp + tmp') / 2 + 2 * eye(n);
end
% ---------------------------------------------------------
% Compute the Objective using ckbs_sumsq_obj
obj = ckbs_sumsq_obj(x, z, g, h, dg, dh, qinv, rinv);
% ---------------------------------------------------------
sumsq = 0;
xk = zeros(n, 1);
for k = 1 : N
xkm = xk;
xk = x(:, k);
xres = xk - g(:, k) - dg(:,:, k) * xkm;
zres = z(:, k) - h(:, k) - dh(:,:, k) * xk;
sumsq = sumsq + xres' * qinv(:,:, k) * xres;
sumsq = sumsq + zres' * rinv(:,:, k) * zres;
end
ok = ok & ( abs(obj - sumsq/2) < 1e-10 );
return
end
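The objective evaluated in the loop above can be written directly from the formula for S; a minimal NumPy version (illustrative, not the ckbs implementation), checked on a hand-computed scalar case:

```python
import numpy as np

def sumsq_obj(x, z, g, h, dg, dh, qinv, rinv):
    """Affine Kalman-Bucy residual sum of squares S(x_1, ..., x_N).
    Shapes: x (n, N), z (m, N), g (n, N), h (m, N),
    dg (n, n, N), dh (m, n, N), qinv (n, n, N), rinv (m, m, N)."""
    n, N = x.shape
    obj = 0.0
    xkm = np.zeros(n)                       # x_0 is identically zero
    for k in range(N):
        xk = x[:, k]
        xres = xk - g[:, k] - dg[:, :, k] @ xkm
        zres = z[:, k] - h[:, k] - dh[:, :, k] @ xk
        obj += 0.5 * (xres @ qinv[:, :, k] @ xres)
        obj += 0.5 * (zres @ rinv[:, :, k] @ zres)
        xkm = xk
    return obj

# hand-computed scalar case: xres = 2, zres = 1 - 2 = -1, S = 2 + 0.5 = 2.5
x = np.array([[2.0]]); z = np.array([[1.0]])
g = np.zeros((1, 1)); h = np.zeros((1, 1))
dg = np.zeros((1, 1, 1)); dh = np.ones((1, 1, 1))
qinv = np.ones((1, 1, 1)); rinv = np.ones((1, 1, 1))
assert abs(sumsq_obj(x, z, g, h, dg, dh, qinv, rinv) - 2.5) < 1e-12
```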
[obj] = ckbs_L2L1_obj(
x, z, g, h, dg, dh, qinv, rinv)
\[
\begin{array}{rcl}
S ( x_1 , \ldots , x_N ) & = & \sum_{k=1}^N S_k ( x_k , x_{k-1} ) \\
S_k ( x_k , x_{k-1} ) & = &
\sqrt{2} \,
\| R_k^{-1/2} ( z_k - h_k - H_k * x_k ) \|_1
\\
& + &
\frac{1}{2}
( x_k - g_k - G_k * x_{k-1} )^\R{T} * Q_k^{-1} * ( x_k - g_k - G_k * x_{k-1} )
\\
\end{array}
\]
where the matrices
R_k
and
Q_k
are
symmetric positive definite and
x_0
is the constant zero.
x
is a two dimensional array,
for
k = 1 , \ldots , N
,
\[
x_k = x(:, k)
\]
and
x
has size
n \times N
.
z
is a two dimensional array,
for
k = 1 , \ldots , N
\[
z_k = z(:, k)
\]
and
z
has size
m \times N
.
g
is a two dimensional array,
for
k = 1 , \ldots , N
\[
g_k = g(:, k)
\]
and
g
has size
n \times N
.
h
is a two dimensional array,
for
k = 1 , \ldots , N
\[
h_k = h(:, k)
\]
and
h
has size
m \times N
.
dg
is a three dimensional array,
for
k = 1 , \ldots , N
\[
G_k = dg(:,:,k)
\]
and
dg
has size
n \times n \times N
.
dh
is a three dimensional array,
for
k = 1 , \ldots , N
\[
H_k = dh(:,:,k)
\]
and
dh
has size
m \times n \times N
.
qinv
is a three dimensional array,
for
k = 1 , \ldots , N
\[
Q_k^{-1} = qinv(:,:, k)
\]
and
qinv
has size
n \times n \times N
.
rinv
is a three dimensional array,
for
k = 1 , \ldots , N
\[
R_k^{-1} = rinv(:,:, k)
\]
and
rinv
has size
m \times m \times N
.
obj
is a scalar given by
\[
obj = S ( x_1 , \ldots , x_N )
\]
ckbs_L2L1_obj
.
It returns true if ckbs_L2L1_obj
passes the test
and false otherwise.
function [ok] = L2L1_obj_ok()
ok = true;
% --------------------------------------------------------
% You can change these parameters
m = 1; % number of measurements per time point
n = 2; % number of state vector components per time point
N = 3; % number of time points
% ---------------------------------------------------------
% Define the problem
rand('seed', 123);
x = rand(n, N);
z = rand(m, N);
h = rand(m, N);
g = rand(n, N);
dh = zeros(m, n, N);
dg = zeros(n, n, N);
qinv = zeros(n, n, N);
rinv = zeros(m, m, N);
for k = 1 : N
dh(:, :, k) = rand(m, n);
dg(:, :, k) = rand(n, n);
tmp = rand(m, m);
rinv(:, :, k) = (tmp + tmp') / 2 + 2 * eye(m);
tmp = rand(n, n);
qinv(:, :, k) = (tmp + tmp') / 2 + 2 * eye(n);
end
% ---------------------------------------------------------
% Compute the Objective using ckbs_L2L1_obj
obj = ckbs_L2L1_obj(x, z, g, h, dg, dh, qinv, rinv);
% ---------------------------------------------------------
L2L1 = 0;
xk = zeros(n, 1);
for k = 1 : N
xkm = xk;
xk = x(:, k);
xres = xk - g(:, k) - dg(:,:, k) * xkm;
zres = z(:, k) - h(:, k) - dh(:,:, k) * xk;
L2L1 = L2L1 + 0.5*norm(chol(qinv(:,:,k))*xres)^2;
L2L1 = L2L1 + sqrt(2)*norm(chol(rinv(:,:,k))*zres, 1);
end
ok = ok & ( abs(obj - L2L1) < 1e-10 );
return
end
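For a scalar case (n = m = N = 1) the objective reduces to 0.5 * qinv * xres^2 + sqrt(2) * sqrt(rinv) * |zres|, matching the loop in the test above; a quick numeric check:

```python
import math

# scalar L2 process term plus sqrt(2)-weighted L1 measurement term
qinv, rinv = 4.0, 9.0
xres, zres = 0.5, -2.0
obj = 0.5 * qinv * xres ** 2 + math.sqrt(2) * math.sqrt(rinv) * abs(zres)
assert abs(obj - (0.5 + 6.0 * math.sqrt(2))) < 1e-12
```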
[grad] = ckbs_sumsq_grad(
x, z, g, h, dg, dh, qinv, rinv)
\[
\begin{array}{rcl}
S ( x_1 , \ldots , x_N ) & = & \sum_{k=1}^N S_k ( x_k , x_{k-1} ) \\
S_k ( x_k , x_{k-1} ) & = &
\frac{1}{2}
( z_k - h_k - H_k * x_k )^\R{T} * R_k^{-1} * ( z_k - h_k - H_k * x_k )
\\
& + &
\frac{1}{2}
( x_k - g_k - G_k * x_{k-1} )^\R{T} * Q_k^{-1} * ( x_k - g_k - G_k * x_{k-1} )
\end{array}
\]
where the matrices
R_k
and
Q_k
are
symmetric positive definite and
x_0
is the constant zero.
Defining
Q_{N+1}
to be the
n \times n
identity
matrix and
G_{N+1}
to be zero, we have
\[
\begin{array}{rcl}
\nabla_k S_k^{(1)} ( x_k , x_{k-1} )
& = & H_k^\R{T} * R_k^{-1} * ( h_k + H_k * x_k - z_k )
+ Q_k^{-1} * ( x_k - g_k - G_k * x_{k-1} )
\\
\nabla_k S_{k+1}^{(1)} ( x_{k+1} , x_k )
& = & G_{k+1}^\R{T} * Q_{k+1}^{-1} * ( g_{k+1} + G_{k+1} * x_k - x_{k+1} )
\end{array}
\]
It follows that the gradient of the
affine Kalman-Bucy smoother residual sum of squares is
\[
\begin{array}{rcl}
\nabla S ( x_1 , \ldots , x_N )
& = &
\left( \begin{array}{c}
d_1 \\ \vdots \\ d_N
\end{array} \right)
\\
d_k & = & \nabla_k S_k^{(1)} ( x_k , x_{k-1} )
+ \nabla_k S_{k+1}^{(1)} ( x_{k+1} , x_k )
\end{array}
\]
where
S_{N+1} ( x_{N+1} , x_N )
is defined as
identically zero.
x
is a two dimensional array,
for
k = 1 , \ldots , N
\[
x_k = x(:, k)
\]
and
x
has size
n \times N
.
z
is a two dimensional array,
for
k = 1 , \ldots , N
\[
z_k = z(:, k)
\]
and
z
has size
m \times N
.
g
is a two dimensional array,
for
k = 1 , \ldots , N
\[
g_k = g(:, k)
\]
and
g
has size
n \times N
.
h
is a two dimensional array,
for
k = 1 , \ldots , N
\[
h_k = h(:, k)
\]
and
h
has size
m \times N
.
dg
is a three dimensional array,
for
k = 1 , \ldots , N
\[
G_k = dg(:,:,k)
\]
and
dg
has size
n \times n \times N
.
dh
is a three dimensional array,
for
k = 1 , \ldots , N
\[
H_k = dh(:,:,k)
\]
and
dh
has size
m \times n \times N
.
qinv
is a three dimensional array,
for
k = 1 , \ldots , N
\[
Q_k^{-1} = qinv(:,:,k)
\]
and
qinv
has size
n \times n \times N
.
rinv
is a three dimensional array,
for
k = 1 , \ldots , N
\[
R_k^{-1} = rinv(:,:,k)
\]
and
rinv
has size
m \times m \times N
.
grad
is a two dimensional array,
for
k = 1 , \ldots , N
\[
d_k = grad(:, k)
\]
and
grad
has size
n \times N
.
ckbs_sumsq_grad
.
It returns true if ckbs_sumsq_grad
passes the test
and false otherwise.
function [ok] = sumsq_grad_ok()
ok = true;
% --------------------------------------------------------
% You can change these parameters
m = 1; % number of measurements per time point
n = 2; % number of state vector components per time point
N = 3; % number of time points
% ---------------------------------------------------------
% Define the problem
rand('seed', 123);
x = rand(n, N);
z = rand(m, N);
h = rand(m, N);
g = rand(n, N);
dg = zeros(n, n, N);
dh = zeros(m, n, N);
qinv = zeros(n, n, N);
rinv = zeros(m, m, N);
for k = 1 : N
dh(:, :, k) = rand(m, n);
dg(:, :, k) = rand(n, n);
tmp = rand(m, m);
rinv(:, :, k) = (tmp + tmp') / 2 + 2 * eye(m);
tmp = rand(n, n);
qinv(:, :, k) = (tmp + tmp') / 2 + 2 * eye(n);
end
% ---------------------------------------------------------
% Compute the gradient using ckbs_sumsq_grad
grad = ckbs_sumsq_grad(x, z, g, h, dg, dh, qinv, rinv);
% ---------------------------------------------------------
% Use finite differences to check gradient
step = 1; % central differences are exact because the objective is quadratic
for k = 1 : N
for i = 1 : n
% Check second partial w.r.t x(i,k)
xm = x;
xm(i, k) = xm(i, k) - step;
Sm = ckbs_sumsq_obj(xm, z, g, h, dg, dh, qinv, rinv);
%
xp = x;
xp(i, k) = xp(i, k) + step;
Sp = ckbs_sumsq_obj(xp, z, g, h, dg, dh, qinv, rinv);
%
check = (Sp - Sm) / ( 2 * step);
diff = grad(i, k) - check;
ok = ok & ( abs(diff) < 1e-10 );
end
end
return
end
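The test can use step = 1 because S is quadratic in x, and central differences recover the derivative of a quadratic exactly for any step size; a one-dimensional illustration:

```python
# for S(x) = a*x^2 + b*x + c, (S(x+s) - S(x-s)) / (2*s) = 2*a*x + b exactly
a, b, c = 3.0, -2.0, 7.0
S = lambda x: a * x * x + b * x + c
x, s = 1.5, 1.0                     # step of 1, as in the test above
fd = (S(x + s) - S(x - s)) / (2 * s)
assert abs(fd - (2 * a * x + b)) < 1e-12
```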
[grad] = ckbs_process_grad(
x, g, dg, qinv)
\[
\begin{array}{rcl}
S ( x_1 , \ldots , x_N ) & = & \sum_{k=1}^N S_k ( x_k , x_{k-1} ) \\
S_k ( x_k , x_{k-1} ) & = &
\frac{1}{2}
( x_k - g_k - G_k * x_{k-1} )^\R{T} * Q_k^{-1} * ( x_k - g_k - G_k * x_{k-1} )
\end{array}
\]
where the matrix
Q_k
is
symmetric positive definite and
x_0
is the constant zero.
Defining
Q_{N+1}
to be the
n \times n
identity
matrix and
G_{N+1}
to be zero, we have
\[
\begin{array}{rcl}
\nabla_k S_k^{(1)} ( x_k , x_{k-1} )
& = & Q_k^{-1} * ( x_k - g_k - G_k * x_{k-1} )
\\
\nabla_k S_{k+1}^{(1)} ( x_{k+1} , x_k )
& = & G_{k+1}^\R{T} * Q_{k+1}^{-1} * ( g_{k+1} + G_{k+1} * x_k - x_{k+1} )
\end{array}
\]
It follows that the gradient of the
affine Kalman-Bucy smoother residual sum of squares is
\[
\begin{array}{rcl}
\nabla S ( x_1 , \ldots , x_N )
& = &
\left( \begin{array}{c}
d_1 \\ \vdots \\ d_N
\end{array} \right)
\\
d_k & = & \nabla_k S_k^{(1)} ( x_k , x_{k-1} )
+ \nabla_k S_{k+1}^{(1)} ( x_{k+1} , x_k )
\end{array}
\]
where
S_{N+1} ( x_{N+1} , x_N )
is defined as
identically zero.
x
is a two dimensional array,
for
k = 1 , \ldots , N
\[
x_k = x(:, k)
\]
and
x
has size
n \times N
.
g
is a two dimensional array,
for
k = 1 , \ldots , N
\[
g_k = g(:, k)
\]
and
g
has size
n \times N
.
dg
is a three dimensional array,
for
k = 1 , \ldots , N
\[
G_k = dg(:,:,k)
\]
and
dg
has size
n \times n \times N
.
qinv
is a three dimensional array,
for
k = 1 , \ldots , N
\[
Q_k^{-1} = qinv(:,:,k)
\]
and
qinv
has size
n \times n \times N
.
grad
is a two dimensional array,
for
k = 1 , \ldots , N
\[
d_k = grad(:, k)
\]
and
grad
has size
n \times N
.
ckbs_process_grad
.
It returns true if ckbs_process_grad
passes the test
and false otherwise.
function [ok] = process_grad_ok()
ok = true;
% --------------------------------------------------------
% You can change these parameters
m = 1; % number of measurements per time point
n = 2; % number of state vector components per time point
N = 3; % number of time points
% ---------------------------------------------------------
% Define the problem
rand('seed', 123);
% Measurement terms are fixed at zero
z = zeros(m, N);
h = zeros(m, N);
dh = zeros(m, n, N);
rinv = zeros(m, m, N);
% Initialization of process terms
x = rand(n, N);
g = rand(n, N);
dg = zeros(n, n, N);
qinv = zeros(n, n, N);
for k = 1 : N
dg(:, :, k) = rand(n, n);
tmp = rand(n, n);
qinv(:, :, k) = (tmp + tmp') / 2 + 2 * eye(n);
end
% ---------------------------------------------------------
% Compute the gradient using ckbs_process_grad
grad = ckbs_process_grad(x, g, dg, qinv);
% ---------------------------------------------------------
% Use finite differences to check gradient
step = 1; % central differences are exact because the objective is quadratic
for k = 1 : N
for i = 1 : n
% Check second partial w.r.t x(i,k), setting measurement
% piece to 0
xm = x;
xm(i, k) = xm(i, k) - step;
Sm = ckbs_sumsq_obj(xm, z, g, h, dg, dh, qinv, rinv);
%
xp = x;
xp(i, k) = xp(i, k) + step;
Sp = ckbs_sumsq_obj(xp, z, g, h, dg, dh, qinv, rinv);
%
check = (Sp - Sm) / ( 2 * step);
diff = grad(i, k) - check;
ok = ok & ( abs(diff) < 1e-10 );
end
end
return
end
[D, A] = ckbs_sumsq_hes(dg, dh, qinv, rinv)
\[
\begin{array}{rcl}
S ( x_1 , \ldots , x_N ) & = & \sum_{k=1}^N S_k ( x_k , x_{k-1} ) \\
S_k ( x_k , x_{k-1} ) & = &
\frac{1}{2}
( z_k - h_k - H_k * x_k )^\R{T} * R_k^{-1} * ( z_k - h_k - H_k * x_k )
\\
& + &
\frac{1}{2}
( x_k - g_k - G_k * x_{k-1} )^\R{T} * Q_k^{-1} * ( x_k - g_k - G_k * x_{k-1} )
\end{array}
\]
where the matrices
R_k
and
Q_k
are
symmetric positive definite and
x_0
is the constant zero.
Defining
Q_{N+1}
to be the
n \times n
identity
matrix and
G_{N+1}
to be zero,
the Hessian of the
affine Kalman-Bucy smoother residual sum of squares is
\[
\begin{array}{rcl}
S^{(2)} ( x_1 , \ldots , x_N ) & = &
\left( \begin{array}{cccc}
D_1 & A_2^\R{T} & 0 & \\
A_2 & D_2 & A_3^\R{T} & 0 \\
0 & \ddots & \ddots & \ddots \\
& 0 & A_N & D_N
\end{array} \right)
\\
D_k & = & H_k^\R{T} * R_k^{-1} * H_k + Q_k^{-1}
+ G_{k+1}^\R{T} * Q_{k+1}^{-1} * G_{k+1}
\\
A_k & = & - Q_k^{-1} * G_k
\end{array}
\]
dg
is a three dimensional array,
for
k = 1 , \ldots , N
\[
G_k = dg(:,:,k)
\]
and
dg
has size
n \times n \times N
.
dh
is a three dimensional array,
for
k = 1 , \ldots , N
\[
H_k = dh(:,:,k)
\]
and
dh
has size
m \times n \times N
.
qinv
is a three dimensional array,
for
k = 1 , \ldots , N
\[
Q_k^{-1} = qinv(:,:,k)
\]
and
qinv
has size
n \times n \times N
.
rinv
is a three dimensional array,
for
k = 1 , \ldots , N
\[
R_k^{-1} = rinv(:,:,k)
\]
and
rinv
has size
m \times m \times N
.
D
is a three dimensional array,
for
k = 1 , \ldots , N
\[
D_k = D(:,:,k)
\]
and
D
has size
n \times n \times N
.
A
is a three dimensional array,
for
k = 2 , \ldots , N
\[
A_k = A(:,:,k)
\]
and
A
has size
n \times n \times N
.
ckbs_sumsq_hes
.
It returns true if ckbs_sumsq_hes
passes the test
and false otherwise.
function [ok] = sumsq_hes_ok()
ok = true;
% --------------------------------------------------------
% You can change these parameters
m = 1; % number of measurements per time point
n = 2; % number of state vector components per time point
N = 3; % number of time points
% ---------------------------------------------------------
% Define the problem
rand('seed', 123);
dg = zeros(n, n, N);
dh = zeros(m, n, N);
qinv = zeros(n, n, N);
rinv = zeros(m, m, N);
for k = 1 : N
dh(:, :, k) = rand(m, n);
dg(:, :, k) = rand(n, n);
tmp = rand(m, m);
rinv(:, :, k) = (tmp + tmp') / 2 + 2 * eye(m);
tmp = rand(n, n);
qinv(:, :, k) = (tmp + tmp') / 2 + 2 * eye(n);
end
% ---------------------------------------------------------
% Compute the Hessian using ckbs_sumsq_hes
[D, A] = ckbs_sumsq_hes(dg, dh, qinv, rinv);
% ---------------------------------------------------------
H = zeros(n * N , n * N );
for k = 1 : N
index = (k - 1) * n + (1 : n);
H(index, index) = D(:, :, k);
if k > 1
H(index - n, index) = A(:, :, k)';
H(index, index - n) = A(:, :, k);
end
end
%
% Use finite differences to check Hessian
x = rand(n, N);
z = rand(m, N);
h = rand(m, N);
g = rand(n, N);
%
step = 1; % central differences are exact because the gradient is affine in x
for k = 1 : N
for i = 1 : n
% Check second partial w.r.t x(i, k)
xm = x;
xm(i, k) = xm(i, k) - step;
gradm = ckbs_sumsq_grad(xm, z, g, h, dg, dh, qinv, rinv);
%
xp = x;
xp(i, k) = xp(i, k) + step;
gradp = ckbs_sumsq_grad(xp, z, g, h, dg, dh, qinv, rinv);
%
check = (gradp - gradm) / ( 2 * step );
for k1 = 1 : N
for i1 = 1 : n
value = H(i + (k-1)*n, i1 + (k1-1)*n);
diff = check(i1, k1) - value;
ok = ok & ( abs(diff) < 1e-10 );
end
end
end
end
return
end
[D, A] = ckbs_process_hes(dg, qinv)
\[
\begin{array}{rcl}
S ( x_1 , \ldots , x_N ) & = & \sum_{k=1}^N S_k ( x_k , x_{k-1} ) \\
S_k ( x_k , x_{k-1} ) & = &
\frac{1}{2}
( x_k - g_k - G_k * x_{k-1} )^\R{T} * Q_k^{-1} * ( x_k - g_k - G_k * x_{k-1} )
\end{array}
\]
where the matrix
Q_k
is
symmetric positive definite and
x_0
is the constant zero.
Defining
Q_{N+1}
to be the
n \times n
identity
matrix and
G_{N+1}
to be zero,
the Hessian of the
affine Kalman-Bucy smoother process residual sum of squares is
\[
\begin{array}{rcl}
S^{(2)} ( x_1 , \ldots , x_N ) & = &
\left( \begin{array}{cccc}
D_1 & A_2^\R{T} & 0 & \\
A_2 & D_2 & A_3^\R{T} & 0 \\
0 & \ddots & \ddots & \ddots \\
& 0 & A_N & D_N
\end{array} \right)
\\
D_k & = & Q_k^{-1} + G_{k+1}^\R{T} * Q_{k+1}^{-1} * G_{k+1}
\\
A_k & = & - Q_k^{-1} * G_k
\end{array}
\]
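As a cross-check on these block formulas, note that the process residual sum of squares can be written as S(x) = (1/2) (W x - v)^T Omega (W x - v), where W is block bidiagonal with identity diagonal blocks and -G_k sub-diagonal blocks, and Omega = blkdiag(Q_1^{-1}, ..., Q_N^{-1}); hence the Hessian is W^T Omega W. The following NumPy sketch (ours, not part of ckbs) checks that W^T Omega W has exactly the blocks D_k and A_k above:

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 2, 3
# G[0] corresponds to G_1, which multiplies x_0 = 0 and so never
# appears in the Hessian; it is generated here only for symmetry.
G = [rng.standard_normal((n, n)) for _ in range(N)]
Qinv = []
for _ in range(N):
    t = rng.standard_normal((n, n))
    Qinv.append(t @ t.T + 2 * np.eye(n))   # symmetric positive definite

# Hessian as W^T Omega W
W = np.eye(n * N)
Omega = np.zeros((n * N, n * N))
for k in range(N):
    i = slice(k * n, (k + 1) * n)
    Omega[i, i] = Qinv[k]
    if k > 0:
        W[i, slice((k - 1) * n, k * n)] = -G[k]
H = W.T @ Omega @ W

# The same matrix assembled from the block formulas (G_{N+1} = 0)
Hblk = np.zeros((n * N, n * N))
for k in range(N):
    i = slice(k * n, (k + 1) * n)
    Hblk[i, i] = Qinv[k]                       # D_k, first term
    if k + 1 < N:
        Hblk[i, i] += G[k + 1].T @ Qinv[k + 1] @ G[k + 1]
    if k > 0:
        j = slice((k - 1) * n, k * n)
        Hblk[i, j] = -Qinv[k] @ G[k]           # A_k
        Hblk[j, i] = Hblk[i, j].T              # A_k^T
```

The two assemblies agree exactly, which is the same structural check the Matlab example below performs with finite differences.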
dg
is a three dimensional array,
for
k = 1 , \ldots , N
\[
G_k = dg(:,:,k)
\]
and
dg
has size
n \times n \times N
.
qinv
is a three dimensional array,
for
k = 1 , \ldots , N
\[
Q_k^{-1} = qinv(:,:,k)
\]
and
qinv
has size
n \times n \times N
.
D
is a three dimensional array,
for
k = 1 , \ldots , N
\[
D_k = D(:,:,k)
\]
and
D
has size
n \times n \times N
.
A
is a three dimensional array,
for
k = 2 , \ldots , N
\[
A_k = A(:,:,k)
\]
and
A
has size
n \times n \times N
.
The following function is an example and test of ckbs_process_hes.
It returns true if ckbs_process_hes
passes the test
and false otherwise.
function [ok] = process_hes_ok()
ok = true;
% --------------------------------------------------------
% You can change these parameters
m = 1; % number of measurements per time point
n = 2; % number of state vector components per time point
N = 3; % number of time points
% ---------------------------------------------------------
% Define the problem
rand('seed', 123);
dg = zeros(n, n, N);
qinv = zeros(n, n, N);
for k = 1 : N
dg(:, :, k) = rand(n, n);
tmp = rand(n, n);
qinv(:, :, k) = (tmp + tmp') / 2 + 2 * eye(n);
end
% ---------------------------------------------------------
% Compute the Hessian using ckbs_process_hes
[D, A] = ckbs_process_hes(dg, qinv);
% ---------------------------------------------------------
H = zeros(n * N , n * N );
for k = 1 : N
index = (k - 1) * n + (1 : n);
H(index, index) = D(:, :, k);
if k > 1
H(index - n, index) = A(:, :, k)';
H(index, index - n) = A(:, :, k);
end
end
%
% Use finite differences to check Hessian
x = rand(n, N);
g = rand(n, N);
%
step = 1;
for k = 1 : N
for i = 1 : n
% Check second partial w.r.t x(i, k)
xm = x;
xm(i, k) = xm(i, k) - step;
gradm = ckbs_process_grad(xm, g, dg, qinv);
%
xp = x;
xp(i, k) = xp(i, k) + step;
gradp = ckbs_process_grad(xp, g, dg, qinv);
%
check = (gradp - gradm) / ( 2 * step );
for k1 = 1 : N
for i1 = 1 : n
value = H(i + (k-1)*n, i1 + (k1-1)*n);
diff = check(i1, k1) - value;
ok = ok & ( abs(diff) < 1e-10 );
end
end
end
end
return
end
[e, lambda] = ckbs_tridiag_solve(b, c, r)
This routine computes the value of
e
that solves
\[
A * e = r
\]
where the symmetric block tridiagonal matrix
A
is defined by
\[
A =
\left( \begin{array}{ccccc}
b_1 & c_2^\R{T} & 0 & \cdots & 0 \\
c_2 & b_2 & c_3^\R{T} & & \vdots \\
\vdots & & \ddots & & \\
0 & \cdots & & c_N & b_N
\end{array} \right)
\]
The routine 9.20: ckbs_tridiag_solve_b
solves the same problem.
The difference is that this routine runs the reduction starting
at the first block rather than the last block of the matrix,
which is less stable for this application.
b
is a three dimensional array,
for
k = 1 , \ldots , N
\[
b_k = b(:,:,k)
\]
and
b
has size
n \times n \times N
.
c
is a three dimensional array,
for
k = 2 , \ldots , N
\[
c_k = c(:,:,k)
\]
and
c
has size
n \times n \times N
.
r
is an
(n * N) \times m
matrix.
e
is an
(n * N) \times m
matrix.
lambda
is a scalar equal to the
logarithm of the determinant of
A
.
The marginal likelihood for parameters in a discrete
Gauss Markov process
,
Bradley M. Bell,
IEEE Transactions on Signal Processing,
Vol. 48,
No. 3,
March 2000.
In this application, for
k = 1 , \ldots , N
\[
\begin{array}{rcl}
b_k & = & u_k + q_{k-1}^{-1} + a_k * q_k^{-1} * a_k^\R{T} \\
c_k & = & q_{k-1}^{-1} * a_k^\R{T}
\end{array}
\]
where
u_k
is symmetric positive semi-definite and
q_k
is symmetric positive definite.
It follows that the algorithm used by ckbs_tridiag_solve
is well conditioned and will not try to invert singular matrices.
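The reduction described above is the standard block Thomas algorithm: a forward pass replaces each diagonal block by its Schur complement with respect to the previous pivot, then a backward pass recovers the solution. A minimal NumPy sketch (our illustration, not the ckbs source; the function name `tridiag_solve` is ours):

```python
import numpy as np

def tridiag_solve(b, c, r):
    # Solve A e = r where A is symmetric block tridiagonal with
    # diagonal blocks b[k] and sub-diagonal blocks c[k] (c[0] unused),
    # running the reduction from the first block.  Also returns
    # lam = log |det A|, the sum of log |det| of the pivot blocks.
    N = len(b)
    bhat = [None] * N            # reduced (pivot) diagonal blocks
    rhat = [None] * N            # reduced right-hand sides
    bhat[0], rhat[0] = b[0], r[0]
    for k in range(1, N):
        w = c[k] @ np.linalg.inv(bhat[k - 1])
        bhat[k] = b[k] - w @ c[k].T          # Schur complement pivot
        rhat[k] = r[k] - w @ rhat[k - 1]
    e = [None] * N
    e[N - 1] = np.linalg.solve(bhat[N - 1], rhat[N - 1])
    for k in range(N - 2, -1, -1):
        e[k] = np.linalg.solve(bhat[k], rhat[k] - c[k + 1].T @ e[k + 1])
    lam = sum(np.linalg.slogdet(bk)[1] for bk in bhat)
    return np.concatenate(e), lam
```

The work is one small dense factorization per time point, so it scales linearly with N, which is the point of storing A in the compact (b, c) form.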
The following function is an example and test of ckbs_tridiag_solve.
It returns true if ckbs_tridiag_solve
passes the test
and false otherwise.
function [ok] = tridiag_solve_ok()
ok = true;
m = 2;
n = 3;
N = 4;
% case where uk = 0, qk = I, and ak is random
rand('seed', 123);
a = rand(n, n, N);
r = rand(n * N, m);
c = zeros(n, n, N);
for k = 1 : N
ak = a(:, :, k);
b(:, :, k) = 2 * eye(n) + ak * ak';
c(:, :, k) = ak';
end
% ----------------------------------------
[e, lambda] = ckbs_tridiag_solve(b, c, r);
% ----------------------------------------
%
check = ckbs_blktridiag_mul(b, c, e);
ok = ok & max( max( abs(check - r) ) ) < 1e-10;
return
end
[e, lambda] = ckbs_tridiag_solve_b(b, c, r)
This routine computes the value of
e
that solves
\[
A * e = r
\]
where the symmetric block tridiagonal matrix
A
is defined by
\[
A =
\left( \begin{array}{ccccc}
b_1 & c_2^\R{T} & 0 & \cdots & 0 \\
c_2 & b_2 & c_3^\R{T} & & \vdots \\
\vdots & & \ddots & & \\
0 & \cdots & & c_N & b_N
\end{array} \right)
\]
The routine 9.19: ckbs_tridiag_solve
solves the same problem.
The difference is that this routine runs the reduction starting
at the last block rather than the first block of the matrix,
which is more stable for this application.
b
is a three dimensional array,
for
k = 1 , \ldots , N
\[
b_k = b(:,:,k)
\]
and
b
has size
n \times n \times N
.
c
is a three dimensional array,
for
k = 2 , \ldots , N
\[
c_k = c(:,:,k)
\]
and
c
has size
n \times n \times N
.
r
is an
(n * N) \times m
matrix.
e
is an
(n * N) \times m
matrix.
lambda
is a scalar equal to the
logarithm of the determinant of
A
.
The following function is an example and test of ckbs_tridiag_solve_b.
It returns true if ckbs_tridiag_solve_b
passes the test
and false otherwise.
function [ok] = tridiag_solve_b_ok()
ok = true;
m = 2;
n = 3;
N = 4;
% case where uk = 0, qk = I, and ak is random
rand('seed', 123);
a = rand(n, n, N);
r = rand(n * N, m);
c = zeros(n, n, N);
for k = 1 : N
ak = a(:, :, k);
b(:, :, k) = 2 * eye(n) + ak * ak';
c(:, :, k) = ak';
end
% ----------------------------------------
[e, lambda] = ckbs_tridiag_solve_b(b, c, r);
% ----------------------------------------
check = ckbs_blktridiag_mul(b, c, e);
ok = ok & max( max( abs(check - r) ) ) < 1e-10;
return
end
[e, iter] = ckbs_tridiag_solve_pcg(b, c, r)
This routine computes the value of
e
that solves
\[
A * e = r
\]
where the symmetric block tridiagonal matrix
A
is defined by
\[
A =
\left( \begin{array}{ccccc}
b_1 & c_2^\R{T} & 0 & \cdots & 0 \\
c_2 & b_2 & c_3^\R{T} & & \vdots \\
\vdots & & \ddots & & \\
0 & \cdots & & c_N & b_N
\end{array} \right)
\]
The routines 9.19: ckbs_tridiag_solve
and 9.20: ckbs_tridiag_solve_b
solve the same problem, but this routine handles only one right hand side.
It is essentially a wrapper for Matlab's pcg routine.
b
is a three dimensional array,
for
k = 1 , \ldots , N
\[
b_k = b(:,:,k)
\]
and
b
has size
n \times n \times N
.
c
is a three dimensional array,
for
k = 2 , \ldots , N
\[
c_k = c(:,:,k)
\]
and
c
has size
n \times n \times N
.
r
is an
(n * N) \times 1
vector.
e
is an
(n * N) \times 1
vector.
iter
is a scalar equal to the
number of iterations that the cg method took to finish.
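For illustration, the iteration being counted is the standard conjugate gradient loop for a symmetric positive definite system. The following is a bare NumPy sketch of that loop (ours, not the Matlab pcg wrapper that ckbs_tridiag_solve_pcg actually calls); `matvec` would be the block tridiagonal multiply in the ckbs setting:

```python
import numpy as np

def cg(matvec, r, tol=1e-8, max_iter=1000):
    # Plain conjugate gradients for A e = r, A symmetric positive
    # definite; returns the solution and the iteration count, the
    # quantity reported as iter above.
    e = np.zeros_like(r)
    res = r.copy()                 # current residual r - A e
    p = res.copy()                 # search direction
    rho = res @ res
    for it in range(1, max_iter + 1):
        Ap = matvec(p)
        alpha = rho / (p @ Ap)
        e += alpha * p
        res -= alpha * Ap
        rho_new = res @ res
        if np.sqrt(rho_new) <= tol * np.linalg.norm(r):
            return e, it
        p = res + (rho_new / rho) * p
        rho = rho_new
    return e, max_iter
```

Because each iteration needs only one multiplication by A, the cost per iteration is linear in the number of time points when A is stored in the compact block tridiagonal form.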
The following function is an example and test of ckbs_tridiag_solve_pcg.
It returns true if ckbs_tridiag_solve_pcg
passes the test
and false otherwise.
function [ok] = tridiag_solve_pcg_ok()
ok = true;
m = 1;
n = 50;
N = 40;
eps = 1e-5;
% case where uk = 0, qk = I, and ak is random
%rand('seed', 123);
a = rand(n, n, N);
r = randn(n * N, m);
c = zeros(n, n, N);
for k = 1 : N
ak = a(:, :, k);
b(:, :, k) = 5 * eye(n) + ak * ak';
c(:, :, k) = ak';
end
% ----------------------------------------
[e, flag, relres, iter] = ckbs_tridiag_solve_pcg(b, c, r);
% ----------------------------------------
%
check = ckbs_blktridiag_mul(b, c, e);
ok = ok & max(abs(check - r))/max(abs(r)) < eps;
return
end
[ds, dy, du] = ckbs_newton_step(
mu, s, y, u, b, d, Bdia, Hdia, Hlow)
This routine computes the Newton step for the
\mu
-relaxed affine constrained Kalman-Bucy smoother problem.
Given
\mu \in \B{R}_+
,
H \in \B{R}^{p \times p}
,
d \in \B{R}^p
,
b \in \B{R}^r
, and
B \in \B{R}^{r \times p}
,
the
\mu
-relaxed affine constrained Kalman-Bucy smoother problem is:
\[
\begin{array}{rl}
{\rm minimize} & \frac{1}{2} y^\R{T} H y + d^\R{T} y
- \mu \sum_{i=1}^r \log(s_i)
\; {\rm w.r.t} \; y \in \B{R}^p \; , \; s \in \B{R}_+^r
\\
{\rm subject \; to} & s + b + B y = 0
\end{array}
\]
In addition,
H
is symmetric block tri-diagonal with each block of
size
n \times n
and
B
is block diagonal with each block of size
m \times n
(there is an integer
N
such that
p = n * N
and
r = m * N
).
We use
u \in \B{R}^r
to denote the Lagrange multipliers corresponding to the constraint equation.
The corresponding Lagrangian is
\[
L(y, s, u) =
\frac{1}{2} y^\R{T} H y + d^\R{T} y
- \mu \sum_{i=1}^r \log(s_i)
+ u^\R{T} (s + b + B y)
\]
The partial gradients of the Lagrangian are given by
\[
\begin{array}{rcl}
\nabla_y L(y, s, u ) & = & H y + B^\R{T} u + d \\
\nabla_s L(y, s, u ) & = & u - \mu / s \\
\nabla_u L(y, s, u ) & = & s + b + B y \\
\end{array}
\]
where
\mu / s
is the component by component division of
\mu
by the components of the
s
.
Note, from the second equation, that we need only consider
u \geq 0
because
s \geq 0
.
We use
D(s)
to denote the diagonal matrix with
s
along its diagonal and
1_r
to denote the vector, of length
r
with all its components
equal to one.
We define
F : \B{R}^{r + p + r} \rightarrow \B{R}^{r + p + r}
by
\[
F(s, u, y)
=
\left(
\begin{array}{c}
s + b + B y \\
D(s) D(u) 1_r - \mu 1_r\\
H y + B^\R{T} u + d
\end{array}
\right)
\]
The Kuhn-Tucker conditions for a solution of the
\mu
-relaxed constrained affine Kalman-Bucy smoother problem are
F(s, u, y) = 0
.
Given a point
(s, u, y)
, the Newton step
( \Delta s^\R{T} , \Delta u^\R{T} , \Delta y^\R{T} )^\R{T}
solves the problem:
\[
F^{(1)} (s, u, y)
\left( \begin{array}{c} \Delta s \\ \Delta u \\ \Delta y \end{array} \right)
=
- F(s, u, y)
\]
mu
is a positive scalar specifying the
relaxation parameter
\mu
.
s
is a column vector of length
r
.
All the elements of
s
are greater than zero.
y
is a column vector of length
p
u
is a column vector of length
r
.
All the elements of
u
are greater than zero.
b
is a column vector of length
r
.
d
is a column vector of length
p
Bdia
is an
m \times n \times N
array.
For
k = 1 , \ldots , N
we define
B_k \in \B{R}^{m \times n}
by
\[
B_k = Bdia(:, :, k)
\]
B
is defined by
\[
B
=
\left( \begin{array}{cccc}
B_1 & 0 & 0 & \\
0 & B_2 & 0 & 0 \\
0 & 0 & \ddots & 0 \\
& 0 & 0 & B_N
\end{array} \right)
\]
Hdia
is an
n \times n \times N
array.
For
k = 1 , \ldots , N
we define
H_k \in \B{R}^{n \times n}
by
\[
H_k = Hdia(:, :, k)
\]
Hlow
is an
n \times n \times N
array.
For
k = 1 , \ldots , N
we define
L_k \in \B{R}^{n \times n}
by
\[
L_k = Hlow(:, :, k)
\]
H
is defined by
\[
H
=
\left( \begin{array}{cccc}
H_1 & L_2^\R{T} & 0 & \\
L_2 & H_2 & L_3^\R{T} & 0 \\
0 & \ddots & \ddots & \ddots \\
& 0 & L_N & H_N
\end{array} \right)
\]
ds
is a column vector of length
r
equal to the
\Delta s
components of the Newton step.
dy
is a column vector of length
p
equal to the
\Delta y
components of the Newton step.
du
is a column vector of length
r
equal to the
\Delta u
components of the Newton step.
The following function is an example and test of ckbs_newton_step.
It returns true if ckbs_newton_step
passes the test
and false otherwise.
The derivative of
F
is given by
\[
F^{(1)} (s, u, y) =
\left(
\begin{array}{ccc}
D( 1_r ) & 0 & B \\
D( u ) & D( s ) & 0 \\
0 & B^\R{T} & H
\end{array}
\right)
\]
Writing
S = D(s)
,
U = D(u)
, and
C = H
, it follows that
\[
\left(\begin{array}{ccc}
I & 0 & B \\
U & S & 0\\
0 & B^T & C \\
\end{array}\right)
\left(\begin{array}{c}
\Delta s \\ \Delta u \\ \Delta y
\end{array}\right)
=
-
\left(\begin{array}{c}
s + b + By \\
SU\B{1} - \mu\B{1}\\
Cy + B^Tu + d
\end{array}\right)\;.
\]
Below,
r_i
refers to row
i
.
Applying the row operations
\[
\begin{array}{ccc}
r_2 &=& r_2 - U r_1 \\
r_3 &=& r_3 - B^T S^{-1} r_2
\end{array}\;,
\]
we obtain the equivalent system
\[
\left(\begin{array}{ccc}
I & 0 & B \\
0 & S & -UB\\
0 & 0 & C + B^T S^{-1} U B
\end{array}\right)
\left(\begin{array}{c}
\Delta s \\ \Delta u \\ \Delta y
\end{array}\right)
=
-
\left(\begin{array}{c}
s + b + B y \\
-U(b + B y) - \mu\B{1}\\
Cy + B^T u + d + B^T S^{-1}
\left(U(b + B y) + \mu \B{1}
\right)
\end{array}\right)\;.
\]
\Delta y
is obtained
from a single block tridiagonal solve (see third row of system).
Then we immediately have
\[
\Delta u = US^{-1}(b + B(y + \Delta y)) + \frac{\mu}{s}
\]
and
\[
\Delta s = -s -b -B(y + \Delta y)\;.
\]
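These three formulas can be checked against a dense solve of the full Newton system. The following NumPy sketch (ours, not part of ckbs) does so for a small random problem, writing H for the matrix called C above; the elimination and the dense solve produce the same step:

```python
import numpy as np

rng = np.random.default_rng(42)
p, r = 6, 3                         # p = n * N, r = m * N
mu = 0.5
M = rng.standard_normal((p, p))
H = M @ M.T + 4 * np.eye(p)         # symmetric positive definite
B = rng.standard_normal((r, p))
s = rng.random(r) + 1.0             # s > 0
u = rng.random(r) + 1.0             # u > 0
y = rng.standard_normal(p)
b = rng.standard_normal(r)
d = rng.standard_normal(p)

# Kuhn-Tucker residual F(s, u, y) and dense Newton step
F = np.concatenate([s + b + B @ y, s * u - mu, H @ y + B.T @ u + d])
J = np.block([
    [np.eye(r),        np.zeros((r, r)), B],
    [np.diag(u),       np.diag(s),       np.zeros((r, p))],
    [np.zeros((p, r)), B.T,              H],
])
ds0, du0, dy0 = np.split(np.linalg.solve(J, -F), [r, 2 * r])

# The same step from the elimination formulas above
rhs = -(H @ y + B.T @ u + d + B.T @ ((u * (b + B @ y) + mu) / s))
dy = np.linalg.solve(H + B.T @ ((u / s)[:, None] * B), rhs)
du = (u / s) * (b + B @ (y + dy)) + mu / s
ds = -s - b - B @ (y + dy)
```

In ckbs the solve for dy is the block tridiagonal solve, since H is block tridiagonal and B is block diagonal; the dense `np.linalg.solve` here only stands in for it.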
function [ok] = newton_step_ok()
ok = true;
% --------------------------------------------------------
% You can change these parameters
m = 2; % number of measurements per time point
n = 3; % number of state vector components per time point
N = 4; % number of time points
% ---------------------------------------------------------
% Define the problem
rand('seed', 123);
mu = .5;
r = m * N;
p = n * N;
s = rand(r, 1) + 1;
y = rand(p, 1);
u = rand(r, 1) + 1;
b = rand(r, 1);
d = rand(p, 1);
H = zeros(p, p);
B = zeros(r, r);
Hdia = zeros(n, n, N);
Hlow = zeros(n, n, N);
Bdia = rand(m, n, N);
for k = N : -1 : 1
tmp = rand(n, n);
Hlow(:,:, k) = tmp;
Hdia(:,:, k) = (tmp * tmp') + 4 * eye(n);
end
% Form full matrices
B = ckbs_blkdiag_mul(Bdia, speye(p));
H = ckbs_blktridiag_mul(Hdia, Hlow, speye(p));
% -------------------------------------------------------------------
[ds, dy, du] = ckbs_newton_step(mu, s, y, u, b, d, Bdia, Hdia, Hlow);
% -------------------------------------------------------------------
F = ckbs_kuhn_tucker(mu, s, y, u, b, d, Bdia, Hdia, Hlow);
dF = [ ...
eye(r) , B , zeros(r, r)
zeros(p, r) , H , B'
diag(u) , zeros(r, p) , diag(s) ...
];
delta = [ ds ; dy ; du ];
ok = ok & ( max( abs( F + dF * delta ) ) <= 1e-10 );
return
end
[dPPlus, dPMinus, dR, dS, dY] = ckbs_newton_step_L1(
mu, s, y, r, b, d, Bdia, Hdia, Hlow,
pPlus, pMinus)
This routine computes the Newton step for the
\mu
-relaxed affine robust L1 Kalman-Bucy smoother problem.
Given
\mu \in \B{R}_+
,
s \in \B{R}^{m \times N}
,
y \in \B{R}^{n \times N}
,
r \in \B{R}^{m \times N}
,
b \in \B{R}^{m \times N}
,
d \in \B{R}^{n \times N}
,
B \in \B{R}^{m \times n \times N}
,
H \in \B{R}^{n \times n \times N}
,
p^+ \in \B{R}^{m \times N}
,
p^- \in \B{R}^{m \times N}
,
the
\mu
-relaxed affine L1 robust Kalman-Bucy smoother problem is:
\[
\begin{array}{ll}
{\rm minimize}
& \frac{1}{2} y^\R{T} H y + d^\R{T} y
+ \B{\sqrt{2}}^\R{T} (p^+ + p^-)
- \mu \sum_{i=1}^{mN} \log(p_i^+) - \mu \sum_{i=1}^{mN} \log(p_i^-)
\;{\rm w.r.t} \; y \in \B{R}^{nN}\; , \; p^+ \in \B{R}_+^{mN} \; , \; p^- \in \B{R}_+^{mN}
\\
{\rm subject \; to} & b + B y - p^+ + p^- = 0
\end{array}
\]
In addition,
H
is symmetric block tri-diagonal with each block of
size
n \times n
and
B
is block diagonal with each block of size
m \times n
.
We use
r, \; s \in \B{R}^{m \times N}
to denote
\mu /p^+\;,\; \mu/p^-
, respectively, and
we denote by
q
the Lagrange multiplier associated with the
equality constraint. We also use
\B{1}
to denote the vector of length
mN
with all its components
equal to one, and
\B{\sqrt{2}}
to denote the vector of
length
mN
with all its components equal to
\sqrt{2}
.
The corresponding Lagrangian is
\[
L(p^+, p^-, y, q) =
\frac{1}{2} y^\R{T} H y + d^\R{T} y + \B{\sqrt{2}}^T(p^+ + p^-)
- \mu \sum_{i=1}^{mN} \log(p_i^+) - \mu\sum_{i=1}^{mN}\log(p_i^-)
+ q^\R{T} (b + B y - p^+ + p^-)
\]
The partial gradients of the Lagrangian are given by
\[
\begin{array}{rcl}
\nabla_p^+ L(p^+, p^-, y, q ) & = & \B{\sqrt{2}} - q - r \\
\nabla_p^- L(p^+, p^-, y, q) & = & \B{\sqrt{2}} + q - s \\
\nabla_y L(p^+, p^-, y, q ) & = & H y + d + B^\R{T} q \\
\nabla_q L(p^+, p^-, y, q ) & = & b + B y - p^+ + p^- \\
\end{array}
\]
From the first two of the above equations,
we have
q = (s - r)/2
.
We use
D(s)
to denote the diagonal matrix with
s
along its diagonal.
The Kuhn-Tucker Residual function
F : \B{R}^{4mN + nN} \rightarrow \B{R}^{4mN + nN}
is defined by
\[
F(p^+, p^-, r, s, y)
=
\left(
\begin{array}{c}
p^+ - p^- - b - B y \\
D(p^-) D(s) \B{1} - \mu \B{1} \\
r + s - 2 \B{\sqrt{2}} \\
D(p^+) D(r ) \B{1} - \mu \B{1} \\
H y + d + B^\R{T} (s - r)/2
\end{array}
\right)
\]
The Kuhn-Tucker conditions for a solution of the
\mu
-relaxed constrained affine Kalman-Bucy smoother problem are
F(p^+, p^-, r, s, y) = 0
.
Given a point
(p^+, p^-, r, s, y)
, the Newton step
( (\Delta p^+)^\R{T} , (\Delta p^-)^\R{T} , \Delta
r^\R{T}, \Delta s^\R{T}, \Delta y^\R{T} )^\R{T}
solves the problem:
\[
F_\mu^{(1)} (p^+, p^-, r, s, y)
\left( \begin{array}{c} \Delta p^+ \\ \Delta p^- \\ \Delta r \\
\Delta s \\ \Delta y \end{array} \right)
=
- F(p^+, p^-, r, s, y)
\]
mu
is a positive scalar specifying the
relaxation parameter
\mu
.
s
is an array of size
m \times
N
.
All the elements of
s
are greater than zero.
y
is an array of size
n \times
N
r
is an array of size
m \times
N
. All the elements of
r
are greater than zero.
b
is an array of size
m \times N
.
d
is an array of size
n \times
N
.
Bdia
is an
m \times n \times N
array.
For
k = 1 , \ldots , N
we define
B_k \in \B{R}^{m \times n}
by
\[
B_k = Bdia(:, :, k)
\]
B
is defined by
\[
B
=
\left( \begin{array}{cccc}
B_1 & 0 & 0 & \\
0 & B_2 & 0 & 0 \\
0 & 0 & \ddots & 0 \\
& 0 & 0 & B_N
\end{array} \right)
\]
Hdia
is an
n \times n \times N
array.
For
k = 1 , \ldots , N
we define
H_k \in \B{R}^{n \times n}
by
\[
H_k = Hdia(:, :, k)
\]
Hlow
is an
n \times n \times N
array.
For
k = 1 , \ldots , N
we define
L_k \in \B{R}^{n \times n}
by
\[
L_k = Hlow(:, :, k)
\]
H
is defined by
\[
H
=
\left( \begin{array}{cccc}
H_1 & L_2^\R{T} & 0 & \\
L_2 & H_2 & L_3^\R{T} & 0 \\
0 & \ddots & \ddots & \ddots \\
& 0 & L_N & H_N
\end{array} \right)
\]
dPPlus
is an array of size
m \times N
equal to the
\Delta p^+
components of the Newton step.
dPMinus
is an array of size
m \times N
equal to the
\Delta p^-
components of the Newton step.
dR
is an array of size
m \times N
equal to the
\Delta r
components of the Newton step.
dS
is an array of size
m \times N
equal to the
\Delta s
components of the Newton step.
dY
is an array of size
n \times N
equal to the
\Delta y
components of the Newton step.
The following function is an example and test of ckbs_newton_step_L1.
It returns true if ckbs_newton_step_L1
passes the test
and false otherwise.
The derivative of
F
is given by
\[
F_\mu^{(1)} (p^+, p^-, r, s, y) =
\left(
\begin{array}{ccccc}
I & -I & 0 & 0 & -B \\
0 & D( s ) & 0 & D(p^-) & 0 \\
0 & 0 & I & I & 0 \\
D( r ) & 0 & D( p^+ ) & 0 & 0 \\
0 & 0 & - 0.5 B^\R{T} & 0.5 B^\R{T} & H
\end{array}
\right)
\]
Below we write
s^+
for
r
,
s^-
for
s
,
C
for
H
, and
c
for
d
.
Given the inputs
p^+ , p^-, s^+ , s^- , y , B , b , C
and
c
,
the following algorithm solves the Newton system for
\Delta p^+ , \Delta p^- , \Delta s^+ , \Delta s^-
,
and
\Delta y
:
\[
\begin{array}{cccccc}
\bar{d} &= & \mu \B{1} / s^+ - \mu \B{1} / s^- - b - B y + p^+
\\
\bar{e} &= & B^\R{T} ( \sqrt{\B{2}} - s^- ) - C y - c
\\
\bar{f} &= & \bar{d} - D( s^+ )^{-1} D( p^+ ) ( 2 \sqrt{\B{2}} - s^- )
\\
T &= & D( s^+ )^{-1} D( p^+ ) + D( s^- )^{-1} D( p^- )
\\
\Delta y &= &
[ C + B^\R{T} T^{-1} B ]^{-1} ( \bar{e} + B^\R{T} T^{-1} \bar{f} )
\\
\Delta s^- &= &
T^{-1} B \Delta y - T^{-1} \bar{f}
\\
\Delta s^+ &= &
- \Delta s^- + 2 \sqrt{\B{2}} - s^+ - s^-
\\
\Delta p^- &= &
D( s^- )^{-1} [ \mu \B{1} - D( p^- ) \Delta s^- ] - p^-
\\
\Delta p^+ &= &
\Delta p^- + B \Delta y + b + B y - p^+ + p^-
\end{array}
\]
where the matrix
T
is diagonal with positive elements,
and the matrix
C + B^\R{T} T^{-1} B \in \B{R}^{N n \times N n }
is
block tridiagonal positive definite.
function [ok] = newton_step_L1_ok()
ok = true;
% --------------------------------------------------------
% You can change these parameters
m = 2; % number of measurements per time point
n = 3; % number of state vector components per time point
N = 4; % number of time points
% ---------------------------------------------------------
% Define the problem
rand('seed', 123);
mu = .5;
s = rand(m, N);
y = rand(n, N);
r = rand(m, N);
b = rand(m, N);
d = rand(n, N);
Hdia = zeros(n, n, N);
Hlow = zeros(n, n, N);
Bdia = zeros(m, n, N);
pPlus = rand(m, N);
pMinus = rand(m, N);
for k = N : -1 : 1
Bdia(:,:, k) = rand(m, n);
tmp = rand(n, n);
Hlow(:,:, k) = tmp;
Hdia(:,:, k) = (tmp * tmp') + 4 * eye(n);
%
end
% Form full matrices.
H = ckbs_blktridiag_mul(Hdia, Hlow, speye(n*N));
B = ckbs_blkdiag_mul(Bdia, speye(n*N));
% -------------------------------------------------------------------
[dPPlus, dPMinus, dR, dS, dY] = ckbs_newton_step_L1(mu, s, y, r, b, d, Bdia, Hdia, Hlow, pPlus, pMinus);
% -------------------------------------------------------------------
F =ckbs_kuhn_tucker_L1(mu, s, y, r, b, d, Bdia, Hdia, Hlow, ...
pPlus, pMinus);
dF = [ ...
speye(m*N) , -speye(m*N), zeros(m*N), zeros(m*N), -B
zeros(m*N) , diag(s(:)) , zeros(m*N), diag(pMinus(:)),zeros(m*N,n*N)
zeros(m*N) , zeros(m*N) , speye(m*N), speye(m*N),zeros(m*N,n*N)
diag(r(:)) , zeros(m*N) , diag(pPlus(:)), zeros(m*N),zeros(m*N,n*N)
zeros(n*N, m*N), zeros(n*N, m*N), -B'/2, B'/2, H
];
delta = [ dPPlus(:); dPMinus(:); dR(:); dS(:); dY(:) ];
ok = ok & ( max( abs( F + dF * delta ) ) <= 1e-10 );
return
end
[F] = ckbs_kuhn_tucker(
mu, s, y, u, b, d, Bdia, Hdia, Hlow)
This routine computes the Kuhn-Tucker residual for the
\mu
-relaxed affine constrained Kalman-Bucy smoother problem.
Given
\mu \in \B{R}_+
,
H \in \B{R}^{p \times p}
,
d \in \B{R}^p
,
b \in \B{R}^r
, and
B \in \B{R}^{r \times p}
,
the
\mu
-relaxed affine constrained Kalman-Bucy smoother problem is:
\[
\begin{array}{rl}
{\rm minimize} & \frac{1}{2} y^\R{T} H y + d^\R{T} y
- \mu \sum_{i=1}^r \log(s_i)
\; {\rm w.r.t} \; y \in \B{R}^p \; , \; s \in \B{R}_+^r
\\
{\rm subject \; to} & s + b + B y = 0
\end{array}
\]
In addition,
H
is symmetric block tri-diagonal with each block of
size
n \times n
and
B
is block diagonal with each block of size
m \times n
(there is an integer
N
such that
p = n * N
and
r = m * N
).
We use
u \in \B{R}^r
to denote the Lagrange multipliers corresponding to the constraint equation.
The corresponding Lagrangian is
\[
L(y, s, u) =
\frac{1}{2} y^\R{T} H y + d^\R{T} y
- \mu \sum_{i=1}^r \log(s_i)
+ u^\R{T} (s + b + B y)
\]
The partial gradients of the Lagrangian are given by
\[
\begin{array}{rcl}
\nabla_y L(y, s, u ) & = & H y + B^\R{T} u + d \\
\nabla_s L(y, s, u ) & = & u - \mu / s \\
\nabla_u L(y, s, u ) & = & s + b + B y \\
\end{array}
\]
where
\mu / s
is the component by component division of
\mu
by the components of the
s
.
Note, from the second equation, that we need only consider
u \geq 0
because
s \geq 0
.
We use
D(s)
to denote the diagonal matrix with
s
along its diagonal and
1_r
to denote the vector, of length
r
with all its components
equal to one.
The Kuhn-Tucker Residual function
F : \B{R}^{r + p + r} \rightarrow \B{R}^{r + p + r}
is defined by
\[
F(s, y, u)
=
\left(
\begin{array}{c}
s + b + B y \\
H y + B^\R{T} u + d \\
D(s) D(u) 1_r - \mu 1_r
\end{array}
\right)
\]
The Kuhn-Tucker conditions for a solution of the
\mu
-relaxed constrained affine Kalman-Bucy smoother problem are
F(s, y, u) = 0
.
mu
is a positive scalar specifying the
relaxation parameter
\mu
.
s
is a column vector of length
r
.
All the elements of
s
are greater than zero.
y
is a column vector of length
p
u
is a column vector of length
r
.
All the elements of
u
are greater than zero.
b
is a column vector of length
r
.
d
is a column vector of length
p
Bdia
is an
m \times n \times N
array.
For
k = 1 , \ldots , N
we define
B_k \in \B{R}^{m \times n}
by
\[
B_k = Bdia(:, :, k)
\]
B
is defined by
\[
B
=
\left( \begin{array}{cccc}
B_1 & 0 & 0 & \\
0 & B_2 & 0 & 0 \\
0 & 0 & \ddots & 0 \\
& 0 & 0 & B_N
\end{array} \right)
\]
Hdia
is an
n \times n \times N
array.
For
k = 1 , \ldots , N
we define
H_k \in \B{R}^{n \times n}
by
\[
H_k = Hdia(:, :, k)
\]
Hlow
is an
n \times n \times N
array.
For
k = 1 , \ldots , N
we define
L_k \in \B{R}^{n \times n}
by
\[
L_k = Hlow(:, :, k)
\]
H
is defined by
\[
H
=
\left( \begin{array}{cccc}
H_1 & L_2^\R{T} & 0 & \\
L_2 & H_2 & L_3^\R{T} & 0 \\
0 & \ddots & \ddots & \ddots \\
& 0 & L_N & H_N
\end{array} \right)
\]
F
is a column vector of length
r + p + r
containing the value of the
9.24.e: Kuhn-Tucker residual
; i.e.,
F(s, y, u)
.
The following function is an example and test of ckbs_kuhn_tucker.
It returns true if ckbs_kuhn_tucker
passes the test
and false otherwise.
function [ok] = kuhn_tucker_ok()
ok = true;
% -------------------------------------------------------------
% You can change these parameters
m = 5;
n = 4;
N = 3;
% -------------------------------------------------------------
rand('seed', 123);
p = n * N;
r = m * N;
mu = 1.5;
s = rand(r, 1);
y = rand(p, 1);
u = rand(r, 1);
b = rand(r, 1);
d = rand(p, 1);
Bdia = rand(m, n, N);
Hdia = rand(n, n, N);
Hlow = rand(n, n, N);
By = ckbs_blkdiag_mul(Bdia, y);
Btu = ckbs_blkdiag_mul_t(Bdia, u);
Hy = ckbs_blktridiag_mul(Hdia, Hlow, y);
% -------------------------------------
F = ckbs_kuhn_tucker(mu, s, y, u, b, d, Bdia, Hdia, Hlow);
% -------------------------------------
check = [ ...
s + b + By; ...
Hy + Btu + d; ...
s .* u - mu ...
];
ok = ok & ( max(abs(F - check)) < 1e-10 );
return
end
[F] = ckbs_kuhn_tucker_L1(
mu, s, y, r, b, d, Bdia, Hdia, Hlow,
pPlus, pMinus)
This routine computes the Kuhn-Tucker residual for the
\mu
-relaxed affine L1 robust smoother problem.
Given
\mu \in \B{R}_+
,
s \in \B{R}^{m \times N}
,
y \in \B{R}^{n \times N}
,
r \in \B{R}^{m \times N}
,
b \in \B{R}^{m \times N}
,
d \in \B{R}^{n \times N}
,
B \in \B{R}^{m \times n \times N}
,
H \in \B{R}^{n \times n \times N}
,
p^+ \in \B{R}^{m \times N}
,
p^- \in \B{R}^{m \times N}
,
the
\mu
-relaxed affine L1 robust Kalman-Bucy smoother problem is:
\[
\begin{array}{ll}
{\rm minimize}
& \frac{1}{2} y^\R{T} H y + d^\R{T} y
+ \B{\sqrt{2}}^\R{T} (p^+ + p^-)
- \mu \sum_{i=1}^{mN} \log(p_i^+) - \mu \sum_{i=1}^{mN} \log(p_i^-)
\;{\rm w.r.t} \; y \in \B{R}^{nN}\; , \; p^+ \in \B{R}_+^{mN} \; , \; p^- \in \B{R}_+^{mN}
\\
{\rm subject \; to} & b + B y - p^+ + p^- = 0
\end{array}
\]
In addition,
H
is symmetric block tri-diagonal with each block of
size
n \times n
and
B
is block diagonal with each block of size
m \times n
.
We use
r, \; s \in \B{R}^{m \times N}
to denote
\mu /p^+\;,\; \mu/p^-
, respectively, and
we denote by
q
the Lagrange multiplier associated with the
equality constraint. We also use
\B{1}
to denote the vector of length
mN
with all its components
equal to one, and
\B{\sqrt{2}}
to denote the vector of
length
mN
with all its components equal to
\sqrt{2}
.
The corresponding Lagrangian is
\[
L(p^+, p^-, y, q) =
\frac{1}{2} y^\R{T} H y + d^\R{T} y + \B{\sqrt{2}}^T(p^+ + p^-)
- \mu \sum_{i=1}^{mN} \log(p_i^+) - \mu\sum_{i=1}^{mN}\log(p_i^-)
+ q^\R{T} (b + B y - p^+ + p^-)
\]
The partial gradients of the Lagrangian are given by
\[
\begin{array}{rcl}
\nabla_p^+ L(p^+, p^-, y, q ) & = & \B{\sqrt{2}} - q - r \\
\nabla_p^- L(p^+, p^-, y, q) & = & \B{\sqrt{2}} + q - s \\
\nabla_y L(p^+, p^-, y, q ) & = & H y + d + B^\R{T} q \\
\nabla_q L(p^+, p^-, y, q ) & = & b + B y - p^+ + p^- \\
\end{array}
\]
From the first two of the above equations,
we have
q = (s - r)/2
.
We use
D(s)
to denote the diagonal matrix with
s
along its diagonal.
The Kuhn-Tucker Residual function
F : \B{R}^{4mN + nN} \rightarrow \B{R}^{4mN + nN}
is defined by
\[
F(p^+, p^-, r, s, y)
=
\left(
\begin{array}{c}
p^+ - p^- - b - B y \\
D(p^-) D(s) \B{1} - \mu \B{1} \\
r + s - 2 \B{\sqrt{2}} \\
D(p^+) D(r ) \B{1} - \mu \B{1} \\
H y + d + B^\R{T} (s - r)/2
\end{array}
\right)
\]
The Kuhn-Tucker conditions for a solution of the
\mu
-relaxed constrained affine Kalman-Bucy smoother problem are
F(p^+, p^-, r, s, y) = 0
; see Equation (13) in
13.a: Aravkin et al 2009
mu
is a positive scalar specifying the
relaxation parameter
\mu
.
s
is an array of size
m \times
N
.
All the elements of
s
are greater than zero.
y
is an array of size
n \times
N
r
is an array of size
m \times
N
. All the elements of
r
are greater than zero.
b
is an array of size
m \times N
.
d
is an array of size
n \times
N
.
Bdia
is an
m \times n \times N
array.
For
k = 1 , \ldots , N
we define
B_k \in \B{R}^{m \times n}
by
\[
B_k = Bdia(:, :, k)
\]
B
is defined by
\[
B
=
\left( \begin{array}{cccc}
B_1 & 0 & 0 & \\
0 & B_2 & 0 & 0 \\
0 & 0 & \ddots & 0 \\
& 0 & 0 & B_N
\end{array} \right)
\]
Hdia
is an
n \times n \times N
array.
For
k = 1 , \ldots , N
we define
H_k \in \B{R}^{n \times n}
by
\[
H_k = Hdia(:, :, k)
\]
Hlow
is an
n \times n \times N
array.
For
k = 1 , \ldots , N
we define
L_k \in \B{R}^{n \times n}
by
\[
L_k = Hlow(:, :, k)
\]
H
is defined by
\[
H
=
\left( \begin{array}{cccc}
H_1 & L_2^\R{T} & 0 & \\
L_2 & H_2 & L_3^\R{T} & 0 \\
0 & \ddots & \ddots & \ddots \\
& 0 & L_N & H_N
\end{array} \right)
\]
F
is a column vector of length
4mN + nN
containing the value of the
Kuhn-Tucker L1 residual; i.e.,
F(p^+, p^-, r, s, y)
.
The following function is an example and test of ckbs_kuhn_tucker_L1.
It returns true if ckbs_kuhn_tucker_L1
passes the test
and false otherwise.
function [ok] = kuhn_tucker_L1_ok()
ok = true;
% -------------------------------------------------------------
% You can change these parameters
m = 5;
n = 4;
N = 3;
% -------------------------------------------------------------
rand('seed', 123);
%r = m * N;
mu = 1.5;
s = rand(m, N);
y = rand(n, N);
r = rand(m, N);
b = rand(m, N);
d = rand(n, N);
Bdia = rand(m, n, N);
Hdia = rand(n, n, N);
Hlow = rand(n, n, N);
pPlus = rand(m, N);
pMinus = rand(m, N);
% -------------------------------------
F =ckbs_kuhn_tucker_L1(mu, s, y, r, b, d, Bdia, Hdia, Hlow, pPlus, pMinus);
% -------------------------------------
yVec = y(:);
dVec = d(:);
pPlusVec = pPlus(:);
pMinusVec = pMinus(:);
rVec = r(:);
sVec = s(:);
bVec = b(:);
Hy = ckbs_blktridiag_mul(Hdia, Hlow, y(:));
Bt_SmR = ckbs_blkdiag_mul_t(Bdia, s(:)-r(:));
By = ckbs_blkdiag_mul(Bdia, y(:));
check = [ ...
pPlusVec - pMinusVec - bVec - By; ...
pMinusVec.*sVec - mu;...
rVec + sVec - 2*sqrt(2); ...
pPlusVec.*rVec - mu; ...
Hy + dVec + 0.5*Bt_SmR; ...
];
ok = ok & ( max(abs(F - check)) < 1e-10 );
return
end
all_ok
10.1: test_path.m | Set Up Path for Testing |
function [ok] = all_ok()
test_path;
ok = true;
t0 = cputime;
ok = ok & one_ok('t_obj_ok');
ok = ok & one_ok('t_grad_ok');
ok = ok & one_ok('affine_ok_box');
ok = ok & one_ok('bidiag_solve_ok');
ok = ok & one_ok('bidiag_solve_t_ok');
ok = ok & one_ok('blkbidiag_symm_mul_ok');
ok = ok & one_ok('blkbidiag_mul_ok');
ok = ok & one_ok('blkbidiag_mul_t_ok');
ok = ok & one_ok('t_general_ok');
ok = ok & one_ok('L1_affine_ok');
ok = ok & one_ok('L1_nonlinear_ok');
ok = ok & one_ok('blkdiag_mul_ok');
ok = ok & one_ok('blkdiag_mul_t_ok');
ok = ok & one_ok('blktridiag_mul_ok');
ok = ok & one_ok('kuhn_tucker_ok');
ok = ok & one_ok('kuhn_tucker_L1_ok');
ok = ok & one_ok('newton_step_ok');
ok = ok & one_ok('newton_step_L1_ok');
ok = ok & one_ok('get_started_ok');
ok = ok & one_ok('sumsq_grad_ok');
ok = ok & one_ok('process_grad_ok');
ok = ok & one_ok('sumsq_hes_ok');
ok = ok & one_ok('process_hes_ok');
ok = ok & one_ok('sumsq_obj_ok');
ok = ok & one_ok('L2L1_obj_ok');
ok = ok & one_ok('tridiag_solve_ok');
ok = ok & one_ok('tridiag_solve_b_ok');
ok = ok & one_ok('tridiag_solve_pcg_ok');
ok = ok & one_ok('vanderpol_sim_ok');
ok = ok & one_ok('vanderpol_ok');
constraint={'no_constraint', 'box_constraint', 'sine_constraint'};
for c = 1 : 3
if sine_wave_ok(constraint{c}, false)
['Ok: sine_wave_ok ', constraint{c}]
else
['Error: sine_wave_ok ', constraint{c}]
ok = false
end
end
%
t1 = cputime;
if ok
disp(['All tests passed: cputime (secs) = ', num2str(t1-t0)]);
else
disp(['One or more tests failed: cputime (secs) = ', num2str(t1-t0)]);
end
return
end
function [ok] = one_ok(name)
ok = feval(name);
if ok
['Ok: ', name ]
else
['Error: ', name ]
end
return
end
test_path
../src
is added to the current path setting.
This is needed when running the tests in the test
subdirectory.
function test_path()
addpath('../src')
addpath('nonlinear')
end
The following changes were merged from the aravkin
branch into the trunk:
----------------------------------------------------------------
Given a block tridiagonal matrix
A
,
the new routine computes the block diagonal
of
A^{-1}
.
The work to compute this scales
linearly with the number of time points.
This routine is useful for obtaining
the a posteriori variance estimate from
a kalman smoothing fit.
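As a rough illustration of what such a routine returns (this is not ckbs code, and the matrix sizes and values below are made up), here is the slow dense way to read off the block diagonal of A^{-1} with NumPy. The point of the structured routine is to produce the same blocks in time linear in the number of time points, without ever forming A^{-1}:

```python
# Illustrative sketch: block diagonal of A^{-1} for a small symmetric
# block tridiagonal A, computed the slow dense way as a reference.
import numpy as np

n, N = 2, 4                      # block size, number of time points (made up)
rng = np.random.default_rng(0)

# assemble a symmetric, diagonally dominant block tridiagonal A
diag_blk = [np.eye(n) * 5.0 for _ in range(N)]
off_blk  = [0.3 * rng.standard_normal((n, n)) for _ in range(N - 1)]
A = np.zeros((n * N, n * N))
for k in range(N):
    A[k*n:(k+1)*n, k*n:(k+1)*n] = diag_blk[k]
for k in range(N - 1):
    A[(k+1)*n:(k+2)*n, k*n:(k+1)*n] = off_blk[k]
    A[k*n:(k+1)*n, (k+1)*n:(k+2)*n] = off_blk[k].T

# dense reference: invert A and read off the diagonal blocks; the k-th
# block is the a posteriori variance of the state at time point k
Ainv = np.linalg.inv(A)
var_blocks = [Ainv[k*n:(k+1)*n, k*n:(k+1)*n] for k in range(N)]
```

The dense inverse costs O(N^3) in the number of time points; the structured recursion replaces it with O(N) work.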
Added ckbs_tridiag_solve_b to the src
directory and
9.20.1: tridiag_solve_b_ok.m
to the example
directory.
This routine is an alternate version of
9.19: ckbs_tridiag_solve
, which is currently used
by ckbs.
The new routine is used by the robust smoother, and in some
instances it may be more stable than the standard version.
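For readers unfamiliar with the recursion involved, here is a minimal sketch of a block tridiagonal solve (block forward elimination followed by back substitution) in Python/NumPy. This is not the ckbs implementation; the backward ("_b") variant runs the analogous recursion starting from the last time point instead of the first:

```python
# Illustrative sketch of a symmetric block tridiagonal solve.
import numpy as np

def blktridiag_solve(b_diag, c_off, r):
    """Solve A x = r where A has diagonal blocks b_diag[k],
    sub-diagonal blocks c_off[k] at (k+1, k), and c_off[k].T above
    the diagonal. r is a list of right-hand-side vectors."""
    N = len(b_diag)
    # forward elimination: reduce to d[k] x[k] + c_off[k].T x[k+1] = s[k]
    d = [None] * N
    s = [None] * N
    d[0] = b_diag[0]
    s[0] = r[0]
    for k in range(1, N):
        gain = c_off[k-1] @ np.linalg.inv(d[k-1])
        d[k] = b_diag[k] - gain @ c_off[k-1].T
        s[k] = r[k] - gain @ s[k-1]
    # back substitution
    x = [None] * N
    x[N-1] = np.linalg.solve(d[N-1], s[N-1])
    for k in range(N-2, -1, -1):
        x[k] = np.linalg.solve(d[k], s[k] - c_off[k].T @ x[k+1])
    return x
```

Each step factors only one n-by-n block, so the total work grows linearly with the number of time points rather than cubically with the full matrix dimension.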
Added ckbs_blktridiag_mul to the src
directory and
9.12.1: blktridiag_mul_ok.m
to the example
directory.
This utility routine is used by the robust
smoother, and can also be used to quickly
convert a block tridiagonal matrix from compact
to explicit form via multiplication with the identity.
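A minimal sketch of the compact-times-vector idea (illustrative Python/NumPy, not the ckbs routine): the matrix is stored only by its diagonal and off-diagonal blocks, and multiplying by the identity recovers the explicit form, as noted above:

```python
# Illustrative sketch: multiply a block tridiagonal matrix in compact
# (block list) storage times a vector or matrix v.
import numpy as np

def blktridiag_mul(b_diag, c_off, v):
    """y = A v, with A diagonal blocks b_diag[k], sub-diagonal blocks
    c_off[k] at (k+1, k), and c_off[k].T above the diagonal."""
    N = len(b_diag)
    n = b_diag[0].shape[0]
    y = np.zeros_like(v, dtype=float)
    for k in range(N):
        rows = slice(k * n, (k + 1) * n)
        y[rows] += b_diag[k] @ v[rows]
        if k > 0:
            y[rows] += c_off[k-1] @ v[(k-1)*n : k*n]
        if k < N - 1:
            y[rows] += c_off[k].T @ v[(k+1)*n : (k+2)*n]
    return y
```

For example, `blktridiag_mul(b_diag, c_off, np.eye(n * N))` returns the explicit dense matrix, which is handy for checking the compact storage against a reference.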
----------------------------------------------------------------
Moved the get_started_ok.m, nl_measure_ok.m, and
nonlinear_ok_sin.m examples from the example
directory to the example/nonlinear
directory.
Created example/nonlinear
and moved
nonlinear_ok_simple.m
to nonlinear/get_started_ok.m
, and
nonlinear_ok_box.m
to nonlinear/nl_measure_ok.m
.
Improve the 4.1: get_started_ok.m
and nl_measure_ok.m
implementation and documentation.
cd example
octave
vanderpol_ok(true)
You can replace octave by matlab.
The argument true instructs vanderpol_ok
to plot the results.
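For reference, the Van der Pol dynamics fit by this example are x1' = x2, x2' = mu*(1 - x1^2)*x2 - x1. The following noise-free simulation sketch is illustrative Python with made-up parameter values, not vanderpol_sim itself:

```python
# Illustrative sketch: simulate the Van der Pol oscillator
#   x1' = x2
#   x2' = mu * (1 - x1^2) * x2 - x1
# with a simple Euler discretization (mu, x0, dt are assumptions).
import numpy as np

def vanderpol_euler(mu, x0, dt, n_steps):
    x = np.array(x0, dtype=float)
    traj = [x.copy()]
    for _ in range(n_steps):
        dx = np.array([x[1], mu * (1.0 - x[0] ** 2) * x[1] - x[0]])
        x = x + dt * dx              # one Euler step
        traj.append(x.copy())
    return np.array(traj)            # shape (n_steps + 1, 2)

traj = vanderpol_euler(mu=2.0, x0=[0.0, -0.5], dt=0.01, n_steps=1000)
```

A trajectory like this, with simulated measurement noise added, is the kind of data the nonlinear smoother is asked to recover in the vanderpol_ok example.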
Moved affine_line_search_ok.m from the test
directory to the example directory;
this test demonstrated a bug in 6: ckbs_affine
related to
\alpha
, which is included in the trace and info return.
In addition, the ckbs_affine
return value 6.r: uOut
was a three-dimensional array instead of a matrix.
This has been fixed.
nl_measure_ok.m.
Improve the documentation of the initial state estimate in
4.g: g_fun
and
4.m: qinv
for ckbs_nonlinear
.
The ordering of the state vector for nl_measure_ok.m
and nonlinear_ok_sin.m
was changed
to group each component of velocity with the corresponding position
(instead of grouping position with position and velocity with velocity).
This made the variance matrix
Q_k
block diagonal
(simpler).
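The effect of that reordering can be seen on a small made-up covariance (illustrative Python; the coupling value is an assumption, not from the examples): with positions grouped together, a matrix that couples each position only with its own velocity is not block diagonal, while regrouping each position with its velocity makes it so:

```python
# Illustrative sketch of the state reordering described above.
import numpy as np

c = 0.3                          # made-up position/velocity coupling
# old ordering: [pos1, pos2, vel1, vel2]
Q_old = np.array([
    [1.0, 0.0, c,   0.0],
    [0.0, 1.0, 0.0, c  ],
    [c,   0.0, 1.0, 0.0],
    [0.0, c,   0.0, 1.0],
])
perm = [0, 2, 1, 3]              # new ordering: [pos1, vel1, pos2, vel2]
Q_new = Q_old[np.ix_(perm, perm)]
# Q_new is 2x2 block diagonal: the (pos1, vel1) block is decoupled
# from the (pos2, vel2) block
```

Block diagonal Q_k keeps the smoother's block tridiagonal structure simple, since each position/velocity pair can be treated independently in the process model.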
There was an error in the ckbs_affine
6.a: syntax
documentation.
This has been fixed.
The info return of ckbs_nonlinear
no longer has a penalty parameter; i.e., it has one less value per row.
10: all_ok.m
was extended so that if no argument is provided,
only the quick tests are run.
The meaning of the arguments and return values of ckbs_affine
has changed.
The 6.1: affine_ok_box.m
example has been changed to use the new
meaning for the arguments and return values of ckbs_affine
.
The itr_out return value has been removed from the
4: ckbs_nonlinear
syntax.
It was the same as size(info, 1).
The following examples have been changed to remove the use of itr_out
in the calls to ckbs_nonlinear
:
4.1: get_started_ok.m
,
nl_measure_ok.m
,
nonlinear_ok_sin.m
.
The implementation of 4: ckbs_nonlinear
has been changed to use
the new calling sequence to 6: ckbs_affine
(this is an implementation
detail and does not affect the user).
There was a syntax error in 4.1: get_started_ok.m
that
caused a problem in Matlab®, but not in Octave.
This has been fixed.
ckbs_nonlinear.m
was removed to avoid this.
example/nonlinear/nl_measure_ok.m
and
example/nonlinear_ok_sin.m
were fixed.
ckbs
instead of ckbs-yy-mm-dd.
This has been fixed.
nl_measure_ok.m
example.
nonlinear_ok_sin.m
was added.
It demonstrates the significant improvement obtained by
including the constraints.
The nl_measure_ok.m
example
was documented.
example/nonlinear_ok_sin.r
and
example/nonlinear/nl_measure_ok.r
were
added to the distribution.
(These are R source code files that are used to make plots from the
corresponding example output files.)
The section name ckbs
has been changed to
4: ckbs_nonlinear
. This enables the
: ckbs
section to refer to the
documentation for the entire system.
\ell_1-Laplace Robust Kalman Smoother, to appear in IEEE TAC in November, 2011.
A | |
6.1: affine_ok_box.m | ckbs_affine Box Constrained Smoothing Spline Example and Test |
7.1: affine_singular_ok.m | ckbs_affine_singular Singular Smoothing Spline Example and Test |
10: all_ok.m | Run All Correctness Tests |
B | |
13: bib | Bibliography |
9.5.1: bidiag_solve_ok.m | ckbs_bidiag_solve Example and Test |
9.6.1: bidiag_solve_t_ok.m | ckbs_bidiag_solve_t Example and Test |
9.11.1: blkbidiag_mul_ok.m | blkbidiag_mul Example and Test |
9.10.1: blkbidiag_mul_t_ok.m | blkbidiag_mul_t Example and Test |
9.7.1: blkbidiag_symm_mul_ok.m | blkbidiag_symm_mul Example and Test |
9.8.1: blkdiag_mul_ok.m | blkdiag_mul Example and Test |
9.9.1: blkdiag_mul_t_ok.m | blkdiag_mul_t Example and Test |
9.12.1: blktridiag_mul_ok.m | blktridiag_mul Example and Test |
4.4.1: box_f.m | ckbs_nonlinear: Example of Box Constraints |
C | |
: ckbs | ckbs-0.20130204.0: Constrained/Robust Kalman-Bucy Smoothers |
6: ckbs_affine | Constrained Affine Kalman Bucy Smoother |
7: ckbs_affine_singular | Singular Affine Kalman Bucy Smoother |
9.5: ckbs_bidiag_solve | Block Bidiagonal Algorithm |
9.6: ckbs_bidiag_solve_t | Block Bidiagonal Algorithm |
9.11: ckbs_blkbidiag_mul | Packed Lower Block Bidiagonal Matrix Times a Vector |
9.10: ckbs_blkbidiag_mul_t | Packed Lower Block Bidiagonal Matrix Transpose Times a Vector |
9.7: ckbs_blkbidiag_symm_mul | Packed Block Bidiagonal Matrix Symmetric Multiply |
9.8: ckbs_blkdiag_mul | Packed Block Diagonal Matrix Times a Vector or Matrix |
9.9: ckbs_blkdiag_mul_t | Transpose of Packed Block Diagonal Matrix Times a Vector or Matrix |
9.12: ckbs_blktridiag_mul | Packed Block Tridiagonal Matrix Times a Vector |
9.4: ckbs_diag_solve | Block Diagonal Algorithm |
9.24: ckbs_kuhn_tucker | Compute Residual in Kuhn-Tucker Conditions |
9.25: ckbs_kuhn_tucker_L1 | Compute Residual in Kuhn-Tucker Conditions for Robust L1 |
8: ckbs_L1_affine | Robust L1 Affine Kalman Bucy Smoother |
5: ckbs_L1_nonlinear | The Nonlinear Constrained Kalman-Bucy Smoother |
9.14: ckbs_L2L1_obj | Affine Least Squares with Robust L1 Objective |
9.22: ckbs_newton_step | Affine Constrained Kalman Bucy Smoother Newton Step |
9.23: ckbs_newton_step_L1 | Affine Robust L1 Kalman Bucy Smoother Newton Step |
4: ckbs_nonlinear | The Nonlinear Constrained Kalman-Bucy Smoother |
9.16: ckbs_process_grad | Affine Residual Process Sum of Squares Gradient |
9.18: ckbs_process_hes | Affine Process Residual Sum of Squares Hessian |
9.15: ckbs_sumsq_grad | Affine Residual Sum of Squares Gradient |
9.17: ckbs_sumsq_hes | Affine Residual Sum of Squares Hessian |
9.13: ckbs_sumsq_obj | Affine Residual Sum of Squares Objective |
3: ckbs_t_general | The General Student's t Smoother |
9.2: ckbs_t_grad | Student's t Gradient |
9.3: ckbs_t_hess | Student's t Hessian |
9.1: ckbs_t_obj | Student's t Sum of Squares Objective |
9.19: ckbs_tridiag_solve | Symmetric Block Tridiagonal Algorithm |
9.20: ckbs_tridiag_solve_b | Symmetric Block Tridiagonal Algorithm (Backward version) |
9.21: ckbs_tridiag_solve_pcg | Symmetric Block Tridiagonal Algorithm (Conjugate Gradient version) |
D | |
9.4.1: diag_solve_ok.m | ckbs_diag_solve Example and Test |
4.4.2: direct_h.m | ckbs_nonlinear: Example Direct Measurement Model |
4.4.3: distance_h.m | ckbs_nonlinear: Example of Distance Measurement Model |
G | |
4.1: get_started_ok.m | ckbs_nonlinear: A Simple Example and Test |
K | |
9.25.1: kuhn_tucker_L1_ok.m | ckbs_kuhn_tucker_L1 Example and Test |
9.24.1: kuhn_tucker_ok.m | ckbs_kuhn_tucker Example and Test |
L | |
8.1: L1_affine_ok.m | ckbs_L1_affine Robust Smoothing Spline Example and Test |
5.1: L1_nonlinear_ok.m | ckbs_L1_nonlinear: Robust Nonlinear Transition Model Example and Test |
9.14.1: L2L1_obj_ok.m | ckbs_L2L1_obj Example and Test |
2: license | Your License to use the ckbs Software |
N | |
9.23.1: newton_step_L1_ok.m | ckbs_newton_step_L1 Example and Test |
9.22.1: newton_step_ok.m | ckbs_newton_step Example and Test |
4.4.4: no_f.m | ckbs_nonlinear: Example of No Constraint |
4.4: nonlinear_utility | ckbs_nonlinear: General Purpose Utilities |
P | |
4.4.5: persist_g.m | ckbs_nonlinear: Example of Persistence Transition Function |
4.4.6: pos_vel_g.m | ckbs_nonlinear: Example Position and Velocity Transition Model |
9.16.1: process_grad_ok.m | ckbs_process_grad Example and Test |
9.18.1: process_hes_ok.m | ckbs_process_hes Example and Test |
S | |
4.4.7: sine_f.m | ckbs_nonlinear: Example of Nonlinear Constraint |
4.2: sine_wave_ok.m | ckbs_nonlinear: Example and Test of Tracking a Sine Wave |
9.15.1: sumsq_grad_ok.m | ckbs_sumsq_grad Example and Test |
9.17.1: sumsq_hes_ok.m | ckbs_sumsq_hes Example and Test |
9.13.1: sumsq_obj_ok.m | ckbs_sumsq_obj Example and Test |
T | |
3.2: t_general_noisy_jump.m | ckbs_t_general Jump Tracking Example and Test |
3.1: t_general_ok.m | ckbs_t_general Jump Tracking Example and Test |
9.2.1: t_grad_ok.m | ckbs_t_grad Example and Test |
9.3.1: t_hess_ok.m | ckbs_t_hess Example and Test |
9.1.1: t_obj_ok.m | ckbs_t_obj Example and Test |
10.1: test_path.m | Set Up Path for Testing |
9.20.1: tridiag_solve_b_ok.m | ckbs_tridiag_solve_b Example and Test |
9.19.1: tridiag_solve_ok.m | ckbs_tridiag_solve Example and Test |
9.21.1: tridiag_solve_pcg_ok.m | ckbs_tridiag_solve_pcg Example and Test |
U | |
9: utility | ckbs Utility Functions |
V | |
4.3.2: vanderpol_g.m | ckbs_nonlinear: Vanderpol Transition Model Mean Example |
4.3: vanderpol_ok.m | ckbs_nonlinear: Unconstrained Nonlinear Transition Model Example and Test |
4.3.1: vanderpol_sim | Van der Pol Oscillator Simulation (No Noise) |
4.3.1.1: vanderpol_sim_ok.m | Example Use of vanderpol_sim |
W | |
11: whatsnew | Changes and Additions to ckbs |
12: wishlist | List of Future Improvements to ckbs |