CppAD: A C++ Algorithmic Differentiation Package  20171217
 All Classes Namespaces Files Functions Variables Typedefs Enumerations Enumerator Friends Macros
opt_val_hes.hpp
Go to the documentation of this file.
1 # ifndef CPPAD_CORE_OPT_VAL_HES_HPP
2 # define CPPAD_CORE_OPT_VAL_HES_HPP
3 
4 /* --------------------------------------------------------------------------
5 CppAD: C++ Algorithmic Differentiation: Copyright (C) 2003-17 Bradley M. Bell
6 
7 CppAD is distributed under multiple licenses. This distribution is under
8 the terms of the
9  Eclipse Public License Version 1.0.
10 
11 A copy of this license is included in the COPYING file of this distribution.
12 Please visit http://www.coin-or.org/CppAD/ for information on other licenses.
13 -------------------------------------------------------------------------- */
14 /*
15 $begin opt_val_hes$$
16 $spell
17  hes
18  sy
19  Jacobian
20  hes
21  signdet
22  jac
23  Bradley
24  const
25  CppAD
26 $$
27 
28 
29 
30 $section Jacobian and Hessian of Optimal Values$$
31 $mindex opt_val_hes$$
32 
33 $head Syntax$$
34 $icode%signdet% = opt_val_hes(%x%, %y%, %fun%, %jac%, %hes%)%$$
35 
36 $head See Also$$
37 $cref BenderQuad$$
38 
39 $head Reference$$
40 Algorithmic differentiation of implicit functions and optimal values,
41 Bradley M. Bell and James V. Burke, Advances in Automatic Differentiation,
42 2008, Springer.
43 
44 $head Purpose$$
45 We are given a function
46 $latex S : \B{R}^n \times \B{R}^m \rightarrow \B{R}^\ell$$
47 and we define $latex F : \B{R}^n \times \B{R}^m \rightarrow \B{R}$$
48 and $latex V : \B{R}^n \rightarrow \B{R} $$ by
49 $latex \[
50 \begin{array}{rcl}
51  F(x, y) & = & \sum_{k=0}^{\ell-1} S_k ( x , y)
52  \\
53  V(x) & = & F [ x , Y(x) ]
54  \\
55  0 & = & \partial_y F [x , Y(x) ]
56 \end{array}
57 \] $$
58 We wish to compute the Jacobian
59 and possibly also the Hessian, of $latex V (x)$$.
60 
61 $head BaseVector$$
62 The type $icode BaseVector$$ must be a
63 $cref SimpleVector$$ class.
64 We use $icode Base$$ to refer to the type of the elements of
65 $icode BaseVector$$; i.e.,
66 $codei%
67  %BaseVector%::value_type
68 %$$
69 
70 $head x$$
71 The argument $icode x$$ has prototype
72 $codei%
73  const %BaseVector%& %x%
74 %$$
75 and its size must be equal to $icode n$$.
76 It specifies the point at which we evaluating
77 the Jacobian $latex V^{(1)} (x)$$
78 (and possibly the Hessian $latex V^{(2)} (x)$$).
79 
80 
81 $head y$$
82 The argument $icode y$$ has prototype
83 $codei%
84  const %BaseVector%& %y%
85 %$$
86 and its size must be equal to $icode m$$.
87 It must be equal to $latex Y(x)$$; i.e.,
88 it must solve the implicit equation
89 $latex \[
90  0 = \partial_y F ( x , y)
91 \] $$
92 
93 $head Fun$$
94 The argument $icode fun$$ is an object of type $icode Fun$$
95 which must support the member functions listed below.
96 CppAD will may be recording operations of the type $codei%AD<%Base%>%$$
97 when these member functions are called.
98 These member functions must not stop such a recording; e.g.,
99 they must not call $cref/AD<Base>::abort_recording/abort_recording/$$.
100 
101 $subhead Fun::ad_vector$$
102 The type $icode%Fun%::ad_vector%$$ must be a
103 $cref SimpleVector$$ class with elements of type $codei%AD<%Base%>%$$; i.e.
104 $codei%
105  %Fun%::ad_vector::value_type
106 %$$
107 is equal to $codei%AD<%Base%>%$$.
108 
109 $subhead fun.ell$$
110 The type $icode Fun$$ must support the syntax
111 $codei%
112  %ell% = %fun%.ell()
113 %$$
114 where $icode ell$$ has prototype
115 $codei%
116  size_t %ell%
117 %$$
118 and is the value of $latex \ell$$; i.e.,
119 the number of terms in the summation.
120 $pre
121 
122 $$
123 One can choose $icode ell$$ equal to one, and have
124 $latex S(x,y)$$ the same as $latex F(x, y)$$.
125 Each of the functions $latex S_k (x , y)$$,
126 (in the summation defining $latex F(x, y)$$)
127 is differentiated separately using AD.
128 For very large problems, breaking $latex F(x, y)$$ into the sum
129 of separate simpler functions may reduce the amount of memory necessary for
130 algorithmic differentiation and there by speed up the process.
131 
132 $subhead fun.s$$
133 The type $icode Fun$$ must support the syntax
134 $codei%
135  %s_k% = %fun%.s(%k%, %x%, %y%)
136 %$$
137 The $icode%fun%.s%$$ argument $icode k$$ has prototype
138 $codei%
139  size_t %k%
140 %$$
141 and is between zero and $icode%ell% - 1%$$.
142 The argument $icode x$$ to $icode%fun%.s%$$ has prototype
143 $codei%
144  const %Fun%::ad_vector& %x%
145 %$$
146 and its size must be equal to $icode n$$.
147 The argument $icode y$$ to $icode%fun%.s%$$ has prototype
148 $codei%
149  const %Fun%::ad_vector& %y%
150 %$$
151 and its size must be equal to $icode m$$.
152 The $icode%fun%.s%$$ result $icode s_k$$ has prototype
153 $codei%
154  AD<%Base%> %s_k%
155 %$$
156 and its value must be given by $latex s_k = S_k ( x , y )$$.
157 
158 $subhead fun.sy$$
159 The type $icode Fun$$ must support the syntax
160 $codei%
161  %sy_k% = %fun%.sy(%k%, %x%, %y%)
162 %$$
163 The argument $icode k$$ to $icode%fun%.sy%$$ has prototype
164 $codei%
165  size_t %k%
166 %$$
167 The argument $icode x$$ to $icode%fun%.sy%$$ has prototype
168 $codei%
169  const %Fun%::ad_vector& %x%
170 %$$
171 and its size must be equal to $icode n$$.
172 The argument $icode y$$ to $icode%fun%.sy%$$ has prototype
173 $codei%
174  const %Fun%::ad_vector& %y%
175 %$$
176 and its size must be equal to $icode m$$.
177 The $icode%fun%.sy%$$ result $icode sy_k$$ has prototype
178 $codei%
179  %Fun%::ad_vector %sy_k%
180 %$$
181 its size must be equal to $icode m$$,
182 and its value must be given by $latex sy_k = \partial_y S_k ( x , y )$$.
183 
184 $head jac$$
185 The argument $icode jac$$ has prototype
186 $codei%
187  %BaseVector%& %jac%
188 %$$
189 and has size $icode n$$ or zero.
190 The input values of its elements do not matter.
191 If it has size zero, it is not affected. Otherwise, on output
192 it contains the Jacobian of $latex V (x)$$; i.e.,
193 for $latex j = 0 , \ldots , n-1$$,
194 $latex \[
195  jac[ j ] = V^{(1)} (x)_j
196 \] $$
197 where $icode x$$ is the first argument to $code opt_val_hes$$.
198 
199 $head hes$$
200 The argument $icode hes$$ has prototype
201 $codei%
202  %BaseVector%& %hes%
203 %$$
204 and has size $icode%n% * %n%$$ or zero.
205 The input values of its elements do not matter.
206 If it has size zero, it is not affected. Otherwise, on output
207 it contains the Hessian of $latex V (x)$$; i.e.,
208 for $latex i = 0 , \ldots , n-1$$, and
209 $latex j = 0 , \ldots , n-1$$,
210 $latex \[
211  hes[ i * n + j ] = V^{(2)} (x)_{i,j}
212 \] $$
213 
214 
215 $head signdet$$
216 If $icode%hes%$$ has size zero, $icode signdet$$ is not defined.
217 Otherwise
218 the return value $icode signdet$$ is the sign of the determinant for
219 $latex \partial_{yy}^2 F(x , y) $$.
220 If it is zero, then the matrix is singular and
221 the Hessian is not computed ($icode hes$$ is not changed).
222 
223 $head Example$$
224 $children%
225  example/general/opt_val_hes.cpp
226 %$$
227 The file
228 $cref opt_val_hes.cpp$$
229 contains an example and test of this operation.
230 It returns true if it succeeds and false otherwise.
231 
232 $end
233 -----------------------------------------------------------------------------
234 */
235 
236 namespace CppAD { // BEGIN_CPPAD_NAMESPACE
237 /*!
238 \file opt_val_hes.hpp
239 \brief Computing Jabobians and Hessians of Optimal Values
240 */
241 
242 /*!
243 Computing Jabobians and Hessians of Optimal Values
244 
245 We are given a function
246 \f$ S : {\rm R}^n \times {\rm R}^m \rightarrow {\rm R}^\ell \f$
247 and we define \f$ F : {\rm R}^n \times {\rm R}^m \rightarrow {\rm R} \f$
248 and \f$ V : {\rm R}^n \rightarrow {\rm R} \f$ by
249 \f[
250 \begin{array}{rcl}
251  F(x, y) & = & \sum_{k=0}^{\ell-1} S_k ( x , y)
252  \\
253  V(x) & = & F [ x , Y(x) ]
254  \\
255  0 & = & \partial_y F [x , Y(x) ]
256 \end{array}
257 \f]
258 We wish to compute the Jacobian
259 and possibly also the Hessian, of \f$ V (x) \f$.
260 
261 \tparam BaseVector
262 The type \c BaseVector must be a SimpleVector class.
263 We use \c Base to refer to the type of the elements of
264 \c BaseVector; i.e.,
265 <tt>BaseVector::value_type</tt>.
266 
267 \param x
268 is a vector with size \c n.
269 It specifies the point at which we evaluating
270 the Jacobian \f$ V^{(1)} (x) \f$
271 (and possibly the Hessian \f$ V^{(2)} (x) \f$).
272 
273 
274 \param y
275 is a vector with size \c m.
276 It must be equal to \f$ Y(x) \f$; i.e.,
277 it must solve the implicit equation
278 \f[
279  0 = \partial_y F ( x , y)
280 \f]
281 
282 \param fun
283 The argument \c fun is an object of type \c Fun
284 wich must support the member functions listed below.
285 CppAD will may be recording operations of the type \c AD<Base>
286 when these member functions are called.
287 These member functions must not stop such a recording; e.g.,
288 they must not call \c AD<Base>::abort_recording.
289 
290 \par Fun::ad_vector</tt>
291 The type <tt>Fun::ad_vector</tt> must be a
292 SimpleVector class with elements of type \c AD<Base>; i.e.
293 <tt>Fun::ad_vector::value_type</tt>
294 is equal to \c AD<Base>.
295 
296 \par fun.ell
297 the type \c Fun must support the syntax
298 \verbatim
299  ell = fun.ell()
300 \endverbatim
301 where \c ell is a \c size_t value that is set to \f$ \ell \f$; i.e.,
302 the number of terms in the summation.
303 
304 \par fun.s
305 The type \c Fun must support the syntax
306 \verbatim
307  s_k = fun.s(k, x, y)
308 \endverbatim
309 The argument \c k has prototype <tt>size_t k</tt>.
310 The argument \c x has prototype <tt>const Fun::ad_vector& x</tt>
311 and its size must be equal to \c n.
312 The argument \c y has prototype <tt>const Fun::ad_vector& y</tt>
313 and its size must be equal to \c m.
314 The return value \c s_k has prototype \c AD<Base> s_k
315 and its value must be given by \f$ s_k = S_k ( x , y ) \f$.
316 
317 \par fun.sy
318 The type \c Fun must support the syntax
319 \verbatim
320  sy_k = fun.sy(k, x, y)
321 \endverbatim
322 The argument \c k has prototype <tt>size_t k</tt>.
323 The argument \c x has prototype <tt>const Fun::ad_vector& x</tt>
324 and its size must be equal to \c n.
325 The argument \c y has prototype <tt>const Fun::ad_vector& y</tt>
326 and its size must be equal to \c m.
327 The return value \c sy_k has prototype <tt>Fun::ad_vector& sy_k</tt>,
328 its size is \c m
329 and its value must be given by \f$ sy_k = \partial_y S_k ( x , y ) \f$.
330 
331 \param jac
332 is a vector with size \c n or zero.
333 The input values of its elements do not matter.
334 If it has size zero, it is not affected. Otherwise, on output
335 it contains the Jacobian of \f$ V (x) \f$; i.e.,
336 for \f$ j = 0 , \ldots , n-1 \f$,
337 \f[
338  jac[ j ] = V^{(1)} (x)_j
339 \f] $$
340 where \c x is the first argument to \c opt_val_hes.
341 
342 \param hes
343 is a vector with size <tt>n * n</tt> or zero.
344 The input values of its elements do not matter.
345 If it has size zero, it is not affected. Otherwise, on output
346 it contains the Hessian of \f$ V (x) \f$; i.e.,
347 for \f$ i = 0 , \ldots , n-1 \f$, and
348 \f$ j = 0 , \ldots , n-1 \f$,
349 \f[
350  hes[ i * n + j ] = V^{(2)} (x)_{i,j}
351 \f]
352 
353 \return
354 If <tt>hes.size() == 0</tt>, the return value is not defined.
355 Otherwise,
356 the return value is the sign of the determinant for
357 \f$ \partial_{yy}^2 F(x , y) \f$$.
358 If it is zero, then the matrix is singular and \c hes is not set
359 to its specified value.
360 */
361 
362 
363 template <class BaseVector, class Fun>
365  const BaseVector& x ,
366  const BaseVector& y ,
367  Fun fun ,
368  BaseVector& jac ,
369  BaseVector& hes )
370 { // determine the base type
371  typedef typename BaseVector::value_type Base;
372 
373  // check that BaseVector is a SimpleVector class with Base elements
374  CheckSimpleVector<Base, BaseVector>();
375 
376  // determine the AD vector type
377  typedef typename Fun::ad_vector ad_vector;
378 
379  // check that ad_vector is a SimpleVector class with AD<Base> elements
380  CheckSimpleVector< AD<Base> , ad_vector >();
381 
382  // size of the x and y spaces
383  size_t n = size_t(x.size());
384  size_t m = size_t(y.size());
385 
386  // number of terms in the summation
387  size_t ell = fun.ell();
388 
389  // check size of return values
391  size_t(jac.size()) == n || jac.size() == 0,
392  "opt_val_hes: size of the vector jac is not equal to n or zero"
393  );
395  size_t(hes.size()) == n * n || hes.size() == 0,
396  "opt_val_hes: size of the vector hes is not equal to n * n or zero"
397  );
398 
399  // some temporary indices
400  size_t i, j, k;
401 
402  // AD version of S_k(x, y)
403  ad_vector s_k(1);
404 
405  // ADFun version of S_k(x, y)
406  ADFun<Base> S_k;
407 
408  // AD version of x
409  ad_vector a_x(n);
410 
411  // AD version of y
412  ad_vector a_y(n);
413 
414  if( jac.size() > 0 )
415  { // this is the easy part, computing the V^{(1)} (x) which is equal
416  // to \partial_x F (x, y) (see Thoerem 2 of the reference).
417 
418  // copy x and y to AD version
419  for(j = 0; j < n; j++)
420  a_x[j] = x[j];
421  for(j = 0; j < m; j++)
422  a_y[j] = y[j];
423 
424  // initialize summation
425  for(j = 0; j < n; j++)
426  jac[j] = Base(0.);
427 
428  // add in \partial_x S_k (x, y)
429  for(k = 0; k < ell; k++)
430  { // start recording
431  Independent(a_x);
432  // record
433  s_k[0] = fun.s(k, a_x, a_y);
434  // stop recording and store in S_k
435  S_k.Dependent(a_x, s_k);
436  // compute partial of S_k with respect to x
437  BaseVector jac_k = S_k.Jacobian(x);
438  // add \partial_x S_k (x, y) to jac
439  for(j = 0; j < n; j++)
440  jac[j] += jac_k[j];
441  }
442  }
443  // check if we are done
444  if( hes.size() == 0 )
445  return 0;
446 
447  /*
448  In this case, we need to compute the Hessian. Using Theorem 1 of the
449  reference:
450  Y^{(1)}(x) = - F_yy (x, y)^{-1} F_yx (x, y)
451  Using Theorem 2 of the reference:
452  V^{(2)}(x) = F_xx (x, y) + F_xy (x, y) Y^{(1)}(x)
453  */
454  // Base and AD version of xy
455  BaseVector xy(n + m);
456  ad_vector a_xy(n + m);
457  for(j = 0; j < n; j++)
458  a_xy[j] = xy[j] = x[j];
459  for(j = 0; j < m; j++)
460  a_xy[n+j] = xy[n+j] = y[j];
461 
462  // Initialization summation for Hessian of F
463  size_t nm_sq = (n + m) * (n + m);
464  BaseVector F_hes(nm_sq);
465  for(j = 0; j < nm_sq; j++)
466  F_hes[j] = Base(0.);
467  BaseVector hes_k(nm_sq);
468 
469  // add in Hessian of S_k to hes
470  for(k = 0; k < ell; k++)
471  { // start recording
472  Independent(a_xy);
473  // split out x
474  for(j = 0; j < n; j++)
475  a_x[j] = a_xy[j];
476  // split out y
477  for(j = 0; j < m; j++)
478  a_y[j] = a_xy[n+j];
479  // record
480  s_k[0] = fun.s(k, a_x, a_y);
481  // stop recording and store in S_k
482  S_k.Dependent(a_xy, s_k);
483  // when computing the Hessian it pays to optimize the tape
484  S_k.optimize();
485  // compute Hessian of S_k
486  hes_k = S_k.Hessian(xy, 0);
487  // add \partial_x S_k (x, y) to jac
488  for(j = 0; j < nm_sq; j++)
489  F_hes[j] += hes_k[j];
490  }
491  // Extract F_yx
492  BaseVector F_yx(m * n);
493  for(i = 0; i < m; i++)
494  { for(j = 0; j < n; j++)
495  F_yx[i * n + j] = F_hes[ (i+n)*(n+m) + j ];
496  }
497  // Extract F_yy
498  BaseVector F_yy(n * m);
499  for(i = 0; i < m; i++)
500  { for(j = 0; j < m; j++)
501  F_yy[i * m + j] = F_hes[ (i+n)*(n+m) + j + n ];
502  }
503 
504  // compute - Y^{(1)}(x) = F_yy (x, y)^{-1} F_yx (x, y)
505  BaseVector neg_Y_x(m * n);
506  Base logdet;
507  int signdet = CppAD::LuSolve(m, n, F_yy, F_yx, neg_Y_x, logdet);
508  if( signdet == 0 )
509  return signdet;
510 
511  // compute hes = F_xx (x, y) + F_xy (x, y) Y^{(1)}(x)
512  for(i = 0; i < n; i++)
513  { for(j = 0; j < n; j++)
514  { hes[i * n + j] = F_hes[ i*(n+m) + j ];
515  for(k = 0; k < m; k++)
516  hes[i*n+j] -= F_hes[i*(n+m) + k+n] * neg_Y_x[k*n+j];
517  }
518  }
519  return signdet;
520 }
521 
522 } // END_CPPAD_NAMESPACE
523 
524 # endif
#define CPPAD_ASSERT_KNOWN(exp, msg)
Check that exp is true, if not print msg and terminate execution.
Class used to hold function objects.
Definition: ad_fun.hpp:69
int LuSolve(size_t n, size_t m, const FloatVector &A, const FloatVector &B, FloatVector &X, Float &logdet)
Definition: lu_solve.hpp:266
int opt_val_hes(const BaseVector &x, const BaseVector &y, Fun fun, BaseVector &jac, BaseVector &hes)
Computing Jabobians and Hessians of Optimal Values.
VectorBase Jacobian(const VectorBase &x)
calculate entire Jacobian
void Independent(VectorAD &x, size_t abort_op_index)
Declaration of independent variables.
void Dependent(local::ADTape< Base > *tape, const ADvector &y)
change the operation sequence corresponding to this object
Definition: dependent.hpp:236
void optimize(const std::string &options="")
Optimize a player object operation sequence.
Definition: optimize.hpp:216
VectorBase Hessian(const VectorBase &x, const VectorBase &w)
calculate Hessian for one component of f
Scalar value_type