CppAD: A C++ Algorithmic Differentiation Package  20171217
opt_val_hes.hpp
Go to the documentation of this file.
3
4 /* --------------------------------------------------------------------------
6
8 the terms of the
9  Eclipse Public License Version 1.0.
10
11 A copy of this license is included in the COPYING file of this distribution.
13 -------------------------------------------------------------------------- */
14 /*
15 $begin opt_val_hes$$16 spell 17 hes 18 sy 19 Jacobian 20 hes 21 signdet 22 jac 23 Bradley 24 const 25 CppAD 26$$ 27 28 29 30$section Jacobian and Hessian of Optimal Values$$31 mindex opt_val_hes$$
32
33 $head Syntax$$34 icode%signdet% = opt_val_hes(%x%, %y%, %fun%, %jac%, %hes%)%$$ 35 36$head See Also$$37 cref BenderQuad$$
38
39 $head Reference$$40 Algorithmic differentiation of implicit functions and optimal values, 41 Bradley M. Bell and James V. Burke, Advances in Automatic Differentiation, 42 2008, Springer. 43 44 head Purpose$$ 45 We are given a function 46$latex S : \B{R}^n \times \B{R}^m \rightarrow \B{R}^\ell$$47 and we define latex F : \B{R}^n \times \B{R}^m \rightarrow \B{R}$$
48 and $latex V : \B{R}^n \rightarrow \B{R} $$by 49 latex $50 \begin{array}{rcl} 51 F(x, y) & = & \sum_{k=0}^{\ell-1} S_k ( x , y) 52 \\ 53 V(x) & = & F [ x , Y(x) ] 54 \\ 55 0 & = & \partial_y F [x , Y(x) ] 56 \end{array} 57$$$ 58 We wish to compute the Jacobian 59 and possibly also the Hessian, of$latex V (x)$$. 60 61 head BaseVector$$
62 The type $icode BaseVector$$must be a 63 cref SimpleVector$$ class. 64 We use$icode Base$$to refer to the type of the elements of 65 icode BaseVector$$; i.e.,
66 $codei% 67 %BaseVector%::value_type 68 %$$69 70 head x$$ 71 The argument$icode x$$has prototype 72 codei% 73 const %BaseVector%& %x% 74 %$$
75 and its size must be equal to $icode n$$. 76 It specifies the point at which we evaluating 77 the Jacobian latex V^{(1)} (x)$$ 78 (and possibly the Hessian$latex V^{(2)} (x)$$). 79 80 81 head y$$
82 The argument $icode y$$has prototype 83 codei% 84 const %BaseVector%& %y% 85 %$$ 86 and its size must be equal to$icode m$$. 87 It must be equal to latex Y(x)$$; i.e.,
88 it must solve the implicit equation
89 $latex $90 0 = \partial_y F ( x , y) 91$ $$92 93 head Fun$$ 94 The argument$icode fun$$is an object of type icode Fun$$
95 which must support the member functions listed below.
96 CppAD will may be recording operations of the type $codei%AD<%Base%>%$$97 when these member functions are called. 98 These member functions must not stop such a recording; e.g., 99 they must not call cref/AD<Base>::abort_recording/abort_recording/$$. 100 101$subhead Fun::ad_vector$$102 The type icode%Fun%::ad_vector%$$ must be a
103 $cref SimpleVector$$class with elements of type codei%AD<%Base%>%$$; i.e. 104$codei%
106 %$$107 is equal to codei%AD<%Base%>%$$.
108
109 $subhead fun.ell$$110 The type icode Fun$$ must support the syntax 111$codei%
112  %ell% = %fun%.ell()
113 %$$114 where icode ell$$ has prototype
115 $codei% 116 size_t %ell% 117 %$$118 and is the value of latex \ell$$; i.e., 119 the number of terms in the summation. 120$pre
121
122 $$123 One can choose icode ell$$ equal to one, and have
124 $latex S(x,y)$$the same as latex F(x, y)$$. 125 Each of the functions$latex S_k (x , y)$$, 126 (in the summation defining latex F(x, y)$$)
127 is differentiated separately using AD.
128 For very large problems, breaking $latex F(x, y)$$into the sum 129 of separate simpler functions may reduce the amount of memory necessary for 130 algorithmic differentiation and there by speed up the process. 131 132 subhead fun.s$$ 133 The type$icode Fun$$must support the syntax 134 codei% 135 %s_k% = %fun%.s(%k%, %x%, %y%) 136 %$$
137 The $icode%fun%.s%$$argument icode k$$ has prototype 138$codei%
139  size_t %k%
140 %$$141 and is between zero and icode%ell% - 1%$$.
142 The argument $icode x$$to icode%fun%.s%$$ has prototype 143$codei%
145 %$$146 and its size must be equal to icode n$$.
147 The argument $icode y$$to icode%fun%.s%$$ has prototype 148$codei%
150 %$$151 and its size must be equal to icode m$$.
152 The $icode%fun%.s%$$result icode s_k$$ has prototype 153$codei%
155 %$$156 and its value must be given by latex s_k = S_k ( x , y )$$.
157
158 $subhead fun.sy$$159 The type icode Fun$$ must support the syntax 160$codei%
161  %sy_k% = %fun%.sy(%k%, %x%, %y%)
162 %$$163 The argument icode k$$ to $icode%fun%.sy%$$has prototype 164 codei% 165 size_t %k% 166 %$$ 167 The argument$icode x$$to icode%fun%.sy%$$ has prototype
168 $codei% 169 const %Fun%::ad_vector& %x% 170 %$$171 and its size must be equal to icode n$$. 172 The argument$icode y$$to icode%fun%.sy%$$ has prototype
173 $codei% 174 const %Fun%::ad_vector& %y% 175 %$$176 and its size must be equal to icode m$$. 177 The$icode%fun%.sy%$$result icode sy_k$$ has prototype
178 $codei% 179 %Fun%::ad_vector %sy_k% 180 %$$181 its size must be equal to icode m$$, 182 and its value must be given by$latex sy_k = \partial_y S_k ( x , y )$$. 183 184 head jac$$
185 The argument $icode jac$$has prototype 186 codei% 187 %BaseVector%& %jac% 188 %$$ 189 and has size$icode n$$or zero. 190 The input values of its elements do not matter. 191 If it has size zero, it is not affected. Otherwise, on output 192 it contains the Jacobian of latex V (x)$$; i.e.,
193 for $latex j = 0 , \ldots , n-1$$, 194 latex $195 jac[ j ] = V^{(1)} (x)_j 196$$$ 197 where$icode x$$is the first argument to code opt_val_hes$$.
198
199 $head hes$$200 The argument icode hes$$ has prototype 201$codei%
202  %BaseVector%& %hes%
203 %$$204 and has size icode%n% * %n%$$ or zero.
205 The input values of its elements do not matter.
206 If it has size zero, it is not affected. Otherwise, on output
207 it contains the Hessian of $latex V (x)$$; i.e., 208 for latex i = 0 , \ldots , n-1$$, and 209$latex j = 0 , \ldots , n-1$$, 210 latex $211 hes[ i * n + j ] = V^{(2)} (x)_{i,j} 212$$$
213
214
215 $head signdet$$216 If icode%hes%$$ has size zero,$icode signdet$$is not defined. 217 Otherwise 218 the return value icode signdet$$ is the sign of the determinant for
219 $latex \partial_{yy}^2 F(x , y) $$. 220 If it is zero, then the matrix is singular and 221 the Hessian is not computed (icode hes$$ is not changed). 222 223$head Example$$224 children% 225 example/general/opt_val_hes.cpp 226 %$$
227 The file
228 $cref opt_val_hes.cpp$$229 contains an example and test of this operation. 230 It returns true if it succeeds and false otherwise. 231 232 end 233 ----------------------------------------------------------------------------- 234 */ 235 236 namespace CppAD { // BEGIN_CPPAD_NAMESPACE 237 /*! 238 \file opt_val_hes.hpp 239 \brief Computing Jabobians and Hessians of Optimal Values 240 */ 241 242 /*! 243 Computing Jabobians and Hessians of Optimal Values 244 245 We are given a function 246 \f S : {\rm R}^n \times {\rm R}^m \rightarrow {\rm R}^\ell \f 247 and we define \f F : {\rm R}^n \times {\rm R}^m \rightarrow {\rm R} \f 248 and \f V : {\rm R}^n \rightarrow {\rm R} \f by 249 \f[ 250 \begin{array}{rcl} 251 F(x, y) & = & \sum_{k=0}^{\ell-1} S_k ( x , y) 252 \\ 253 V(x) & = & F [ x , Y(x) ] 254 \\ 255 0 & = & \partial_y F [x , Y(x) ] 256 \end{array} 257 \f] 258 We wish to compute the Jacobian 259 and possibly also the Hessian, of \f V (x) \f. 260 261 \tparam BaseVector 262 The type \c BaseVector must be a SimpleVector class. 263 We use \c Base to refer to the type of the elements of 264 \c BaseVector; i.e., 265 <tt>BaseVector::value_type</tt>. 266 267 \param x 268 is a vector with size \c n. 269 It specifies the point at which we evaluating 270 the Jacobian \f V^{(1)} (x) \f 271 (and possibly the Hessian \f V^{(2)} (x) \f). 272 273 274 \param y 275 is a vector with size \c m. 276 It must be equal to \f Y(x) \f; i.e., 277 it must solve the implicit equation 278 \f[ 279 0 = \partial_y F ( x , y) 280 \f] 281 282 \param fun 283 The argument \c fun is an object of type \c Fun 284 wich must support the member functions listed below. 285 CppAD will may be recording operations of the type \c AD<Base> 286 when these member functions are called. 287 These member functions must not stop such a recording; e.g., 288 they must not call \c AD<Base>::abort_recording. 289 290 \par Fun::ad_vector</tt> 291 The type <tt>Fun::ad_vector</tt> must be a 292 SimpleVector class with elements of type \c AD<Base>; i.e. 293 <tt>Fun::ad_vector::value_type</tt> 294 is equal to \c AD<Base>. 295 296 \par fun.ell 297 the type \c Fun must support the syntax 298 \verbatim 299 ell = fun.ell() 300 \endverbatim 301 where \c ell is a \c size_t value that is set to \f \ell \f; i.e., 302 the number of terms in the summation. 303 304 \par fun.s 305 The type \c Fun must support the syntax 306 \verbatim 307 s_k = fun.s(k, x, y) 308 \endverbatim 309 The argument \c k has prototype <tt>size_t k</tt>. 310 The argument \c x has prototype <tt>const Fun::ad_vector& x</tt> 311 and its size must be equal to \c n. 312 The argument \c y has prototype <tt>const Fun::ad_vector& y</tt> 313 and its size must be equal to \c m. 314 The return value \c s_k has prototype \c AD<Base> s_k 315 and its value must be given by \f s_k = S_k ( x , y ) \f. 316 317 \par fun.sy 318 The type \c Fun must support the syntax 319 \verbatim 320 sy_k = fun.sy(k, x, y) 321 \endverbatim 322 The argument \c k has prototype <tt>size_t k</tt>. 323 The argument \c x has prototype <tt>const Fun::ad_vector& x</tt> 324 and its size must be equal to \c n. 325 The argument \c y has prototype <tt>const Fun::ad_vector& y</tt> 326 and its size must be equal to \c m. 327 The return value \c sy_k has prototype <tt>Fun::ad_vector& sy_k</tt>, 328 its size is \c m 329 and its value must be given by \f sy_k = \partial_y S_k ( x , y ) \f. 330 331 \param jac 332 is a vector with size \c n or zero. 333 The input values of its elements do not matter. 334 If it has size zero, it is not affected. Otherwise, on output 335 it contains the Jacobian of \f V (x) \f; i.e., 336 for \f j = 0 , \ldots , n-1 \f, 337 \f[ 338 jac[ j ] = V^{(1)} (x)_j 339 \f]$$ 340 where \c x is the first argument to \c opt_val_hes. 341 342 \param hes 343 is a vector with size <tt>n * n</tt> or zero. 344 The input values of its elements do not matter. 345 If it has size zero, it is not affected. Otherwise, on output 346 it contains the Hessian of \f$ V (x) \f$; i.e., 347 for \f$ i = 0 , \ldots , n-1 \f$, and 348 \f$ j = 0 , \ldots , n-1 \f$, 349 \f[ 350 hes[ i * n + j ] = V^{(2)} (x)_{i,j} 351 \f] 352 353 \return 354 If <tt>hes.size() == 0</tt>, the return value is not defined. 355 Otherwise, 356 the return value is the sign of the determinant for 357 \f$ \partial_{yy}^2 F(x , y) \f.
358 If it is zero, then the matrix is singular and \c hes is not set
359 to its specified value.
360 */
361
362
363 template <class BaseVector, class Fun>
365  const BaseVector& x ,
366  const BaseVector& y ,
367  Fun fun ,
368  BaseVector& jac ,
369  BaseVector& hes )
370 { // determine the base type
371  typedef typename BaseVector::value_type Base;
372
373  // check that BaseVector is a SimpleVector class with Base elements
374  CheckSimpleVector<Base, BaseVector>();
375
376  // determine the AD vector type
378
379  // check that ad_vector is a SimpleVector class with AD<Base> elements
381
382  // size of the x and y spaces
383  size_t n = size_t(x.size());
384  size_t m = size_t(y.size());
385
386  // number of terms in the summation
387  size_t ell = fun.ell();
388
389  // check size of return values
391  size_t(jac.size()) == n || jac.size() == 0,
392  "opt_val_hes: size of the vector jac is not equal to n or zero"
393  );
395  size_t(hes.size()) == n * n || hes.size() == 0,
396  "opt_val_hes: size of the vector hes is not equal to n * n or zero"
397  );
398
399  // some temporary indices
400  size_t i, j, k;
401
402  // AD version of S_k(x, y)
404
405  // ADFun version of S_k(x, y)
407
408  // AD version of x
410
411  // AD version of y
413
414  if( jac.size() > 0 )
415  { // this is the easy part, computing the V^{(1)} (x) which is equal
416  // to \partial_x F (x, y) (see Thoerem 2 of the reference).
417
418  // copy x and y to AD version
419  for(j = 0; j < n; j++)
420  a_x[j] = x[j];
421  for(j = 0; j < m; j++)
422  a_y[j] = y[j];
423
424  // initialize summation
425  for(j = 0; j < n; j++)
426  jac[j] = Base(0.);
427
428  // add in \partial_x S_k (x, y)
429  for(k = 0; k < ell; k++)
430  { // start recording
431  Independent(a_x);
432  // record
433  s_k[0] = fun.s(k, a_x, a_y);
434  // stop recording and store in S_k
435  S_k.Dependent(a_x, s_k);
436  // compute partial of S_k with respect to x
437  BaseVector jac_k = S_k.Jacobian(x);
438  // add \partial_x S_k (x, y) to jac
439  for(j = 0; j < n; j++)
440  jac[j] += jac_k[j];
441  }
442  }
443  // check if we are done
444  if( hes.size() == 0 )
445  return 0;
446
447  /*
448  In this case, we need to compute the Hessian. Using Theorem 1 of the
449  reference:
450  Y^{(1)}(x) = - F_yy (x, y)^{-1} F_yx (x, y)
451  Using Theorem 2 of the reference:
452  V^{(2)}(x) = F_xx (x, y) + F_xy (x, y) Y^{(1)}(x)
453  */
454  // Base and AD version of xy
455  BaseVector xy(n + m);
457  for(j = 0; j < n; j++)
458  a_xy[j] = xy[j] = x[j];
459  for(j = 0; j < m; j++)
460  a_xy[n+j] = xy[n+j] = y[j];
461
462  // Initialization summation for Hessian of F
463  size_t nm_sq = (n + m) * (n + m);
464  BaseVector F_hes(nm_sq);
465  for(j = 0; j < nm_sq; j++)
466  F_hes[j] = Base(0.);
467  BaseVector hes_k(nm_sq);
468
469  // add in Hessian of S_k to hes
470  for(k = 0; k < ell; k++)
471  { // start recording
472  Independent(a_xy);
473  // split out x
474  for(j = 0; j < n; j++)
475  a_x[j] = a_xy[j];
476  // split out y
477  for(j = 0; j < m; j++)
478  a_y[j] = a_xy[n+j];
479  // record
480  s_k[0] = fun.s(k, a_x, a_y);
481  // stop recording and store in S_k
482  S_k.Dependent(a_xy, s_k);
483  // when computing the Hessian it pays to optimize the tape
484  S_k.optimize();
485  // compute Hessian of S_k
486  hes_k = S_k.Hessian(xy, 0);
487  // add \partial_x S_k (x, y) to jac
488  for(j = 0; j < nm_sq; j++)
489  F_hes[j] += hes_k[j];
490  }
491  // Extract F_yx
492  BaseVector F_yx(m * n);
493  for(i = 0; i < m; i++)
494  { for(j = 0; j < n; j++)
495  F_yx[i * n + j] = F_hes[ (i+n)*(n+m) + j ];
496  }
497  // Extract F_yy
498  BaseVector F_yy(n * m);
499  for(i = 0; i < m; i++)
500  { for(j = 0; j < m; j++)
501  F_yy[i * m + j] = F_hes[ (i+n)*(n+m) + j + n ];
502  }
503
504  // compute - Y^{(1)}(x) = F_yy (x, y)^{-1} F_yx (x, y)
505  BaseVector neg_Y_x(m * n);
506  Base logdet;
507  int signdet = CppAD::LuSolve(m, n, F_yy, F_yx, neg_Y_x, logdet);
508  if( signdet == 0 )
509  return signdet;
510
511  // compute hes = F_xx (x, y) + F_xy (x, y) Y^{(1)}(x)
512  for(i = 0; i < n; i++)
513  { for(j = 0; j < n; j++)
514  { hes[i * n + j] = F_hes[ i*(n+m) + j ];
515  for(k = 0; k < m; k++)
516  hes[i*n+j] -= F_hes[i*(n+m) + k+n] * neg_Y_x[k*n+j];
517  }
518  }
519  return signdet;
520 }
521
523
524 # endif
Check that exp is true, if not print msg and terminate execution.
Class used to hold function objects.
int LuSolve(size_t n, size_t m, const FloatVector &A, const FloatVector &B, FloatVector &X, Float &logdet)
Definition: lu_solve.hpp:266
int opt_val_hes(const BaseVector &x, const BaseVector &y, Fun fun, BaseVector &jac, BaseVector &hes)
Computing Jabobians and Hessians of Optimal Values.
VectorBase Jacobian(const VectorBase &x)
calculate entire Jacobian