Given an ADFun<Base> object f, there is a corresponding AD of Base operation sequence.
This operation sequence defines a function \( F : B^n \rightarrow B^m \),
where \( B \) is the space corresponding to objects of type Base,
\( n \) is the size of the domain space, and
\( m \) is the size of the range space.
We refer to \( F \) as the AD function corresponding to
the operation sequence stored in the object f.
(See the FunCheck discussion for possible differences between \( F(x) \)
and the algorithm that defined the operation sequence.)

An object is called an AD of Base object if its type is either
AD<Base> (see the default and copy constructors) or
VecAD<Base>::reference (see VecAD) for some Base type.

If Base is a type, the AD levels above Base are the following sequence of types:
    AD<Base> , AD< AD<Base> > , AD< AD< AD<Base> > > , ...

A function \( f : B \rightarrow B \) is referred to as a Base function
if Base is a C++ type that represents elements of
the domain and range space of f; i.e., elements of \( B \).

If x is an AD<Base> object, Base is referred to as the base type for x.

The j-th elementary vector \( e^j \in B^m \) is defined by
\[
e_i^j = \left\{ \begin{array}{ll}
1 & {\rm if} \; i = j \\
0 & {\rm otherwise}
\end{array} \right.
\]

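The definition above can be sketched in C++ as follows. This is only an illustration: the function name elementary_vector is hypothetical, the space \( B \) is represented by double, and indices are assumed to be zero-based.

```cpp
#include <cstddef>
#include <vector>

// Build the j-th elementary vector e^j in B^m, with B represented by double.
// Indices are zero-based here (0 <= j < m); this is an assumption of the sketch.
std::vector<double> elementary_vector(std::size_t m, std::size_t j)
{
    std::vector<double> e(m, 0.0); // every component starts at zero
    if (j < m)
        e[j] = 1.0;                // component i equals 1 only when i == j
    return e;
}
```

For example, elementary_vector(3, 1) yields the vector (0, 1, 0).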
An atomic Type operation is an operation that has a Type result
and is not made up of other more basic operations.

A sequence of atomic Type operations is called a Type operation sequence.
A sequence of atomic AD of Base operations is referred to as
an AD of Base operation sequence.
The abbreviated notation AD operation sequence is often used
when it is not necessary to specify the base type.

Suppose that x and y are Type objects and
the result of x < y has type bool (where Type is not the same as bool).
If one executes the following code
    if( x < y )
        y = cos(x);
    else
        y = sin(x);
the choice above depends on the values of x and y,
and the two choices result in different Type operation sequences.
In this case, we say that the Type operation sequence depends on x and y.

Suppose that i and n are size_t objects,
and x[i], y are Type objects, where Type is different from size_t.
The Type sequence of operations corresponding to
    y = Type(0);
    for(i = 0; i < n; i++)
        y += x[i];
does not depend on the value of x or y.
In this case, we say that the Type operation sequence
is independent of y and the elements of x.

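To make the difference between a dependent and an independent operation sequence concrete, here is a minimal sketch. The struct Traced is a hypothetical stand-in for Type that logs the name of each atomic operation; it is not the AD<Base> class, and the function names branch_sequence and sum_sequence are assumptions made for this illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for "Type": it carries a value and appends the name
// of each atomic operation to a shared log. This is not the AD<Base> class.
struct Traced {
    double value;
    std::vector<std::string>* log;
};

// The comparison has a bool result, so it is not a Traced operation
// and nothing is logged for it.
bool operator<(const Traced& a, const Traced& b) { return a.value < b.value; }

Traced cos(const Traced& a)
{
    a.log->push_back("cos");
    return {std::cos(a.value), a.log};
}

Traced sin(const Traced& a)
{
    a.log->push_back("sin");
    return {std::sin(a.value), a.log};
}

Traced& operator+=(Traced& a, const Traced& b)
{
    a.log->push_back("add");
    a.value += b.value;
    return a;
}

// Branch on x < y: the logged sequence depends on the values of x and y.
std::vector<std::string> branch_sequence(double xv, double yv)
{
    std::vector<std::string> log;
    Traced x{xv, &log}, y{yv, &log};
    if (x < y)
        y = cos(x);
    else
        y = sin(x);
    return log;
}

// Summation loop: the logged sequence depends on n but not on the values.
std::vector<std::string> sum_sequence(const std::vector<double>& xv)
{
    std::vector<std::string> log;
    Traced y{0.0, &log};
    for (std::size_t i = 0; i < xv.size(); i++)
        y += Traced{xv[i], &log};
    return log;
}
```

Calling branch_sequence with different arguments can log either "cos" or "sin", while sum_sequence logs the same n additions for any input values of a given length.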
All Base objects are parameters.
An AD<Base> object u is currently a parameter if
its value does not depend on the value of
an Independent variable vector for an active tape.
If u is a parameter, the function Parameter(u) returns true
and Variable(u) returns false.

A vector of bool can represent a vector of sets using one bit per element.
(Some vectors of bool use one byte per element, but
vectorBool is an example class that uses one bit per element.)
The problem is that this representation uses one bit for every possible element,
whether it is present or not.
A vector of std::set<size_t> does not
represent the elements that are not present,
but it uses about three size_t values
for each element that is present.
For example, if size_t uses 32 bits,
a vector of std::set<size_t> uses
about 100 bits for each element that is present in the
vector of sets.
Thus, a vector of std::set<size_t> should be more efficient for
very sparse matrix representations.

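The storage comparison above can be turned into a rough back-of-the-envelope calculation. The sketch below only encodes the figures stated in the text (one bit per possible element, about three size_t values per present element); the function names are hypothetical and real memory use will vary with the implementation.

```cpp
#include <cstddef>

// Rough storage estimate, in bits, for an m by n pattern stored as a vector
// of bool that uses one bit per element, whether it is present or not.
std::size_t bool_pattern_bits(std::size_t m, std::size_t n)
{
    return m * n;
}

// Rough storage estimate, in bits, for a pattern stored as a vector of
// std::set<size_t>, using the text's figure of about three size_t values per
// element that is present. nnz is the number of present elements and
// size_t_bits is the width of size_t in bits.
std::size_t set_pattern_bits(std::size_t nnz, std::size_t size_t_bits)
{
    return 3 * size_t_bits * nnz;
}
```

Under these assumptions, a 1000 by 1000 identity pattern needs about 1,000,000 bits as a vector of bool and about 96,000 bits as a vector of sets with a 32-bit size_t.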
Given a matrix \( A \in B^{m \times n} \),
a vector of bool \( B \) of length \( m \times n \)
is a sparsity pattern for \( A \) if
for \( i = 0, \ldots , m-1 \) and \( j = 0 , \ldots , n-1 \),
\[
A_{i,j} \neq 0
\; \Rightarrow \;
B_{i * n + j} = {\rm true}
\]
Given two sparsity patterns \( B \) and \( C \) for a matrix \( A \),
we say that \( B \) is more efficient than \( C \) if
\( B \) has fewer true elements than \( C \).
For example, if \( A \) is the identity matrix,
\[
B_{i * n + j} = (i = j)
\]
defines an efficient sparsity pattern for \( A \).

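As an illustration of the vector of bool definition, the following sketch builds the identity pattern and checks the defining implication. It assumes a square identity matrix, row-major storage, and double as the element type; the function names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Row-major sparsity pattern for the n by n identity matrix:
// entry i * n + j is true exactly when i == j.
std::vector<bool> identity_pattern(std::size_t n)
{
    std::vector<bool> pattern(n * n, false);
    for (std::size_t i = 0; i < n; i++)
        pattern[i * n + i] = true;
    return pattern;
}

// Check the defining property: A(i,j) != 0 implies pattern[i * n + j] is true.
// A holds an m by n matrix in row-major order, so one index k covers i * n + j.
bool is_sparsity_pattern(const std::vector<double>& A,
                         const std::vector<bool>& pattern)
{
    if (A.size() != pattern.size())
        return false;
    for (std::size_t k = 0; k < A.size(); k++)
        if (A[k] != 0.0 && !pattern[k])
            return false;
    return true;
}

// Number of true elements; fewer means a more efficient pattern.
std::size_t count_true(const std::vector<bool>& pattern)
{
    std::size_t count = 0;
    for (bool b : pattern)
        count += b ? 1 : 0;
    return count;
}
```

A pattern with every element true is also a valid sparsity pattern for the identity matrix, but count_true shows that it is less efficient than the one returned by identity_pattern.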
Given a matrix \( A \in B^{m \times n} \),
a vector of sets \( S \) of length \( m \)
is a sparsity pattern for \( A \) if
for \( i = 0, \ldots , m-1 \) and \( j = 0 , \ldots , n-1 \),
\[
A_{i,j} \neq 0
\; \Rightarrow \; j \in S_i
\]
Given two sparsity patterns \( S \) and \( T \) for a matrix \( A \),
we say that \( S \) is more efficient than \( T \) if
\( S \) has fewer elements than \( T \).
For example, if \( A \) is the identity matrix,
\[
S_i = \{ i \}
\]
defines an efficient sparsity pattern for \( A \).

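The vector of sets definition can be sketched in the same way. The code below is an illustration only: the alias SetPattern and the function names are hypothetical, and the matrix is assumed to be stored in row-major order with double elements.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// One set of column indices per row.
typedef std::vector< std::set<std::size_t> > SetPattern;

// Pattern for the n by n identity matrix: S_i = { i }.
SetPattern identity_set_pattern(std::size_t n)
{
    SetPattern S(n);
    for (std::size_t i = 0; i < n; i++)
        S[i].insert(i);
    return S;
}

// Check the defining property: A(i,j) != 0 implies j is in S_i.
// A holds an m by n matrix in row-major order, where m = S.size().
bool is_set_sparsity_pattern(const std::vector<double>& A,
                             std::size_t n,
                             const SetPattern& S)
{
    std::size_t m = S.size();
    if (A.size() != m * n)
        return false;
    for (std::size_t i = 0; i < m; i++)
        for (std::size_t j = 0; j < n; j++)
            if (A[i * n + j] != 0.0 && S[i].count(j) == 0)
                return false;
    return true;
}

// Total number of elements over all the sets; fewer means more efficient.
std::size_t count_elements(const SetPattern& S)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < S.size(); i++)
        count += S[i].size();
    return count;
}
```

For the identity matrix, the pattern from identity_set_pattern stores only n elements, which is the smallest count any valid pattern for that matrix can have.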
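A tape is created and becomes active after a call of the form
    Independent(x)
All operations that depend on the elements of x are
recorded on this active tape.

The operation sequence recorded on a tape can be transferred to a function object using the syntax
    ADFun<Base> f(x, y)
or using the syntax (see f.Dependent(x, y))
    f.Dependent(x, y)
After such a transfer, the tape becomes inactive.

While the tape is active, the elements of x are referred to
as the independent variables for the tape.
When the tape becomes inactive,
the corresponding objects become parameters.
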
Suppose \( X : B \rightarrow B^n \) is a
\( p \) times continuously differentiable function
in some neighborhood of zero.
For \( k = 0 , \ldots , p \),
we use the column vector \( x^{(k)} \in B^n \) for the k-th order
Taylor coefficient corresponding to \( X \),
which is defined by
\[
x^{(k)} = \frac{1}{k !} \frac{d^k}{d t^k} X(0)
\]
It follows that
\[
X(t) = x^{(0)} + x^{(1)} t + \cdots + x^{(p)} t^p + R(t)
\]
where the remainder \( R(t) \) divided by \( t^p \)
converges to zero as \( t \) goes to zero.

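As a worked instance of this definition, consider the scalar case \( n = 1 \) with \( X(t) = \exp(t) \), for which every derivative at zero is 1 and hence \( x^{(k)} = 1 / k! \). The sketch below is an illustration under that assumption, with B represented by double and hypothetical function names; it evaluates the Taylor polynomial and the ratio \( |R(t)| / t^p \).

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Taylor coefficients x^(k) = X^(k)(0) / k! for X(t) = exp(t), k = 0, ..., p.
// Every derivative of exp at zero is 1, so x^(k) = 1 / k!.
std::vector<double> exp_taylor_coefficients(std::size_t p)
{
    std::vector<double> x(p + 1);
    double factorial = 1.0;
    for (std::size_t k = 0; k <= p; k++) {
        if (k > 0)
            factorial *= double(k);
        x[k] = 1.0 / factorial;
    }
    return x;
}

// Evaluate x^(0) + x^(1) t + ... + x^(p) t^p using Horner's rule.
double taylor_polynomial(const std::vector<double>& x, double t)
{
    double sum = 0.0;
    for (std::size_t k = x.size(); k-- > 0; )
        sum = sum * t + x[k];
    return sum;
}

// The ratio |R(t)| / t^p for X(t) = exp(t) and t > 0, where R(t) is the
// difference between X(t) and its order p Taylor polynomial.
double remainder_ratio(std::size_t p, double t)
{
    double R = std::exp(t) - taylor_polynomial(exp_taylor_coefficients(p), t);
    return std::fabs(R) / std::pow(t, double(p));
}
```

Evaluating remainder_ratio for a fixed p at decreasing positive values of t gives decreasing ratios, which is consistent with the remainder statement above.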
An AD<Base> object u is a variable if
its value depends on an independent variable vector for
a currently active tape.
If u is a variable, Variable(u) returns true and
Parameter(u) returns false.
For example, directly after the code sequence
    Independent(x);
    AD<double> u = x[0];
the AD<double> object u is currently a variable.
Directly after the code sequence
    Independent(x);
    AD<double> u = x[0];
    u = 5;
u is currently a parameter (not a variable).

Note that we often drop the word currently and
just refer to an AD<Base> object as a variable or parameter.