Most high-performance computing (HPC) centres require the use
of batch jobs for code execution.
Limits on the maximum available CPU time and memory may prevent
the adjoint code execution from fitting into any of the available
queues. This is a serious limitation for large-scale,
long-time adjoint ocean and climate model integrations.
MITgcm itself allows the total model
integration to be split into sub-intervals through the standard dump
and restart of the full model state.
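
The idea of splitting a forward integration through dump/restart can be illustrated with a minimal Python sketch. It is not MITgcm code: the scalar `step` function, the file name `pickup.json`, and the helper names are all assumptions made for illustration. Each loop iteration plays the role of one batch job that reads the dumped state, integrates one sub-interval, and dumps the state again.

```python
import json
import os
import tempfile


def step(x):
    """One time step of a toy scalar model (hypothetical stand-in for the model)."""
    return 0.9 * x + 1.0


def run(x, n_steps):
    """Integrate n_steps in a single job."""
    for _ in range(n_steps):
        x = step(x)
    return x


def run_in_segments(x0, n_steps, n_segments, workdir):
    """Integrate n_steps as n_segments jobs linked only by a dump file."""
    if n_steps % n_segments != 0:
        raise ValueError("n_steps must be a multiple of n_segments")
    seg_len = n_steps // n_segments
    path = os.path.join(workdir, "pickup.json")  # hypothetical file name
    with open(path, "w") as f:
        json.dump({"step": 0, "state": x0}, f)
    for _ in range(n_segments):
        with open(path) as f:        # restart: read the full model state
            dump = json.load(f)
        x = run(dump["state"], seg_len)
        with open(path, "w") as f:   # dump: write the full model state
            json.dump({"step": dump["step"] + seg_len, "state": x}, f)
    with open(path) as f:
        return json.load(f)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        final = run_in_segments(1.0, 12, 3, workdir)
    # Compare the segmented run with a single uninterrupted run.
    print(final["step"], final["state"] == run(1.0, 12))
```

In this sketch the dump file carries the complete model state, so no information is lost between segments. The adjoint case discussed next is harder because the state needed for a restart also includes derivative information.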
For a similar procedure to run in reverse mode,
the adjoint model requires, in addition to the model state,
the adjoint model state,
i.e. all variables with derivative information
which are needed in an adjoint restart.
This adjoint dump & restart is also termed 'divided adjoint' (DIVA).
For this to work in conjunction with automatic differentiation,
an AD tool needs to perform the following tasks:
- identify an adjoint state, i.e. those sensitivities whose
  accumulation is interrupted by a dump/restart and which influence
  the outcome of the gradient.
  Ideally, this state consists of
  - the adjoint of the model state,
  - the adjoint of other intermediate results (such as control variables,
    cost function contributions, etc.)
  - bookkeeping indices (such as loop indices, etc.)
- generate code for storing and reading adjoint state variables
- generate code for bookkeeping, i.e. maintaining a file
  with index information
- generate a suitable adjoint loop to propagate adjoint values
  for dump/restart with a minimum overhead of adjoint intermediate
  values.
TAF (but not TAMC!)
generates adjoint code which performs the tasks specified above.
It is closely tied to the adjoint multi-level checkpointing.
The adjoint state is dumped (and restarted) at each step of the
outermost checkpointing level and adjoint integration is performed
over one outermost checkpointing interval.
Prior to the adjoint computations, a full forward sweep is performed to
generate the outermost (forward state) tapes and to calculate
the cost function.
In the current implementation, the forward sweep is
immediately followed by the first adjoint leg.
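
The relationship between the outermost checkpointing level and the adjoint legs can be sketched in a few lines of Python. This is a toy illustration, not TAF output: the scalar model `step`, its hand-written adjoint `adstep`, and the names `n_outer`, `n_inner` and `i_outer` are assumptions; only the name `tapelev3` is borrowed from the text. The forward sweep stores the state at the start of each outermost interval, and each adjoint leg recomputes the forward states within its interval before propagating the adjoint variable backward.

```python
DT = 0.1  # hypothetical time step of the toy model


def step(x):
    """One forward step of a toy nonlinear model (assumption, not MITgcm)."""
    return x - DT * x * x


def adstep(x, adx):
    """Adjoint of step: multiply by d(step)/dx = 1 - 2*DT*x at forward state x."""
    return adx * (1.0 - 2.0 * DT * x)


def forward_sweep(x0, n_outer, n_inner):
    """Full forward sweep; store the state at the start of each outermost interval."""
    tapelev3 = []
    x = x0
    for _ in range(n_outer):
        tapelev3.append(x)
        for _ in range(n_inner):
            x = step(x)
    return tapelev3, x  # the toy cost function is the final state


def adjoint_leg(x_start, adx, n_inner):
    """Adjoint over one outermost interval: recompute forward states, then reverse."""
    traj = [x_start]
    for _ in range(n_inner - 1):
        traj.append(step(traj[-1]))
    for x in reversed(traj):
        adx = adstep(x, adx)
    return adx


def divided_adjoint(x0, n_outer, n_inner):
    """Forward sweep followed by one adjoint leg per outermost interval."""
    tapelev3, fc = forward_sweep(x0, n_outer, n_inner)
    adx = 1.0  # derivative of the cost with respect to the final state
    for i_outer in reversed(range(n_outer)):
        adx = adjoint_leg(tapelev3[i_outer], adx, n_inner)
        # A dump/restart of the adjoint state (here just adx and i_outer)
        # would occur at this point.
    return fc, adx


def finite_difference(x0, n_outer, n_inner, h=1e-6):
    """Central finite-difference estimate of d(fc)/d(x0), for comparison."""
    up = forward_sweep(x0 + h, n_outer, n_inner)[1]
    down = forward_sweep(x0 - h, n_outer, n_inner)[1]
    return (up - down) / (2.0 * h)


if __name__ == "__main__":
    fc, grad = divided_adjoint(1.0, 4, 5)
    print(grad, finite_difference(1.0, 4, 5))
```

In this toy case the adjoint state between legs is a single number plus the interval index. In a real model it would comprise the adjoint of the full model state and of any intermediate results that accumulate sensitivities across legs.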
Thus, in theory, the following steps are performed (automatically):
- 1st model call:
  This is the case if file costfinal does not exist.
  S/R mdthe_main_loop is called.
  - calculate the forward trajectory and dump the model state after each
    outermost checkpointing interval to files tapelev3
  - calculate the cost function fc and write it to file
    costfinal
- 2nd and all remaining model calls:
  This is the case if file costfinal does exist.
  S/R adthe_main_loop is called.
  - (the forward run and cost function call are avoided
    since all values are known)
    - if 1st adjoint leg:
      create index file divided.ctrl which contains
      info on the current checkpointing index
    - if not the 1st adjoint leg:
      the adjoint picks up at the checkpointing index recorded in
      divided.ctrl and runs backward over one outermost
      checkpointing interval
  - perform the adjoint leg over that outermost checkpointing interval
  - dump the adjoint state to file snapshot
  - dump the index file divided.ctrl for the next adjoint leg
  - in the last step the gradient is written.
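
The sequence of model calls listed above can be mimicked with a small file-driven Python sketch. It follows the theoretical sequence of the list, with the forward sweep and the first adjoint leg as separate calls, and it is only an illustration: the file names `costfinal`, `tapelev3`, `divided.ctrl` and `snapshot` come from the text, while the JSON format, the numeric suffix on `tapelev3`, the `gradient` file name, the `i_outer` key, and the toy model are assumptions. Each call to `model_call` stands for one batch job.

```python
import json
import os
import tempfile

N_OUTER, N_INNER, DT = 3, 4, 0.1  # hypothetical checkpoint layout and time step


def step(x):
    """One forward step of a toy nonlinear model (assumption, not MITgcm)."""
    return x - DT * x * x


def adstep(x, adx):
    """Adjoint of step, evaluated at the forward state x."""
    return adx * (1.0 - 2.0 * DT * x)


def write(workdir, name, value):
    with open(os.path.join(workdir, name), "w") as f:
        json.dump(value, f)


def read(workdir, name):
    with open(os.path.join(workdir, name)) as f:
        return json.load(f)


def exists(workdir, name):
    return os.path.exists(os.path.join(workdir, name))


def model_call(workdir, x0):
    """One batch job; returns a label describing what it did."""
    if not exists(workdir, "costfinal"):
        # 1st model call: forward sweep, outermost tapes, cost function.
        x = x0
        for i_outer in range(N_OUTER):
            write(workdir, f"tapelev3.{i_outer}", x)
            for _ in range(N_INNER):
                x = step(x)
        write(workdir, "costfinal", x)
        return "forward"
    if exists(workdir, "divided.ctrl"):
        # Later adjoint leg: pick up index and adjoint state from disk.
        i_outer = read(workdir, "divided.ctrl")["i_outer"]
        adx = read(workdir, "snapshot")
    else:
        # 1st adjoint leg: create the index file, seed the adjoint.
        i_outer = N_OUTER - 1
        write(workdir, "divided.ctrl", {"i_outer": i_outer})
        adx = 1.0
    if i_outer < 0:
        return "done"  # all legs were already completed
    # Recompute forward states of this interval, then run the adjoint leg.
    traj = [read(workdir, f"tapelev3.{i_outer}")]
    for _ in range(N_INNER - 1):
        traj.append(step(traj[-1]))
    for x in reversed(traj):
        adx = adstep(x, adx)
    write(workdir, "snapshot", adx)
    write(workdir, "divided.ctrl", {"i_outer": i_outer - 1})
    if i_outer == 0:
        write(workdir, "gradient", adx)  # last step: write the gradient
        return "gradient"
    return "adjoint leg"


def run_all(workdir, x0):
    """Resubmit the 'job' until the gradient has been written."""
    log = []
    while not log or log[-1] not in ("gradient", "done"):
        log.append(model_call(workdir, x0))
    return log


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        print(run_all(workdir, 1.0))
        print(read(workdir, "gradient"))
```

The only information passed between calls lives in the files, which is what allows each leg to run as a separate batch job. A resubmission script would simply repeat the same model call until the gradient appears.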
A few modifications were made to the forward code:
obvious ones, such as adding the corresponding TAF directive
at the appropriate place, and less obvious ones
(avoiding some re-initializations when in an intermediate
adjoint integration interval).
[For TAF-1.4.20 a number of hand-modifications were necessary
to compensate for TAF bugs.
Since we refer to TAF-1.4.26 onwards,
these modifications are not documented here].
mitgcm-support@mitgcm.org
Copyright © 2006
Massachusetts Institute of Technology
Last update 2018-01-23