4. Coding for MITgcm

4.1. Build Tools

Many Open Source projects use the "GNU Autotools" to help streamline the build process for various Unix and Unix-like architectures. For a user, the result is the familiar sequence of "configure" commands (that is, "./configure && make && make install"). For MITgcm, the process is similar. Typical commands are:

  $ genmake2 -mods=../code
  $ make depend
  $ make

The following sections describe the individual steps in the build process.

4.1.1. The genmake2 Utility

(Note: the older genmake has been replaced by genmake2)

The first step in any MITgcm build is to create a Unix-style Makefile which will be parsed by make to specify how to compile the MITgcm source files. For more detailed descriptions of what the make tools are and how they are used, please see the documentation for your make implementation (for example, the GNU make manual).

genmake2 can often be invoked successfully with a command line as simple as:

  $ genmake2 -mods=../code

However, some systems (particularly commercial Unixes that lack a more modern "/bin/sh" implementation or that have shells installed in odd locations) may require an explicit shell invocation such as one of the following:

  $ /usr/bin/sh genmake2 -make=gmake -mods=../code
  $ /opt/gnu/bin/bash genmake2 -ieee -make=/usr/local/bin/gmake -mods=../code

The genmake2 code has been written in a Bourne and BASH (v1) compatible syntax so it should work with most "sh" and all recent "bash" implementations.

As the name implies, genmake2 generates a Makefile. It does so by first parsing the information supplied from the following sources

  1. a genmake_local file in the current directory

  2. directly from command-line options

  3. an "options file" as specified by the command-line option -optfile='FILENAME'

  4. a packages.conf file (in the current directory or in one of the "MODS" directories, see below) which contains the specific list of packages to compile

then checking certain dependency rules (the package dependencies), and finally writing a Makefile based upon the source code that it finds. For convenience within various Unix shells, genmake2 supports both "long"- and "short"-style options. A complete list of the available options can be obtained from:

  $ genmake2 -help

The most important options for genmake2 are:

--optfile=/PATH/FILENAME

This specifies the "options file" that should be used for a particular build. The options file is a convenient and machine-independent way of specifying parameters such as the FORTRAN compiler (FC=), FORTRAN compiler optimization flags (FFLAGS=), and the locations of various platform- and/or machine-specific tools (e.g. MAKEDEPEND=). As with genmake2, all options files should be written to be compatible with Bourne-shell ("sh" or "BASH v1") syntax. Examples of various options files can be found in $ROOTDIR/tools/build_options.
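As an illustration only, a very small options file might look like the sketch below. The variable names FC, FFLAGS and MAKEDEPEND are the ones mentioned above; the values assigned to them are assumptions chosen for the example and will differ from platform to platform. The files shipped in $ROOTDIR/tools/build_options remain the authoritative examples.

```shell
# Hypothetical options file (illustrative values only).
# Plain Bourne-shell variable assignments, nothing else.

# FORTRAN compiler (assumed here to be gfortran)
FC=gfortran

# FORTRAN compiler optimization flags (illustrative value)
FFLAGS='-O2'

# Location or name of a platform-specific tool (illustrative value)
MAKEDEPEND=makedepend
```

If such a file were saved under the hypothetical name ./my_optfile, it would be passed to genmake2 with -optfile=./my_optfile.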

If no "optfile" is specified (either through the command line or the environment variable), genmake2 will try to make a reasonable guess from the list provided in $ROOTDIR/tools/build_options. The method used for making this guess is to first determine the combination of operating system and hardware (e.g. "linux_ia32") and then find a working Fortran compiler within the user's path. When these three items have been identified, genmake2 will try to find an optfile that has a matching name.
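The guessing procedure can be pictured with the rough sketch below. It is not the code used by genmake2: the helper names guess_platform and find_fortran, the list of candidate compilers, and the way the pieces are joined into a name are all assumptions made for illustration, and the normalisation of hardware names (for example to "ia32") is not reproduced.

```shell
#!/bin/sh
# Sketch only: guess_platform and find_fortran are hypothetical helpers.

# Print "<os>_<hardware>" in lower case, built from uname output.
guess_platform() {
    os=$(uname -s | tr '[:upper:]' '[:lower:]')
    hw=$(uname -m | tr '[:upper:]' '[:lower:]')
    printf '%s_%s\n' "$os" "$hw"
}

# Print the first candidate compiler that is found in $PATH.
# Returns non-zero if none of the candidates is found.
find_fortran() {
    for fc in "$@"; do
        if command -v "$fc" >/dev/null 2>&1; then
            printf '%s\n' "$fc"
            return 0
        fi
    done
    return 1
}

platform=$(guess_platform)
# The candidate list is an assumption made for this example.
if compiler=$(find_fortran gfortran ifort f77); then
    echo "would look for an optfile named like: ${platform}_${compiler}"
else
    echo "no Fortran compiler found in PATH for platform ${platform}"
fi
```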

Everyone is encouraged to submit their options files to the MITgcm project for inclusion. We are particularly grateful for options files tested on new or unique platforms!

-adof=/path/to/file, -adoptfile=/path/to/file

This option specifies the "adjoint" or automatic differentiation options file to be used. The file is analogous to the "optfile" defined above but it specifies information for the AD build process. The default file is located in $ROOTDIR/tools/adjoint_options/adjoint_default and it defines the "TAF" and "TAMC" compilers. An alternate version is also available at $ROOTDIR/tools/adjoint_options/adjoint_staf that selects the newer "STAF" compiler. As with any compiler, it is helpful to have its directory listed in your $PATH environment variable.

-mods=DIR, -mods='DIR1 [DIR2 ...]'

This option specifies a list of directories containing "modifications". These directories contain files whose names may (or may not) match files in the main MITgcm source tree; where the names match, the source in the "MODS" directory overrides the identically-named source in the main tree. The order of precedence for this "name-hiding" is as follows:

  • "MODS" directories (in the order given)

  • Packages either explicitly specified or provided by default (in the order given)

  • Packages included due to package dependencies (in the order that the package dependencies are parsed)

  • The "standard dirs" (which may have been specified by the "-standarddirs" option)
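The "name-hiding" rule amounts to a first-match search through an ordered list of directories. The sketch below illustrates that idea under stated assumptions: first_match is a hypothetical helper (it is not part of genmake2), and the directory and file names are throw-away stand-ins created just for the demonstration.

```shell
#!/bin/sh
# Sketch only: first_match is a hypothetical helper, not part of genmake2.
# Usage: first_match FILENAME DIR1 [DIR2 ...]
# Prints the path of FILENAME in the first directory that contains it.
first_match() {
    name=$1
    shift
    for dir in "$@"; do
        if [ -f "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1
}

# Demonstration: "code" stands in for a MODS directory and "std" for a
# standard source directory (both names are illustrative).
demo=$(mktemp -d)
mkdir -p "$demo/code" "$demo/std"
: > "$demo/code/SIZE.h"
: > "$demo/std/SIZE.h"
: > "$demo/std/other.F"

first_match SIZE.h  "$demo/code" "$demo/std"   # the copy in code/ is chosen
first_match other.F "$demo/code" "$demo/std"   # only std/ has this file

rm -rf "$demo"
```

Listing the directories in a different order changes which copy is found, which is why the order given to -mods matters.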

-pgroups=/PATH/FILENAME

This option specifies the file where package groups are defined. If not set, the package-groups definition will be read from $ROOTDIR/pkg/pkg_groups.

This file also contains the default list of packages (defined as the group "default_pkg_list") which is used when no specific package list (file: packages.conf) is found in the current directory or in any "MODS" directory.

-pdepend=/PATH/FILENAME

This specifies the dependency file used for packages. If not specified, the default dependency file is $ROOTDIR/pkg/pkg_depend. The syntax for this file is parsed on a line-by-line basis where each line contains either a comment ("#") or a simple "PKGNAME1 (+|-)PKGNAME2" pairwise rule where the "+" or "-" symbol specifies a "must be used with" or a "must not be used with" relationship, respectively. If no rule is specified, then it is assumed that the two packages are compatible and will function either with or without each other.
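To make the rule syntax concrete, the following sketch reads rules of the form described above and prints what each one means. It is an illustration only: describe_rules is a hypothetical helper rather than the parser inside genmake2, and the package names in the sample input are placeholders, not entries taken from the real pkg_depend file.

```shell
#!/bin/sh
# Sketch only: describe_rules is a hypothetical helper.
# Reads "PKGNAME1 (+|-)PKGNAME2" rules on standard input and prints one
# sentence per rule. Blank lines and "#" comments are skipped.
describe_rules() {
    while read -r pkg1 rule rest; do
        case "$pkg1" in
            ''|'#'*) continue ;;
        esac
        case "$rule" in
            +?*) echo "$pkg1 must be used with ${rule#+}" ;;
            -?*) echo "$pkg1 must not be used with ${rule#-}" ;;
            *)   echo "unrecognized rule: $pkg1 $rule" >&2 ;;
        esac
    done
}

# Placeholder package names, for illustration only.
describe_rules <<'EOF'
# example rules
pkg_a +pkg_b
pkg_c -pkg_d
EOF
```

Any pair of packages that does not appear in a rule is simply not mentioned in the output, matching the "compatible by default" convention.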

-make=/path/to/gmake

Due to the poor handling of soft-links and other bugs common with the make versions provided by commercial Unix vendors, GNU make (sometimes called gmake) should be preferred. This option provides a means for specifying the make program to be used.

A successful run of genmake2 will produce a Makefile, a PACKAGES_CONFIG.h file, and various convenience files used for the automatic differentiation process.

In general, it is best to use genmake2 on a "clean" directory that is free of all source (*.[F,f],*.[F,f]90) and header (*.h,*.inc) files. Generally, this can be accomplished in an "un-clean" directory by running "make Clean" followed by "make makefile".

4.1.2. Using the Makefile

Once a Makefile has been created using genmake2, one can build a "standard" (forward simulator) executable using:

  $ make Clean
  $ make depend
  $ make

The "make Clean" step will remove any stale source files, include files, and links. It is strongly recommended for "un-clean" directories which may contain the (perhaps partial) results of previous builds. Such "debris" can interfere with the next stage of the build. A more aggressive cleaning option, "make CLEAN", can be used to also remove the executable and output files from a previous run.

The "make depend" step will create a large number of symbolic links from the local directory to the source file locations. It also parses these files and creates an extensive list of dependencies within the Makefile itself. The links that exist at this stage are mostly "large F" files (*.F and *.F90) that need to be processed by a C preprocessor ("CPP"). Since "make depend" edits the Makefile, it is important not to skip this step!

The final "make" invokes the C preprocessor to produce the "little f" files (*.f and *.f90) and then compiles them to object code using the specified FORTRAN compiler and options. An intermediate script is often used during this stage to further process (usually, make simple substitutions) custom definitions such as variable types within the source files. This additional stage is necessary in order to overcome some of the inconsistencies in the sizes of objects (bytes) between different compilers. The result of the build process is an executable with the name mitgcmuv.

In addition to the forward simulator described above, the Makefile also has a number of targets that can be used to produce various adjoint and tangent-linear builds for optimization and other parameter-sensitivity problems. The additional targets within the Makefile are:

make adall

This target produces an mitgcmuv_ad executable using the taf or staf adjoint compiler. See the genmake2 "-adof" option for compiler selection.

make ftlall

Similar to make adall above, this produces...

Please report any compilation failures or other build problems to the list.

4.2. The Verification Suite

The MITgcm CVS tree (within the $ROOTDIR/verification/ directory) includes many (> 90) examples intended for regression testing. Each one of these test-experiment directories contains "known-good" output files along with all the input (including both code and data files) required for their re-calculation. Also included in $ROOTDIR/verification/ is the shell script testreport to perform regression tests.

4.2.1. Test-experiment Directory Content

Each test-experiment directory (TESTDIR) contains several standard subdirectories and files which testreport recognizes and uses when running a regression test. The directories/files that testreport uses are different for a forward test and an adjoint test (testreport -adm) and some test-experiments are set-up for only one type of regression test whereas others allow both types of tests (forward and adjoint).

In addition, some test-experiments allow multiple tests to be performed with the same MITgcm executable but different parameters and input files. These have a primary input set-up (input/ or input_ad/) with corresponding results (results/output.txt or results/output_adm.txt), and one or several secondary input set-ups (input.OTHER/ or input_ad.OTHER/) with corresponding results (results/output.OTHER.txt or results/output_adm.OTHER.txt).

directory TESTDIR/results/

contains reference standard output used for test comparison. results/output.txt and results/output_adm.txt correspond respectively to the primary forward and adjoint tests run on the reference platform (currently baudelaire.csail.mit.edu) on one processor (no MPI, single thread) using the reference compiler (currently the GNU Fortran compiler gfortran). The presence of these files determines whether testreport tests or skips this test-experiment. Reference standard output for secondary tests (results/output.OTHER.txt or results/output_adm.OTHER.txt) is also expected here.

The test comparison involves a few model output variables. By default, for a forward test these are the 2-D solver initial residual (cg2d_init_res) and the monitor output of the 3-D state variables (T,S,U,V); for an adjoint test they are the cost function and the gradient check. However, some test-experiments use package-specific variable/monitor output, as specified in the file TESTDIR/input[_ad][.OTHER]/tr_checklist.

directory TESTDIR/build/

initially empty directory where testreport will build the MITgcm executable for forward and adjoint tests. It might contain an experiment-specific genmake_local file (see Section 4.1.1).

Note that the original code[_ad]/SIZE.h_mpi is not directly used as "SIZE.h" to build an MPI executable; instead, a local copy build/SIZE.h.mpi is derived from code[_ad]/SIZE.h_mpi by adjusting the number of processors (nPx,nPy) according to NUMBER_OF_PROCS (see Section 4.2.2, testreport -MPI); it is then linked to "SIZE.h" (ln -s SIZE.h.mpi SIZE.h) before building the MPI executable.

directory TESTDIR/code/

contains the test-experiment specific source code used to build the MITgcm executable (mitgcmuv) for forward-test (using genmake2 -mods=../code).

It can also contain specific source files with the suffix "_mpi" to be used in place of the corresponding file (without suffix) for an MPI test (see Section 4.2.2). The presence or absence of SIZE.h_mpi determines whether or not an MPI test on this test-experiment is performed or skipped.

directory TESTDIR/code_ad/

contains the test-experiment specific source code used to build the MITgcm executable (mitgcmuv_ad) for adjoint-test (using genmake2 -mods=../code_ad). It can also contain specific source files with the suffix "_mpi" (see above).

directory TESTDIR/input/

contains the input and parameter files used to run the primary forward test of this test-experiment.

It can also contain specific parameter files with the suffix ".mpi" to be used in place of the corresponding file (without suffix) for MPI test, or with suffix ".mth" to be used for multi-threaded test (see Section 4.2.2). The presence or absence of eedata.mth determines whether or not a multi-threaded test on this test-experiment is performed or skipped.

To save disk space and reduce download time, multiple copies of the same input file are avoided by using a shell script prepare_run. When such a script is found in TESTDIR/input/, testreport runs this script in directory TESTDIR/run/ after linking all the input files from TESTDIR/input/.

directory TESTDIR/input_ad/

contains the input and parameter files used to run the primary adjoint test of this test-experiment. It can also contain specific parameter files with the suffix ".mpi" and shell script prepare_run as described above.

directory TESTDIR/input.OTHER/

contains the input and parameter files used to run the secondary OTHER forward test of this test-experiment. It can also contain specific parameter files with suffix ".mpi" or ".mth" and shell script prepare_run (see above).

The presence or absence of the file eedata.mth determines whether or not a secondary multi-threaded test on this test-experiment is performed or skipped.

directory TESTDIR/input_ad.OTHER/

contains the input and parameter files used to run the secondary OTHER adjoint test of this test-experiment. It can also contain specific parameter files with the suffix ".mpi" and shell script prepare_run (see above).

directory TESTDIR/run/

initially empty directory where testreport will run the MITgcm executable for the primary forward and adjoint tests.

Symbolic links (using command "ln -s") are made for input and parameter files (from ../input/ or from ../input_ad/) and for the MITgcm executable (from ../build/) before the run proceeds. The sequence of links (function linkdata within shell script testreport) for a forward test is:

* link+rename or remove links
       to special files with suffix ".mpi" or ".mth" from ../input/
* link files from ../input/
* execute ../input/prepare_run (if it exists)

The sequence for an adjoint test is similar, with ../input_ad/ replacing ../input/ .

directory TESTDIR/tr_run.OTHER/

directory created by testreport to run the MITgcm executable for secondary "OTHER" forward or adjoint test.

The sequence of links for a forward secondary test is:

* link+rename or remove links
       to special files with suffix ".mpi" or ".mth" from ../input.OTHER/
* link files from ../input.OTHER/
* execute ../input.OTHER/prepare_run (if it exists)
* link files from ../input/
* execute ../input/prepare_run (if it exists)

The sequence for an adjoint test is similar, with ../input_ad.OTHER/ and ../input_ad/ replacing ../input.OTHER/ and ../input/ .
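The ordering of the linking steps for a secondary test can be illustrated with the sketch below. It assumes that a file linked in an earlier step is not replaced by a later step, which is one plausible reading of the sequence above and would explain why the secondary inputs are linked first. The helper link_missing is hypothetical (it is not the linkdata function from testreport), the handling of ".mpi"/".mth" files and of prepare_run is left out, and the file names are placeholders.

```shell
#!/bin/sh
# Sketch only: link_missing is a hypothetical stand-in for the linking step.
# Usage: link_missing SRC_DIR   (run from inside the run directory)
# Links every regular file of SRC_DIR into the current directory and
# skips names that already exist there (assumed behaviour).
link_missing() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        base=${f##*/}
        [ -e "$base" ] || [ -L "$base" ] || ln -s "$f" "$base"
    done
}

# Demonstration in a throw-away tree with placeholder input files.
demo=$(mktemp -d)
mkdir -p "$demo/input" "$demo/input.OTHER" "$demo/tr_run.OTHER"
echo primary   > "$demo/input/data"
echo primary   > "$demo/input/eedata"
echo secondary > "$demo/input.OTHER/data"

(
    cd "$demo/tr_run.OTHER" || exit 1
    link_missing ../input.OTHER   # secondary inputs first
    link_missing ../input         # primary inputs fill in what is missing
    cat data                      # shows the secondary version
    cat eedata                    # shows the primary version
)
rm -rf "$demo"
```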

4.2.2. The testreport Utility

The shell script testreport (in $ROOTDIR/verification/), which was written to work with genmake2, can be used to build different versions of the MITgcm code, run the various examples, compare the output, and (if specified) email the results of each one of these tests to a central repository.

On some systems, the testreport script can be run with a command line as simple as:

  $ cd verification
  $ ./testreport

However, some systems (those lacking or with a broken "/bin/sh") may require an explicit shell invocation such as:

  $ sh ./testreport -t 'exp2 exp4'
  $ /some/path/to/bash ./testreport -t 'ideal_2D_oce lab_sea natl_box'

The testreport script accepts a number of command-line options which can be listed using the -help option. The most important ones are:

-ieee (default) / -noieee

If allowed by the compiler (as defined in the "optfile"), use IEEE arithmetic (genmake2 -ieee). This option, along with the gfortran / gcc compiler, is how the standard results are produced.

-optfile=/PATH/FILENAME, -optfile '/PATH/F1 [/PATH/F2 ...]'

This specifies a list of "options files" that will be passed to genmake2. If multiple options files are used (say, to test different compilers or different sets of options for the same compiler), then each options file will be used with each of the test directories.

-tdir TESTDIR, -tdir 'TDIR1 TDIR2 [...]'

This option specifies the test directory or list of test directories that should be used. Each of these entries should exactly (note: they are case sensitive!) match the names of directories in $ROOTDIR/verification/. If this option is omitted, then all directories that are properly formatted (that is, containing an input sub-directory and a results/output.txt file) will be used.
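The "properly formatted" criterion for a forward test can be expressed as a simple check, sketched below. The helper names is_forward_test_dir and list_forward_tests are hypothetical and the experiment names in the demonstration are placeholders; testreport itself may apply further checks that are not reproduced here.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers illustrating the directory criterion.

# True if DIR has an input sub-directory and a results/output.txt file.
is_forward_test_dir() {
    [ -d "$1/input" ] && [ -f "$1/results/output.txt" ]
}

# Print the names of all sub-directories of VERIFICATION_DIR that qualify.
list_forward_tests() {
    for d in "$1"/*/; do
        d=${d%/}
        if is_forward_test_dir "$d"; then
            printf '%s\n' "${d##*/}"
        fi
    done
}

# Demonstration with placeholder experiment names.
demo=$(mktemp -d)
mkdir -p "$demo/exp_a/input" "$demo/exp_a/results" "$demo/exp_b/input"
: > "$demo/exp_a/results/output.txt"

list_forward_tests "$demo"    # only exp_a qualifies
rm -rf "$demo"
```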

-addr EMAIL, -addr 'EMAIL1 EMAIL2 [...]'

Send the results (namely, output.txt, genmake_local, genmake_state, and Makefile) to the specified email addresses. The results are gzipped, placed in a tar file, MIME encoded, and sent to the specified address. If no email addresses are specified, no mail is sent.

-MPI NUMBER_OF_PROCS, -mpi

If the necessary file (TESTDIR/code/SIZE.h_mpi) exists, then use it (and all TESTDIR/code/*_mpi files) for an MPI-enabled run. The new option (-MPI followed by the maximum number of processors) makes it possible to build and run each test-experiment using a variable number of MPI processors (a multiple of nPx*nPy from TESTDIR/code/SIZE.h_mpi and not larger than NUMBER_OF_PROCS). The short option ("-mpi") can only be used to build and run on 2 MPI processors (equivalent to "-MPI 2").

Note that the use of MPI typically requires a special command option (see "-command" below) to invoke the MPI executable. Examples of PBS scripts using testreport with MPI can be found in the tools/example_scripts directory.

-command='some command to run'

For some tests, particularly MPI runs, a specific command might be needed to run the executable. This option allows a more general command (or shell script) to be invoked. Examples of PBS scripts using testreport with MPI can be found in the tools/example_scripts directory.

For the case where the number of MPI processors varies according to each test-experiment, some key-words within the command-to-run argument will be replaced by their effective value:

TR_NPROC will be replaced by the actual number of MPI processors needed to run the current test-experiment.

TR_MFILE will be replaced by the name of the local file that testreport creates from the full list of machines which "testreport -mf MACHINE_FILE" provides, but truncated to the exact number of machines.
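The keyword replacement can be pictured as a plain text substitution, as in the sketch below. The helpers expand_command and truncate_machine_file are hypothetical, the mpirun command line and the file name mf_local are illustrative assumptions, and the actual implementation inside testreport may differ.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers illustrating the keyword replacement.

# expand_command TEMPLATE NPROC MFILE
# Replaces TR_NPROC and TR_MFILE in TEMPLATE. The "|" delimiter assumes
# that neither replacement value contains a "|" character.
expand_command() {
    printf '%s\n' "$1" | sed -e "s|TR_NPROC|$2|g" -e "s|TR_MFILE|$3|g"
}

# truncate_machine_file FULL_LIST COUNT OUTPUT
# Keeps only the first COUNT machines of FULL_LIST.
truncate_machine_file() {
    head -n "$2" "$1" > "$3"
}

expand_command 'mpirun -np TR_NPROC -machinefile TR_MFILE ./mitgcmuv' 4 mf_local
# prints: mpirun -np 4 -machinefile mf_local ./mitgcmuv
```

The template string is what would be given as the argument of the -command option, so that the same testreport invocation can serve test-experiments that need different processor counts.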

-mf MACHINE_FILE

Used with the -MPI NUMBER_OF_PROCS option to specify the file containing the full list of NUMBER_OF_PROCS machines to use for the MPI runs.

-mth

Compile (with genmake2 -omp) and run with multiple threads (using eedata.mth).

The testreport script will create an output directory tr_NAME_DATE_N/, where NAME defaults to the hostname, DATE is the current date, and N is a suffix number that distinguishes the directory from previous testreport output directories. testreport writes progress to the screen (stdout) and writes reports into the output directory as it runs. In particular, the output directory contains the summary.txt file, which holds a brief comparison of the current output with the "known-good" output. At the end of the testing process, the tr_out.txt file is generated in $ROOTDIR/verification/ as a compact version of the summary.txt file.
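The naming scheme for the output directory can be sketched as follows. Several details are assumptions made for illustration: the helper name next_outdir is hypothetical, the suffix N is taken to start at 0, and the date is formatted as YYYYMMDD. The text above fixes only the overall tr_NAME_DATE_N pattern.

```shell
#!/bin/sh
# Sketch only: next_outdir is a hypothetical helper.
# Usage: next_outdir NAME DATE
# Prints the first tr_NAME_DATE_N (N = 0, 1, 2, ...) that does not yet
# exist in the current directory.
next_outdir() {
    n=0
    while [ -e "tr_${1}_${2}_${n}" ]; do
        n=$((n + 1))
    done
    printf 'tr_%s_%s_%s\n' "$1" "$2" "$n"
}

# Example call using the host name and an assumed YYYYMMDD date format.
next_outdir "$(uname -n)" "$(date +%Y%m%d)"
```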

4.2.3. The do_tst_2+2 Utility

The shell script do_tst_2+2 (in $ROOTDIR/tools/ ) can be used to check the accuracy of the restart procedure.

4.3. Creating MITgcm Packages

Optional parts of code have been separated from the MITgcmUV core driver code and organised into packages. The packaging structure provides a mechanism for maintaining suites of code, specific to particular classes of problems, in a way that is cleanly separated from the generic fluid dynamical engine.

The MITgcmUV packaging structure is described below using generic package names ${pkg}. A concrete example of a package is the code for implementing GM/Redi mixing. This code uses the package name