MITgcm cluster facility

Parallel job execution  

 

Executing parallel codes with MPICH-GM (the MPI library built for optimized performance over the Myrinet network) requires starting the code through the appropriate form of the mpirun command. The pathnames for mpirun are:

MPICH-GM mpirun commands on myrinet-3 and myrinet-4 nodes

Command  Description
/usr/local/pkg/mpi/mpi-1.2.4..8a-gm-1.5/pgi/bin/mpirun.ch_gm  Launches a parallel run for an MPI code compiled against the GM 1.5 libraries (myrinet-3 compilations).
/usr/local/pkg/mpi/mpi-1.2.4..8a-gm-1.5/g77/bin/mpirun.ch_gm  Launches a parallel run for an MPI code compiled against the GM 1.5 libraries (myrinet-3 compilations).
/usr/local/pkg/mpi/mpi-1.2.4..8a-gm-1.5/intel/bin/mpirun.ch_gm  Launches a parallel run for an MPI code compiled against the GM 1.5 libraries (myrinet-3 compilations).
/usr/local/pkg/mpi/mpi-1.2.5..10/lahey/bin/mpirun.ch_gm  Launches a parallel run for an MPI code compiled against the GM 1.5 libraries (myrinet-3 compilations).
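A job script has to pick one of the pathnames listed above. The fragment below is a minimal sketch of one way to do that, assuming the directory component (pgi, g77, intel, lahey) names the compiler the code was built with. The function name mpirun_for and the variable COMPILER are hypothetical and are not part of the cluster setup.

```shell
#!/bin/sh
# Hypothetical helper: print the mpirun.ch_gm path for a given compiler name.
# The paths are copied from the listing above; the function name is illustrative.
mpirun_for() {
    case "$1" in
        pgi|g77|intel)
            echo "/usr/local/pkg/mpi/mpi-1.2.4..8a-gm-1.5/$1/bin/mpirun.ch_gm" ;;
        lahey)
            echo "/usr/local/pkg/mpi/mpi-1.2.5..10/lahey/bin/mpirun.ch_gm" ;;
        *)
            echo "unknown compiler: $1" >&2
            return 1 ;;
    esac
}

COMPILER=${COMPILER:-pgi}    # assumed default; set it to match your build
MPIRUN=$(mpirun_for "$COMPILER") || exit 1
echo "Using $MPIRUN"
```

Setting COMPILER=lahey before running the fragment would select the Lahey path instead.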
 

Useful MPICH-GM mpirun.ch_gm command line arguments

--gm-kill 5

Ensures that all processes are killed (after a five-second timeout) if any process in a parallel run dies. Note: if the timeout is too short, e.g. one second, this option can leave a job hanging until the PBS wallclock timer expires.

--help

Lists command options.

-np nn

Sets the number of processes to run to nn.

-s

Allows background execution.

-v

Provides verbose output during startup.

-machinefile file

Specifies a machine file, which sets the hosts that the run will use. A machine file is a list of host names - see below for an example.

-pg file

Specifies a process group file, which allows different executables to run together as a single MPI group - see below for an example.

-wd dir

Sets the working directory in which processes execute. The mpirun script on the myrinet-3 and myrinet-4 nodes has been modified so that, when -wd is specified, the working directory for each process is set to dir/rank, where dir is the argument given and rank is a number between 0 and nn-1. This allows each process to use a different directory. Groups of processes can be made to share directories by making the relevant dir/rank entries links to common directories.
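The dir/rank layout used by -wd can be prepared before the run starts. The sketch below creates one subdirectory per rank and, as one assumed arrangement, links the upper half of the ranks to a single shared directory. The names RUNDIR, NP and shared are hypothetical and only illustrate the idea.

```shell
#!/bin/sh
# Sketch: prepare per-rank working directories for "-wd $RUNDIR".
# RUNDIR, NP and the "shared" directory are illustrative names, not site conventions.
RUNDIR=${RUNDIR:-./run}
NP=${NP:-4}

mkdir -p "$RUNDIR/shared"

rank=0
while [ "$rank" -lt "$NP" ]; do
    if [ "$rank" -lt $((NP / 2)) ]; then
        # Lower ranks each get a private directory: $RUNDIR/0, $RUNDIR/1, ...
        mkdir -p "$RUNDIR/$rank"
    else
        # Upper ranks share one directory through a relative symbolic link.
        ln -sfn shared "$RUNDIR/$rank"
    fi
    rank=$((rank + 1))
done

ls -l "$RUNDIR"
```

With the default of four processes, ranks 0 and 1 get private directories while ranks 2 and 3 both resolve to the shared directory. Passing -wd with the same RUNDIR to mpirun.ch_gm would then place each process in its dir/rank entry as described above.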

Example machine file for MPICH-GM with GM 1.5 (myrinet-3, myrinet-4 nodes)

myrinet-3-01
myrinet-3-02
myrinet-3-03
myrinet-3-04
myrinet-3-05
myrinet-3-06
myrinet-3-07
myrinet-3-08
myrinet-3-09
myrinet-3-10

This example specifies a 10-process setup, one process per listed host. The machine file format in GM 1.5 is identical to the plain MPICH format.
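Putting the pieces together, a job script might write the machine file, count its entries to get the process count, and assemble the mpirun.ch_gm command line. The sketch below uses only options described on this page. The executable name ./mitgcmuv, the file name machines.txt and the choice of the PGI mpirun are assumptions. It prints the command rather than running it, so it can be checked away from the cluster.

```shell
#!/bin/sh
# Sketch: build a machine file for myrinet-3-01 .. myrinet-3-10 and
# print the matching launch command (dry run only).
MACHINEFILE=${MACHINEFILE:-machines.txt}   # illustrative file name
MPIRUN=/usr/local/pkg/mpi/mpi-1.2.4..8a-gm-1.5/pgi/bin/mpirun.ch_gm
EXE=./mitgcmuv                             # assumed executable name

# One host name per line, with the node index zero-padded to two digits.
: > "$MACHINEFILE"
i=1
while [ "$i" -le 10 ]; do
    printf 'myrinet-3-%02d\n' "$i" >> "$MACHINEFILE"
    i=$((i + 1))
done

# One process per listed host.
NP=$(wc -l < "$MACHINEFILE" | tr -d ' ')

CMD="$MPIRUN --gm-kill 5 -v -np $NP -machinefile $MACHINEFILE $EXE"
echo "$CMD"    # on the cluster, run the command instead of echoing it
```

Deriving the process count from the machine file keeps -np and the host list in step when nodes are added or removed.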