4.2.8 Communication primitives
Figure 4.7:
Three performance-critical parallel primitives are provided
by the WRAPPER. These primitives are always used to communicate data
between tiles. The figure shows four tiles. The curved arrows
indicate exchange primitives, which transfer data between the overlap
regions at tile edges and the interior regions of nearest-neighbor
tiles. The straight arrows symbolize global sum operations, which
connect all tiles. The global sum operation is a key arithmetic
primitive and can also serve as a synchronization primitive. A
third primitive, barrier, is also provided; it behaves much like the
global sum primitive.
The WRAPPER assumes that optimized communication support may be
available for a small number of communication operations, and that
communication performance can therefore be improved by optimizing
a small number of communication primitives. Three optimizable
primitives are provided by the WRAPPER:
- EXCHANGE This operation is used to transfer data between
interior and overlap regions of neighboring tiles. A number of
different forms of this operation are supported. These different
forms handle:
  - Data type differences. 64-bit and 32-bit fields may be
    handled separately.
  - Bindings to different communication methods. Exchange
    primitives select between shared memory and distributed
    memory communication.
  - Transformation operations required when transporting data
    between different grid regions. Transferring data between faces of
    a cube-sphere grid, for example, involves a rotation of vector
    components.
  - Forward and reverse mode computations. Derivative calculations
    require tangent linear and adjoint forms of the exchange
    primitives.
- GLOBAL SUM The global sum operation is a central arithmetic
operation for the pressure inversion phase of the MITgcm algorithm.
For certain configurations, scaling can be highly sensitive to the
performance of the global sum primitive. This operation is a
collective operation involving all tiles of the simulated domain.
Different forms of the global sum primitive exist for handling:
  - Data type differences. 64-bit and 32-bit fields may be
    handled separately.
  - Bindings to different communication methods. Global sum
    primitives select between shared memory and distributed
    memory communication.
  - Forward and reverse mode computations. Derivative calculations
    require tangent linear and adjoint forms of the global sum
    primitive.
- BARRIER The WRAPPER provides a global synchronization
function called barrier. This is used to synchronize computations
over all tiles. The BARRIER and GLOBAL SUM primitives
have much in common and in some cases use the same underlying code.
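
The roles of the exchange and global sum primitives can be illustrated with a small sketch. The Python below is not WRAPPER code and does not reproduce any MITgcm routine: the names make_tiles, exchange, global_sum and OLX are hypothetical, and the setup (a one-dimensional periodic domain, an overlap width of one cell, all tiles held in a single process) is an assumption chosen only to keep the example short.

```python
# Illustrative sketch only: hypothetical names, not MITgcm/WRAPPER routines.
# Assumes a 1-D periodic domain with all tiles held in one process.

OLX = 1  # assumed overlap (halo) width on each side of a tile


def make_tiles(global_field, n_tiles, olx=OLX):
    """Split a 1-D field into tiles, each padded with empty overlap cells."""
    n = len(global_field) // n_tiles  # interior points per tile
    tiles = []
    for t in range(n_tiles):
        interior = global_field[t * n:(t + 1) * n]
        tiles.append([0.0] * olx + list(interior) + [0.0] * olx)
    return tiles


def exchange(tiles, olx=OLX):
    """Fill each tile's overlap regions from its neighbors' interior edges."""
    n_tiles = len(tiles)
    for t, tile in enumerate(tiles):
        west = tiles[(t - 1) % n_tiles]  # periodic nearest neighbors
        east = tiles[(t + 1) % n_tiles]
        # West overlap takes the easternmost interior cells of the west tile.
        tile[:olx] = west[-2 * olx:-olx]
        # East overlap takes the westernmost interior cells of the east tile.
        tile[-olx:] = east[olx:2 * olx]


def global_sum(tiles, olx=OLX):
    """Sum interior points over all tiles, skipping the overlap cells."""
    partial = [sum(tile[olx:-olx]) for tile in tiles]  # per-tile partial sums
    return sum(partial)  # combined result, identical for every tile
```

Under these assumptions, splitting the field 1.0 to 8.0 into four tiles and calling exchange leaves the first tile holding 8.0, 1.0, 2.0, 3.0 (its two interior values flanked by copies of its neighbors' edge values), while global_sum returns 36.0 because overlap cells are excluded and no point is counted twice. In a distributed memory setting the copies in exchange would become messages between processes and the final sum would become a collective reduction, which is why both operations are natural targets for platform-specific optimization.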
mitgcm-support@mitgcm.org
Copyright © 2006 Massachusetts Institute of Technology
Last update 2011-01-09