How to Build and Run MPI Programs


GNU

  module

    gcc/V.R/openmpi
    gcc/V.R/mvapich

  Notes

    Where V.R is the version and release, as in

      module load gcc/9.2/openmpi

    You can also use gcc/V.R/openmpi-V.R.P, as in

      module load gcc/9.2/openmpi-9.2.0

  Examples location

    /home/hpc/examples/mpi/openmpi/gcc
    /home/hpc/examples/mpi/mvapich/gcc

Intel

  module

    intel/YY/openmpi4
    intel/YY/mvapich

  Notes

    Where YY is the version (year), as in

      module load intel/21/mvapich

    You can also use intel/YY/mvapich-YY.R, as in

      module load intel/21/mvapich-21.4

    Do not use the MPI implementation distributed by Intel, as it fails to run on Hydra (InfiniBand incompatibility);
    hence do not try to use mpiicc, mpiicpc, or mpiifort.

  Examples location

    /home/hpc/examples/mpi/openmpi4/intel
    /home/hpc/examples/mpi/mvapich/intel
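
For illustration, a minimal build with the GNU + OpenMPI stack might look like the following sketch (hello_mpi.c and hello_mpi are placeholder names, not files shipped in the examples directories):

  # load the GNU + OpenMPI module (pick the version you need)
  module load gcc/9.2/openmpi

  # compile with the MPI wrapper compilers: mpicc (C), mpicxx (C++), mpif90 (Fortran)
  mpicc -O2 -o hello_mpi hello_mpi.c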


PGI

  module

    pgi/YY/openmpi
    pgi/YY/openmpi4
    pgi/YY/mvapich

  Notes

    Where YY is the version (year), as in

      module load pgi/19/mvapich

    You can also use pgi/YY/mvapich-YY.R, as in

      module load pgi/19/mvapich-19.9

  Examples location

    /home/hpc/examples/pgi

NVIDIA

  module

    nvidia/YY/openmpi
    nvidia/YY/openmpi4
    nvidia/YY/mvapich

  Notes

    Where YY is the version (year), as in

      module load nvidia/21/mvapich

    You can also use nvidia/YY/mvapich-YY.R, as in

      module load nvidia/21/mvapich-21.9

  Examples location

    /home/hpc/examples/nvidia
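
The PGI/NVIDIA stacks are used the same way; a minimal sketch, again with placeholder file names, assuming the loaded module's MPI wrappers (mpicc, mpicxx, mpif90) invoke the PGI/NVIDIA compilers underneath:

  # load the NVIDIA + MVAPICH module (pick the version you need)
  module load nvidia/21/mvapich

  # compile with the MPI wrapper compilers
  mpicc -O2 -o hello_mpi hello_mpi.c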

Note

  • MPI jobs must request either the orte or the mpich parallel environment, together with the number of slots (CPUs, computing elements, etc.):
    • OpenMPI uses orte, MVAPICH uses mpich,
    • except that the PGI/NVIDIA vendor-supplied OpenMPI uses/needs mpich.
  • The job script should use the environment variable NSLOTS (via $NSLOTS) to access the assigned number of CPUs (slots);
    that number should not be hardwired.
  • The list of nodes set aside for your MPI job is compiled by the job scheduler (GE) and passed to the job script via a machine list file:
    that file is either $PE_HOSTFILE or $TMPDIR/machines.

  • Whether you build or run an MPI program, you must first load the corresponding module before invoking mpirun.

  • I recommend logging which compute nodes your MPI job is using, with the commands listed under "Info" below.


    ORTE (OpenMPI)

      qsub    -pe orte N
      Info    echo using $NSLOTS slots on:
              cat $PE_HOSTFILE
      module  module load XXX
      run     mpirun -np $NSLOTS ./code

    MPICH (MVAPICH)

      qsub    -pe mpich N
      Info    echo using $NSLOTS slots on:
              sort $TMPDIR/machines | uniq -c
      module  module load XXX
      run     mpirun -np $NSLOTS -machinefile $TMPDIR/machines ./code

    where XXX is the right module and N is the number of slots you want your code to use;

    • it can also be specified as "N-M", meaning at least N and at most M CPUs (slots, ...).

  • (warning) This can be confusing, so look at the examples for the compiler/MPI flavor you use, and at the job-script sketch after this list. (warning)
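
    For reference, a minimal sketch of a job script for the orte parallel environment (GNU + OpenMPI); the slot count, module version, job name, and executable are placeholders, and the mpich variant differs only in the -pe line, the machine list file, and the -machinefile option shown in the table above:

      #!/bin/sh
      # request the orte parallel environment with 32 slots (placeholder value)
      #$ -pe orte 32
      #$ -cwd -j y
      #$ -N hello_mpi -o hello_mpi.log
      #
      # load the same module the code was built with
      module load gcc/9.2/openmpi
      #
      # log the assigned slots and compute nodes
      echo using $NSLOTS slots on:
      cat $PE_HOSTFILE
      #
      # run the code on the assigned number of slots
      mpirun -np $NSLOTS ./hello_mpi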

(lightbulb) You can find more information under Submitting Distributed Parallel Jobs with Explicit Message Passing.


Last updated by SGK.
