RonRahaman/openacc-mpi-demos

OpenACC/MPI Demos

These demos use OpenACC and MPI to run SAXPY in various contexts.

Contents

  • saxpy_acc runs SAXPY in parallel on a single GPU.

  • saxpy_mpi runs SAXPY on the CPU with multiple MPI processes. The root rank allocates an array for the global problem and scatters local subarrays to the other ranks. Each rank then computes SAXPY on its subarray, and the subarrays are finally gathered back onto the root to check the results. The length of the global problem must be evenly divisible by the number of MPI ranks.

  • saxpy_acc_mpi runs SAXPY with multiple MPI processes, each of which offloads its local work to a GPU using OpenACC kernels. The MPI scheme is the same as in saxpy_mpi, and the length of the global problem must again be evenly divisible by the number of MPI ranks. Any number (>= 1) of CUDA-capable GPUs may be present; each rank's kernels are scheduled round-robin across the available GPUs.
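The scatter/compute/gather scheme and the round-robin GPU assignment described above come down to simple index arithmetic. Below is a minimal sketch in Python rather than the repository's Fortran: a loop stands in for the MPI ranks, plain lists stand in for the arrays, and names such as local_n and device_id are illustrative, not taken from the repository source.

```python
def saxpy(a, x, y):
    """Elementwise y := a*x + y; stands in for the OpenACC kernel."""
    return [a * xi + yi for xi, yi in zip(x, y)]

n, nranks, num_gpus = 8, 2, 2
assert n % nranks == 0            # the demos require even divisibility
local_n = n // nranks

a = 2.0
x = [1.0] * n
y = [1.0] * n

# "Scatter": rank r owns the contiguous slice [r*local_n, (r+1)*local_n).
# Each simulated rank runs SAXPY on its slice; the in-place update plays
# the role of the final gather back onto the root.
for rank in range(nranks):
    lo = rank * local_n
    y[lo:lo + local_n] = saxpy(a, x[lo:lo + local_n], y[lo:lo + local_n])
    device_id = rank % num_gpus   # round-robin mapping of ranks to GPUs

# The root's verification: with x = y = 1 and a = 2, every element is 3.
assert all(v == 3.0 for v in y)
```

In the real demos the slice bookkeeping is done by MPI_Scatter/MPI_Gather with counts of local_n, which is why the global length must divide evenly.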

Building

Ensure that the mpif90 in your PATH wraps pgfortran (the PGI Fortran compiler). Then run make all.

Running

MPI programs are run as usual, e.g.: mpirun -np 2 ./saxpy_acc_mpi

Profiling

The nvprof users' manual describes how to profile and visualize MPI sessions.

For example, you could run nvprof as follows, where %p is substituted with the corresponding process ID:

mpirun -np 2 nvprof --export-profile saxpy.%p.prof ./saxpy_acc_mpi

After copying the resulting saxpy.*.prof files to your local computer, you can import and visualize them in nvvp (the NVIDIA Visual Profiler).
