Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HP/Cray MPI Support (Demote MPI_CXX types to MPI_C) #60

Merged
merged 5 commits into from
Sep 30, 2022

Conversation

HugoStrand
Copy link
Contributor

Dear Pomerol developers,

In order to compile Pomerol on (modern) HP/Cray systems (e.g. https://lumi-supercomputer.eu/) I had to demote the MPI_CXX_DOUBLE_COMPLEX and MPI_CXX_BOOL types to their C equivalents MPI_C_DOUBLE_COMPLEX and MPI_C_BOOL.

With this change (and the null pointer fix) Pomerol compiles and passes tests on LUMI.

Would you please consider merging this to support this class of systems?

Best, Hugo

@krivenko krivenko self-requested a review September 28, 2022 10:34
@krivenko
Copy link
Collaborator

Okay, I would like to dig into this issue a bit further.

Does Cray's MPI implementation claim to conform to the 3.0 standard, while not actually defining the MPI_CXX_* datatypes?
What does CMake say about detected MPI?

@HugoStrand
Copy link
Contributor Author

I understand.

What I have figured out so far is that Cray's MPICH library does define the MPI_CXX_ data types, however they are all set to MPI_DATATYPE_NULL.

> egrep 0x0c0 /opt/cray/pe/mpich/8.1.18/ofi/gnu/9.1/include/mpi.h 
#define MPI_DATATYPE_NULL  ((MPI_Datatype)0x0c000000)
#if 0x4c00100c != 0x0c000000
#if 0x8c000004 != 0x0c000000
#if 0x8c000004 != 0x0c000000
#if 0x4c002042 != 0x0c000000
#define MPI_CXX_BOOL                ((MPI_Datatype)0x0c000000)
#define MPI_CXX_FLOAT_COMPLEX       ((MPI_Datatype)0x0c000000)
#define MPI_CXX_DOUBLE_COMPLEX      ((MPI_Datatype)0x0c000000)
#define MPI_CXX_LONG_DOUBLE_COMPLEX ((MPI_Datatype)0x0c000000)

@krivenko
Copy link
Collaborator

What I have figured out so far is that Cray's MPICH library does define the MPI_CXX_ data types, however they are all set to MPI_DATATYPE_NULL.

This is very nice of them, I have to say.

Anyway, I'd rather explore a different approach. Here is how C/C++ types map to MPI datatypes.

  • bool -> MPI_CXX_BOOL
  • C99's _Bool -> MPI_C_BOOL
  • std::complex<double> -> MPI_CXX_DOUBLE_COMPLEX
  • C99's double _Complex -> MPI_C_DOUBLE_COMPLEX

While std::complex<double> and double _Complex are guaranteed to have the same memory layout, I am not so sure about the boolean types. It is, therefore, a somewhat risky workaround to use MPI_C_BOOL instead of MPI_CXX_BOOL.

What I would do is to define a couple new macros, POMEROL_MPI_BOOL and POMEROL_MPI_DOUBLE_COMPLEX, which are set to either C++ or C datatypes. Now the question is how to check that the C++ datatypes are functional in the detected MPI implementation.

@aeantipov
Copy link
Collaborator

aeantipov commented Sep 28, 2022

I agree with Igor, it looks like this is a non-standard behavior of a specific machine. Indeed the proper workaround is to define Macros and probably put CMake option like "Use_MPI_C_types" with default=OFF to define these as MPI_C types.

@HugoStrand mind updating? Thanks! Good to hear from you

@krivenko
Copy link
Collaborator

@HugoStrand I have added a new CMake option, Use_MPI_C_datatypes (defaults to OFF). Please, try it now.

@HugoStrand
Copy link
Contributor Author

Dear @krivenko @aeantipov,

Thank you for the fix and suggestions. The Use_MPI_C_datatypes cmake flag work-around works perfectly. Thank you @krivenko! 🥇

I agree that this is non-standard behaviour. I have contacted the user support at LUMI and they have promised to bring it up with HPE/Cray. It seems like the MPICH developers have identified the issue as well, see pmodels/mpich#6094 . So this will hopefully not remain an issue forever.

@krivenko the support person pointed out that MPI_CXX_BOOL and MPI_C_BOOL should be binary compatible according to the MPI standard, see tab. 14.2 and 14.4 in,
https://www.mpi-forum.org/docs/mpi-4.0/mpi40-report.pdf

Greetings from Sweden,
Hugo

@krivenko krivenko merged commit 1d8d39f into pomerol-ed:master Sep 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants