Code compiled but fails to run on the test case #16

Open
SamSalehian opened this issue Jul 9, 2021 · 2 comments


SamSalehian commented Jul 9, 2021

I have compiled the code with OpenMPI (openmpi/4.0.3-intel-pmi2) using Intel MKL (intel_19.0.4.243/mkl) on our local HPC. The code compiles without a problem and generates the executable file (ucns3d_p). However, when I run any of the sample test cases, the code starts by creating the GRID.bnd, GRID.cel, GRID.vrt, and history.txt files, but then crashes with the following errors:

 ------------------------------------------------------------------------
                        ParMETIS Initiated
 ------------------------------------------------------------------------
[dmc32:10910] *** An error occurred in MPI_Comm_rank
[dmc32:10910] *** reported by process [1088290817,1]
[dmc32:10910] *** on communicator MPI_COMM_WORLD
[dmc32:10910] *** MPI_ERR_COMM: invalid communicator
[dmc32:10910] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[dmc32:10910] ***    and potentially your MPI job)
[dmc32:10909] *** An error occurred in MPI_Comm_rank
[dmc32:10909] *** reported by process [1088290817,0]
[dmc32:10909] *** on communicator MPI_COMM_WORLD
[dmc32:10909] *** MPI_ERR_COMM: invalid communicator
[dmc32:10909] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[dmc32:10909] ***    and potentially your MPI job)
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
slurmstepd: error: *** STEP 606430.1 ON dmc32 CANCELLED AT 2021-07-09T12:18:46 ***
srun: error: dmc32: tasks 0-1: Killed
srun: launch/slurm: _step_signal: Terminating StepId=606430.1

I would greatly appreciate any suggestions.
Makefile.txt

Collaborator

TakisCFD commented Jul 9, 2021

Try compiling with the ParMETIS library found in the ARCHER lib directory.
If that does not work, you need to compile ParMETIS yourself with the gcc or Intel compiler and OpenMPI, and link against that build.
The crash occurs because the ParMETIS mesh partitioning does not complete.
Let me know if this solves the issue.
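
A common cause of an `MPI_ERR_COMM: invalid communicator` abort right at ParMETIS start-up is that ParMETIS and the solver were built against different MPI installations. That is an assumption here, not something confirmed in this thread. As a rough check, the sketch below parses `ldd` output for the executable and reports which directories the MPI shared libraries resolve to. The function names and the library-name pattern are hypothetical, and a statically linked ParMETIS (`libparmetis.a`) will not show up in `ldd` at all, so a clean result does not rule out a mismatch.

```python
import re
import subprocess
from pathlib import PurePosixPath

# Heuristic (assumed) pattern for MPI-related shared-library names,
# e.g. libmpi.so.40, libmpi_mpifh.so.40, libmpich.so.12, libopen-rte.so.40.
MPI_NAME = re.compile(r"^lib(mpi|open-(rte|pal))")


def ldd_output(executable):
    """Run `ldd` on the executable and return its text output (Linux only)."""
    result = subprocess.run(
        ["ldd", executable], capture_output=True, text=True, check=True
    )
    return result.stdout


def mpi_libraries(ldd_text):
    """Map each MPI-looking library name to its resolved path (None if not found)."""
    found = {}
    for line in ldd_text.splitlines():
        name, arrow, rest = line.strip().partition(" => ")
        if not arrow or not MPI_NAME.match(name):
            continue  # skip vdso/loader lines and non-MPI libraries
        path = rest.split(" (")[0].strip()  # drop the trailing load address
        found[name] = None if path == "not found" else path
    return found


def mpi_directories(libs):
    """Return the sorted, distinct directories of the resolved MPI libraries."""
    return sorted({str(PurePosixPath(p).parent) for p in libs.values() if p})


def report(ldd_text):
    """Summarise whether the MPI libraries come from a single installation."""
    libs = mpi_libraries(ldd_text)
    if not libs:
        return "no MPI shared libraries listed (static link?)"
    missing = sorted(name for name, path in libs.items() if path is None)
    if missing:
        return "unresolved MPI libraries: " + ", ".join(missing)
    dirs = mpi_directories(libs)
    if len(dirs) > 1:
        return "MPI libraries resolve to several directories: " + ", ".join(dirs)
    return "MPI libraries resolve to one directory: " + dirs[0]
```

On a compute node with the same modules loaded as in the job script, `print(report(ldd_output("./ucns3d_p")))` gives a one-line summary. If more than one directory is reported, rebuilding ParMETIS with the same compiler and OpenMPI as the solver, as suggested above, is the likely next step.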

@SamSalehian
Author

Hello Takis,

Sorry for my late response. I was testing the code on two different HPC systems. On the first one (ERAU VEGA), Intel MPI compilers are available, so I compiled the code just fine and ran a couple of the test cases without any issue. On the second one (DMC ASC), only the GNU ones are available. I tried various options, including the ARCHER lib and a gcc-built ParMETIS, but unfortunately all of them failed. I also tried compiling with the ParMETIS module on DMC (parmetis/4.0.3); all such attempts fail at ParMETIS initialization.

Since I had the other HPC system working with the Intel toolchain, I gave up on compiling everything with gcc. However, I was wondering whether you have specific dependency versions or a container built for gcc compilation?

Regards,

Sam
