OpenFOAM MPI issues #24

Closed
bedroge opened this issue Sep 30, 2020 · 7 comments

@bedroge
Collaborator

bedroge commented Sep 30, 2020

Saw this in the output of the OpenFOAM test:

mpirun: Error: unknown option "-ppn"
Type 'mpirun --help' for usage.

Triggered by:

mpirun -np $NP -ppn $PPN -hostfile hostlist snappyHexMesh -parallel -overwrite 2>&1 | tee log.snappyHexMesh

The script does not stop on this error; it just continues.
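
A likely reason the script carries on: the command is piped into tee, so the shell reports tee's exit status and the mpirun failure is lost. Below is a minimal sketch of one way to keep the real status, in plain POSIX sh. `run_logged` is a hypothetical helper, not part of the test script or of OpenFOAM.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not from the test script):
# run a command, tee its combined stdout/stderr to a log file, and
# return the command's own exit status instead of tee's.
run_logged() {
    log_file=$1
    shift
    status_file=$(mktemp)
    # The left-hand side of the pipe runs in a subshell, so the exit
    # status is passed back through a temporary file.
    {
        cmd_rc=0
        "$@" 2>&1 || cmd_rc=$?
        echo "$cmd_rc" > "$status_file"
    } | tee "$log_file"
    cmd_status=$(cat "$status_file")
    rm -f "$status_file"
    return "$cmd_status"
}
```

In the test script this could be used as `run_logged log.snappyHexMesh mpirun -np "$NP" snappyHexMesh -parallel -overwrite || exit 1`. In bash, `set -o pipefail` gives much the same effect with less code. As for the option itself: as far as I know, -ppn belongs to Hydra-style launchers (MPICH, Intel MPI), while the mpirun from Open MPI, which the foss toolchain provides, expects something like `--map-by ppr:N:node` for a per-node process count.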

@bedroge
Collaborator Author

bedroge commented Sep 30, 2020

Something is wrong with that command anyway. Even a plain mpirun -np 2 snappyHexMesh -overwrite -parallel does not work, because of the -parallel flag:

--> FOAM FATAL ERROR: 
Trying to use the dummy Pstream library.
This dummy library cannot be used in parallel mode

@bedroge
Collaborator Author

bedroge commented Oct 1, 2020

Okay, this is weird. Without the -parallel flag, using mpirun doesn't make sense: every rank runs the whole case serially, so you just get duplicated output. But the script that runs the original motorBike tutorial (/cvmfs/pilot.eessi-hpc.org/2020.09/software/x86_64/intel/broadwell/software/OpenFOAM/8-foss-2020a/OpenFOAM-8/tutorials/incompressible/simpleFoam/motorBike/Allrun) does use runParallel for snappyHexMesh:

#!/bin/sh
cd ${0%/*} || exit 1    # Run from this directory

# Source tutorial run functions
. $WM_PROJECT_DIR/bin/tools/RunFunctions

# Copy motorbike surface from resources directory
cp $FOAM_TUTORIALS/resources/geometry/motorBike.obj.gz constant/triSurface/
runApplication surfaceFeatures

runApplication blockMesh

runApplication decomposePar -copyZero
runParallel snappyHexMesh -overwrite

runParallel patchSummary
runParallel potentialFoam
runParallel $(getApplication)

runApplication reconstructParMesh -constant
runApplication reconstructPar -latestTime

@bedroge
Collaborator Author

bedroge commented Oct 1, 2020

@boegel
I looked into this a bit more. The Pstream library can be found in both the mpi and dummy folders:

$ ls /cvmfs/pilot.eessi-hpc.org/2020.09/software/x86_64/intel/broadwell/software/OpenFOAM/8-foss-2020a/OpenFOAM-8/platforms/linux64GccDPInt32Opt/lib/mpi/
libPstream.so  libptscotchDecomp.so

$ ls /cvmfs/pilot.eessi-hpc.org/2020.09/software/x86_64/intel/broadwell/software/OpenFOAM/8-foss-2020a/OpenFOAM-8/platforms/linux64GccDPInt32Opt/lib/dummy/

According to some issues I've found while searching for the error, this could be related to the order of the directories in the $LD_LIBRARY_PATH.

$ echo $LD_LIBRARY_PATH
/cvmfs/pilot.eessi-hpc.org/2020.09/software/x86_64/intel/broadwell/software/OpenFOAM/8-foss-2020a/OpenFOAM-8/platforms/linux64GccDPInt32Opt/lib/mpi:/cvmfs/pilot.eessi-hpc.org/2020.09/software/x86_64/intel/broadwell/software/OpenFOAM/8-foss-2020a/ThirdParty-8/platforms/linux64GccDPInt32/lib/mpi:/home/bob/OpenFOAM/bob-8/platforms/linux64GccDPInt32Opt/lib:/cvmfs/pilot.eessi-hpc.org/2020.09/software/x86_64/intel/broadwell/software/OpenFOAM/8-foss-2020a/site/8/platforms/linux64GccDPInt32Opt/lib:/cvmfs/pilot.eessi-hpc.org/2020.09/software/x86_64/intel/broadwell/software/OpenFOAM/8-foss-2020a/OpenFOAM-8/platforms/linux64GccDPInt32Opt/lib:/cvmfs/pilot.eessi-hpc.org/2020.09/software/x86_64/intel/broadwell/software/OpenFOAM/8-foss-2020a/ThirdParty-8/platforms/linux64GccDPInt32/lib:/cvmfs/pilot.eessi-hpc.org/2020.09/software/x86_64/intel/broadwell/software/OpenFOAM/8-foss-2020a/OpenFOAM-8/platforms/linux64GccDPInt32Opt/lib/dummy

The mpi folder is the first entry and the dummy folder is the last, so the ordering in $LD_LIBRARY_PATH doesn't explain the error. I don't get it...
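
The ordering check done by eye above can be sketched as a small POSIX sh function that walks a colon-separated search path and reports the first directory containing a given library. `first_dir_with_lib` is a hypothetical name; it only models the $LD_LIBRARY_PATH part of the dynamic loader's search and ignores any other source of search directories.

```shell
#!/bin/sh
# Hypothetical helper: print the first directory in a colon-separated
# search path that contains the named library. Returns 1 if none does.
first_dir_with_lib() {
    lib_name=$1
    search_path=$2
    old_ifs=$IFS
    IFS=:
    for dir in $search_path; do
        if [ -e "$dir/$lib_name" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}
```

Calling `first_dir_with_lib libPstream.so "$LD_LIBRARY_PATH"` shows which copy the environment variable alone would select. To see what the loader really picks for a binary, `ldd "$(command -v snappyHexMesh)" | grep Pstream` is probably the more direct check.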

@bedroge
Collaborator Author

bedroge commented Oct 1, 2020

Ah, maybe it's the RPATH in the executable which is messing things up? I do see the dummy folder in there, but not the mpi folder.
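
For context: on Linux, a DT_RPATH entry in a binary is searched before $LD_LIBRARY_PATH (a DT_RUNPATH entry is searched after it), which would explain why the environment ordering has no effect here. A minimal sketch for listing those entries one per line from `readelf -d` output follows. `rpath_entries` is a hypothetical helper, and it assumes the usual GNU readelf line format.

```shell
#!/bin/sh
# Hypothetical helper: read `readelf -d` output on stdin and print each
# RPATH/RUNPATH directory on its own line.
# Assumes lines of the form:
#   0x... (RPATH)   Library rpath: [/dir1:/dir2]
rpath_entries() {
    grep -E '\((RPATH|RUNPATH)\)' \
        | sed 's/.*\[\(.*\)\].*/\1/' \
        | tr ':' '\n'
}
```

Typical use would be `readelf -d "$(command -v snappyHexMesh)" | rpath_entries`, which makes it easy to spot a lib/dummy entry and a missing lib/mpi entry.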

@bedroge
Collaborator Author

bedroge commented Oct 1, 2020

Yep, that's the issue: I've used patchelf to remove this dummy folder from the executable's RPATH, and now it works fine... So now the question is: how does this directory end up in the RPATH / why doesn't the mpi folder get added to the RPATH?
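
The exact patchelf invocation is not recorded in this thread. One plausible way to do it is sketched below, under the assumption that the RPATH is read with `patchelf --print-rpath`, filtered, and written back with `patchelf --set-rpath`. `drop_rpath_entry` is a hypothetical helper that does only the filtering.

```shell
#!/bin/sh
# Hypothetical helper: remove one directory from a colon-separated
# RPATH string and print the result.
#   $1: current RPATH, e.g. the output of `patchelf --print-rpath`
#   $2: directory to drop (must match an entry exactly)
drop_rpath_entry() {
    printf '%s\n' "$1" \
        | tr ':' '\n' \
        | grep -Fxv -- "$2" \
        | paste -sd: -
}

# Assumed usage on a writable copy of the executable ($exe and
# $dummy_dir are placeholders):
#   old_rpath=$(patchelf --print-rpath "$exe")
#   patchelf --set-rpath "$(drop_rpath_entry "$old_rpath" "$dummy_dir")" "$exe"
```

Since the installation lives on a read-only CVMFS mount, an experiment like this would have to be done on a copy of the binary.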

@bedroge bedroge changed the title from "mpirun error for OpenFOAM test script" to "OpenFOAM MPI issues" on Oct 2, 2020
@bedroge
Collaborator Author

bedroge commented Oct 2, 2020

Spack, another installation framework, creates symlinks in the lib dir to all files in the lib/mpi dir, see:
https://github.com/spack/spack/blob/develop/var/spack/repos/builtin/packages/openfoam/common/spack-Allwmake#L33

Since lib appears before lib/dummy in the RPATH, this should fix the issue. I'm now testing this with a modified easyblock...
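
The symlink approach can be sketched in POSIX sh as follows. This is an illustration of the idea only, not the code from Spack or from the easyblock, and `link_mpi_libs` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: for every file in <lib_dir>/mpi, create a relative
# symlink with the same name in <lib_dir>, so that the MPI-enabled
# libraries are found via the lib entry of the RPATH.
# Existing files or links in <lib_dir> are left untouched.
link_mpi_libs() {
    lib_dir=$1
    for src in "$lib_dir"/mpi/*; do
        [ -e "$src" ] || continue
        name=$(basename "$src")
        dest=$lib_dir/$name
        if [ -e "$dest" ] || [ -L "$dest" ]; then
            continue
        fi
        ln -s "mpi/$name" "$dest"
    done
}
```

For the installation discussed here, the argument would be the platforms/linux64GccDPInt32Opt/lib directory of the OpenFOAM-8 tree. Relative link targets keep the links valid if the whole tree is relocated.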

@bedroge
Collaborator Author

bedroge commented Oct 14, 2020

This has been fixed in the easyblock (see easybuilders/easybuild-easyblocks#2196). For the current installation, I've created these symlinks manually.

@bedroge bedroge closed this as completed on Oct 14, 2020