Silent crashes with system OpenMPI #519

Closed
vchuravy opened this issue Nov 11, 2021 · 4 comments

Comments

@vchuravy
Member

@luraess, @christophernhill and I have been noticing silent crashes with system OpenMPI on Satori, and locally.

During deps/build.jl, Julia simply exits without any error message. As a workaround we have to preload libmpi before building, with something like:

julia --project -e 'using Libdl; p=dlopen("libmpi", RTLD_LAZY; throw_error=false); p=dlopen("libmpi", RTLD_LAZY; throw_error=false); using Pkg; Pkg.add("MPI"); Pkg.build("MPI"; verbose=true)'

@christophernhill found https://www.open-mpi.org/faq/?category=troubleshooting#missing-symbols and IIRC we never got past

include(joinpath("..","src","implementations.jl"))
and
const MPI_LIBRARY_VERSION_STRING = Get_library_version()

Haven't had a chance to dig into this more.
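
For anyone trying to narrow this down, here is a minimal sketch of a check that can be run before the build: it tries to open the library and look up the symbol behind `Get_library_version()`. `can_resolve` is a hypothetical helper, not part of MPI.jl, and the library name `"libmpi"` is assumed to be resolvable through the loader path. It only tells you whether the symbol is visible from the Julia process; it does not reproduce the crash itself.

```julia
using Libdl

# Hypothetical diagnostic helper (not part of MPI.jl).
# Returns true only if `libname` can be opened and `sym` can be found in it.
function can_resolve(libname::AbstractString, sym::Symbol)
    # throw_error=false makes dlopen return `nothing` instead of throwing
    handle = Libdl.dlopen(libname, Libdl.RTLD_LAZY | Libdl.RTLD_GLOBAL; throw_error=false)
    handle === nothing && return false
    # dlsym also returns `nothing` on failure when throw_error=false
    return Libdl.dlsym(handle, sym; throw_error=false) !== nothing
end

# Assumption: the system MPI library is named "libmpi"
println(can_resolve("libmpi", :MPI_Get_library_version))
```

If this prints `false`, the library or the symbol is not reachable at all, which would be a different problem from the silent exit described above.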

@simonbyrne
Member

Ah, we do this in __init__():

MPI.jl/src/MPI.jl

Lines 62 to 67 in 9b69d79

@static if Sys.isunix()
    # need to open libmpi with RTLD_GLOBAL flag for Linux, before
    # any ccall. cannot use RTLD_DEEPBIND; this leads to segfaults
    # at least on Ubuntu 15.10
    Libdl.dlopen(libmpi, Libdl.RTLD_LAZY | Libdl.RTLD_GLOBAL)
end

but not in deps/build.jl.
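
As a rough illustration of what the same preload could look like in a standalone script run before any `ccall` into MPI, here is a minimal sketch. `preload_global` is a hypothetical name, not something MPI.jl provides, and `"libmpi"` is assumed to be the library name on the system in question. Unlike the snippet from `__init__()`, it passes `throw_error=false` so that a missing library produces a warning rather than an exception.

```julia
using Libdl

# Hypothetical helper (not part of MPI.jl): load a shared library with
# global symbol visibility and report the outcome instead of throwing.
function preload_global(libname::AbstractString)
    flags = Libdl.RTLD_LAZY | Libdl.RTLD_GLOBAL
    handle = Libdl.dlopen(libname, flags; throw_error=false)
    if handle === nothing
        @warn "could not load library" libname
        return nothing
    end
    # dlpath reports which file the loader actually picked up
    @info "preloaded library" path = Libdl.dlpath(handle)
    return handle
end

# Assumption: the system MPI library is named "libmpi"
preload_global("libmpi")
```

The logged path is useful on clusters where several MPI installations are visible, since it shows which one was actually loaded.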

Previous issues/changes:

@simonbyrne
Member

Actually we now do this:
74f31f2

Are you using the latest version of MPI.jl?

@luraess
Contributor

luraess commented Nov 24, 2021

The issue occurred with v0.18.2. However, with a fresh install today it no longer occurs, either with v0.18.2 or with v0.19.1. I don't yet know what changed to make the bug go away.

@simonbyrne
Member

Okay, will close for now, but please reopen if you see it again.
