Address Github issue #11532 by translating legacy parameters for direct launches #11854
Requires PRRTE PR: openpmix/prrte#1777 and OpenPMIX PR: openpmix/openpmix#3133
bot:ibm:retest
Hmmmm....why would an MPI process care about PRRTE params? It has no involvement with PRRTE, so the params will just be ignored. You certainly would want the PMIx values as the MPI proc uses that library - but I cannot see any reason to handle PRRTE params.
OK - I had a feeling that might be the case. I'll update this PR to just look for PMIX params and remove the PRRTE part. I'll also update the PR for PRRTE to just have the refactor of the schizo ompi file and remove the other work that creates the PRRTE framework header.
b587628 to f552b2c
Revised to just check PMIX params, and to use the existing PMIX framework header.
bot:nvidia:retest
CI cannot find the file:
I saw in openpmix/openpmix#3133 we expected it to be in ${PREFIX}/include/pmix/src/include. I'm not sure what the disconnect is here. Do we search PREFIX?
🤔 It's installed when I compile against the latest OpenPMIX. I'll try against the internal PMIX submodule and see what I find. We might need to bump the submodule pointer?
It looks like pmix_frameworks.h should be generated by autogen in openpmix. That link should correspond to the pinned version on ompi main; in fact, we are currently on the commit that introduced that change (openpmix/openpmix@22fe51c).
You are probably hitting an issue between when you build OMPI against an internal vs external version of PMIx. The external reference should be just
Yeah, I confirmed that the problem is in the #include "src/include/pmix_frameworks.h". Your configure code already adds the necessary
…for direct launches
Borrow code from the OMPI schizo module in PRRTE that translates legacy MCA parameters when an application is direct launched (PRRTE will translate legacy parameters when natively launched).
Signed-off-by: Quincey Koziol <[email protected]>
@rhc54 could you review this?
@qkoziol can you open up the v5.0 cherry-pick?
PR open-mpi#11854 introduced a bug that causes singleton runs to segfault at startup in some cases. This bug was rooted out by the GitHub Action in PR open-mpi#12217. Signed-off-by: Howard Pritchard <[email protected]>
PR open-mpi#11854 introduced a bug that causes singleton runs to segfault at startup in some cases. This bug was rooted out by the GitHub Action in PR open-mpi#12217. Signed-off-by: Howard Pritchard <[email protected]> (cherry picked from commit 86a05c1)
Borrow code from the OMPI schizo module in PRRTE that translates legacy MCA parameters when an application is direct launched (PRRTE will translate legacy parameters when natively launched).
Addresses issue #11532
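To make the idea of translating legacy parameters concrete, here is a minimal sketch in C of one way such a translation could work. It is not the code from this PR or from the PRRTE schizo module. It assumes that a legacy parameter arrives as an environment-variable name with an OMPI_MCA_ prefix and that it should be renamed to a PMIX_MCA_ prefix only when its framework belongs to PMIx. The framework list, the function names (param_in_framework, translate_legacy_name), and the example parameter names are all hypothetical; a real implementation would take the framework list from PMIx's generated framework header discussed above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical subset of PMIx framework names, for illustration only.
 * A real implementation would use the list generated by PMIx. */
static const char *const example_pmix_frameworks[] = {
    "ptl", "gds", "psec", "bfrops", NULL
};

#define LEGACY_PREFIX "OMPI_MCA_"  /* assumed legacy prefix */
#define PMIX_PREFIX   "PMIX_MCA_"  /* assumed target prefix */

/* Return 1 if `param` (the text after the prefix) names `framework`
 * exactly or starts with "<framework>_"; return 0 otherwise. */
static int param_in_framework(const char *param, const char *framework)
{
    size_t n = strlen(framework);
    return strncmp(param, framework, n) == 0 &&
           (param[n] == '\0' || param[n] == '_');
}

/* Translate a legacy variable name into its PMIx equivalent.
 * Returns  1 if `out` now holds the translated name,
 *          0 if the name is not a legacy PMIx-framework parameter,
 *         -1 if `out` is too small to hold the result. */
static int translate_legacy_name(const char *name, char *out, size_t outlen)
{
    size_t plen = strlen(LEGACY_PREFIX);

    if (strncmp(name, LEGACY_PREFIX, plen) != 0) {
        return 0;  /* not a legacy parameter at all */
    }

    const char *param = name + plen;
    for (size_t i = 0; example_pmix_frameworks[i] != NULL; ++i) {
        if (param_in_framework(param, example_pmix_frameworks[i])) {
            int n = snprintf(out, outlen, "%s%s", PMIX_PREFIX, param);
            return (n >= 0 && (size_t) n < outlen) ? 1 : -1;
        }
    }
    return 0;  /* belongs to some other project's framework; leave it alone */
}
```

Under these assumptions, a name such as OMPI_MCA_ptl_tcp_if_include would be rewritten to PMIX_MCA_ptl_tcp_if_include, while OMPI_MCA_btl_tcp_if_include would be left untouched because btl is not in the example list. Matching on the framework name followed by an underscore avoids accidentally translating a parameter whose name merely starts with the same letters.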