MPI-4: Broken communicator out of MPI_Comm_create_from_group() #10449

Closed
dalcinl opened this issue Jun 4, 2022 · 6 comments

@dalcinl
Contributor

dalcinl commented Jun 4, 2022

I'm using the current git main branch, configured with debug arguments and --with-<dep>=internal for every <dep>.

$ git submodule status
 26ff1684378e1c5859e377249a3e78f287b04216 3rd-party/openpmix (v1.1.3-3523-g26ff1684)
 47b5ad1653123e984a644ef84b9a7aaf4e9b5fb3 3rd-party/prrte (psrvr-v2.0.0rc1-4355-g47b5ad1653)

Looks like something weird is going on with MPI_Comm_create_from_group().

This is a trivial reproducer using mpi4py. Clarification: The C-level call to MPI_Comm_create_from_group() uses arguments stringtag="org.mpi4py", info=MPI_INFO_NULL, and errhandler=MPI_ERRORS_RETURN.

from mpi4py import MPI
group = MPI.COMM_SELF.Get_group()
comm = MPI.Intracomm.Create_from_group(group)
cval = MPI.Comm.Compare(MPI.COMM_SELF, comm)
assert cval == MPI.CONGRUENT

However, the final assert line is never reached; instead I get:

$ python /tmp/test.py 
Traceback (most recent call last):
  File "/tmp/test.py", line 4, in <module>
    cval = MPI.Comm.Compare(MPI.COMM_SELF, comm)
  File "mpi4py/MPI/Comm.pyx", line 111, in mpi4py.MPI.Comm.Compare
mpi4py.MPI.Exception: MPI_ERR_ARG: invalid argument of some other kind

MPI_Comm_compare() fails on the output communicator from MPI_Comm_create_from_group().
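
For anyone triaging this, the reproducer above can be wrapped so that it reports which step misbehaves instead of stopping at the first traceback. The following is only a sketch: the function name `check_self_roundtrip` and its `mpi` parameter are hypothetical helpers that do not exist in mpi4py, and the `mpi` hook exists solely so the logic can be exercised without an MPI runtime. It assumes an mpi4py build that exposes `MPI.Intracomm.Create_from_group`, as used in the report.

```python
def check_self_roundtrip(mpi=None):
    """Recreate COMM_SELF from its group and compare the two communicators.

    Returns (ok, message). `mpi` defaults to the real mpi4py.MPI module; the
    parameter is a hypothetical hook for running this helper without MPI.
    """
    if mpi is None:
        from mpi4py import MPI as mpi  # needs mpi4py built against an MPI-4 library

    group = mpi.COMM_SELF.Get_group()
    # Defaults are used on purpose: per the report, they map to
    # stringtag="org.mpi4py", info=MPI_INFO_NULL, errhandler=MPI_ERRORS_RETURN.
    comm = mpi.Intracomm.Create_from_group(group)
    group.Free()

    if comm == mpi.COMM_NULL:
        return False, "Create_from_group returned COMM_NULL"
    try:
        cval = mpi.Comm.Compare(mpi.COMM_SELF, comm)
    except mpi.Exception as exc:
        # The failure mode reported here (MPI_ERR_ARG from MPI_Comm_compare).
        return False, f"Compare raised: {exc}"
    comm.Free()
    if cval != mpi.CONGRUENT:
        return False, f"unexpected comparison result: {cval}"
    return True, "new communicator is congruent with COMM_SELF"
```

Calling `print(check_self_roundtrip())` in a single process should distinguish a null communicator, a failing comparison, and a non-congruent result. The broken communicator is deliberately not freed on the failure path, since freeing it could raise as well.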

jsquyres added this to the v5.0.0 milestone Jun 6, 2022
@jsquyres
Member

jsquyres commented Jun 6, 2022

@hppritcha is likely the one we need to look at this, but he's out this week.

@hppritcha
Member

Taking a look at this...

hppritcha added a commit to hppritcha/ompi that referenced this issue Jun 21, 2022
comms.

related to open-mpi#10449
related to open-mpi#9097

Signed-off-by: Howard Pritchard <[email protected]>
hppritcha added a commit to hppritcha/ompi that referenced this issue Jun 22, 2022
comms.

related to open-mpi#10449
related to open-mpi#9097

Signed-off-by: Howard Pritchard <[email protected]>
(cherry picked from commit bfffe99)
@hppritcha
Member

Closed via #10495 and #10502.

@dalcinl
Contributor Author

dalcinl commented Jun 28, 2022

@hppritcha Looks like you fixed the issue for INTRAcommunicators, but INTERcommunicators are still broken.

Reproducer:

git clone --branch update/MPI4-openmpi https://github.com/mpi4py/mpi4py.git
cd mpi4py
make # should build in place, assumes `mpicc` can be found via $PATH
mpiexec -n 2 python test/main.py -i test_comm_inter -k testCreateFromGroups

Output:

...
======================================================================
ERROR: testCreateFromGroups (test_comm_inter.TestIntercomm)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/dalcinl/Devel/mpi4py/test/test_comm_inter.py", line 76, in testCreateFromGroups
    ccmp = MPI.Comm.Compare(self.INTERCOMM, intercomm)
  File "mpi4py/MPI/Comm.pyx", line 111, in mpi4py.MPI.Comm.Compare
mpi4py.MPI.Exception: MPI_ERR_ARG: invalid argument of some other kind
...
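
For readers who do not want to clone the branch, the intercommunicator case can be approximated in a standalone script. This is a sketch and not the actual contents of test_comm_inter.py: the function names `build_pair_intercomm` and `recreate_and_compare` and the `mpi` parameter are hypothetical, and the way the intercommunicator is built (splitting COMM_WORLD into even and odd ranks) is an assumption about a reasonable setup. It relies on `MPI.Intercomm.Create_from_groups` taking the local group, local leader, remote group, and remote leader in that order, and needs at least two processes.

```python
def build_pair_intercomm(mpi=None):
    """Split COMM_WORLD into even and odd ranks and connect the two halves."""
    if mpi is None:
        from mpi4py import MPI as mpi  # needs mpi4py built against an MPI-4 library
    world = mpi.COMM_WORLD
    color = world.Get_rank() % 2
    local = world.Split(color, key=world.Get_rank())
    # World rank 0 leads the even half and world rank 1 leads the odd half.
    remote_leader = 1 if color == 0 else 0
    intercomm = local.Create_intercomm(0, world, remote_leader)
    local.Free()
    return intercomm


def recreate_and_compare(intercomm, mpi=None):
    """Rebuild `intercomm` from its two groups and compare; returns (ok, message)."""
    if mpi is None:
        from mpi4py import MPI as mpi
    lgroup = intercomm.Get_group()
    rgroup = intercomm.Get_remote_group()
    try:
        # Rank 0 of each group acts as leader (an assumption of this sketch).
        new = mpi.Intercomm.Create_from_groups(lgroup, 0, rgroup, 0)
    finally:
        lgroup.Free()
        rgroup.Free()
    try:
        cval = mpi.Comm.Compare(intercomm, new)
    except mpi.Exception as exc:
        # The failure mode shown in the output above.
        return False, f"Compare raised: {exc}"
    new.Free()
    if cval != mpi.CONGRUENT:
        return False, f"unexpected comparison result: {cval}"
    return True, "recreated intercommunicator is congruent"
```

To try it, append `print(recreate_and_compare(build_pair_intercomm()))` to the script and launch it with `mpiexec -n 2 python <script>`.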

@hppritcha
Member

There are some known problems with the intercomm_create_from_groups function that need to be fixed. I'll open a separate issue to track that.

@dalcinl
Contributor Author

dalcinl commented Jun 28, 2022

> I'll open a separate issue to track that.

Please mention me in the new issue, so I can keep track of it.
