Incompatibility between hwloc (internal) and system hwloc on Ubuntu 20.04 (Arm) #8878
Comments
So to be clear: you are generating an XML topology file and then asking OMPI to ingest it instead of discovering the topology on its own? If so, there really isn't anything we can do about it - HWLOC has introduced a backward compatibility break in their v2.1 series. You just need to let OMPI do its own discovery, or build OMPI against the external HWLOC.
Sorry, I left out something quite important. The issue does not happen with mpirun, only with Slurm (srun --mpi=pmix). We discovered it on a ring test, but it also occurs with an MPI "hello world" (i.e. any code you run with Slurm). So, MPI is fed the XML file from Slurm.
@rhc54 This is the follow-up to the weird Slurm / Open MPI direct launch issue I had pinged you about by email. Slurm is built with HWLOC 2.1.0 (because that's what is preinstalled on the system) and Open MPI 4.1 was built with its internal hwloc, and the internal hwloc can't parse the XML passed down from Slurm. I don't think there's a direct ask here yet, other than "anyone have any ideas on what we should be doing?". There are some work items that we should do around error handling; the failure currently causes Open MPI to crash rather than abort cleanly.
Sigh - this makes twice that this has happened. I'll provide a separate comment below targeting a long-term solution. Meantime, there are only a couple of options:
Afraid I have no better solution to offer right now. See below for some longer-term thoughts.
@bgoglin @artpol84 This problem (backward compatibility break in HWLOC topology strings) has bitten us before - and is likely to continue doing so if we don't make some changes. Avoiding these problems really requires that both of you do something in your respective software. I have a couple of suggestions:

HWLOC really needs to provide some method in the XML string by which we can identify a break in compatibility. Compatibility breaks for a variety of reasons - for example, a change in the string format (as we saw when going from XML string format 1 to 2) or introduction of a new "component" (in this case, the "die" declaration) that the earlier library cannot understand. It's nearly impossible to guarantee that the older library will gracefully fail under all possible changes, so what we really need is something in the XML string that tells us a version. For example, you could include a field that tells us the minimum HWLOC version that can parse this string. This would allow us to check it against our version and reject it. Or you could tag the string with a format version (e.g., "hwloc-xml-3") that we could check. I'd prefer the first option, but could make either one work.

Likewise, the Slurm plugin's support of HWLOC needs updating. Our solution for the prior break was to create two PMIx keys, one for the first HWLOC format and the other for the second. This was required because the strings need to be parsed differently, with different flags set so we get all the required info out of them. The Slurm plugin, however, continues to pass down the "old" undifferentiated key regardless of the HWLOC version that generated the string. Fortunately, HWLOC doesn't just barf when it gets the other string - but we don't get the info we want if the flags aren't correctly set, and we cannot set those flags because we don't know which HWLOC string version we are getting. What we need is for the Slurm plugin to:
We're executing option (1), because it's the right packaging thing to do anyway (in this case, we're shipping RPMs or DPKG files, and have no excuse for using internal packages). And I don't want to have the fight over (2) with our customers :). But we wanted to get this issue filed because we wanted to raise general awareness with the community.
I hear you about the customer fight 😄 I checked the HWLOC versions and it appears that we can still use the same flags and API for parsing the XML string - we just need to know that it is coming from a version that is potentially incompatible (e.g., it was generated by v2.1.0 and we have v2.0.1 or v1.11.x). Unfortunately, I'm not sure we have a way to detect this right now, minus the changes I proposed above. I do see one thing we can do - I'll post a PR that would at least let us continue without erroring out.
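To make the proposed "minimum HWLOC version" check concrete, here is a minimal Python sketch of the comparison a receiver could perform. It is only an illustration of the idea discussed in this thread: the attribute name `min_hwloc_version` is hypothetical (hwloc does not emit such a field as far as this thread shows), and the helper names `parse_version` and `can_parse` are made up for the example. It also assumes purely numeric dotted versions.

```python
import xml.etree.ElementTree as ET

# HYPOTHETICAL attribute name: this is the field proposed in the thread,
# not something current hwloc XML is known to contain.
MIN_VERSION_ATTR = "min_hwloc_version"


def parse_version(text):
    """Turn '2.1.0' into (2, 1, 0); pad short versions like '2.1' with zeros."""
    parts = [int(part) for part in text.strip().split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def can_parse(xml_string, local_version):
    """Compare the local library version with the version the string requires.

    Returns True or False when the hypothetical attribute is present, and
    None when it is absent, so the caller can decide how cautious to be.
    """
    root = ET.fromstring(xml_string)
    required = root.get(MIN_VERSION_ATTR)
    if required is None:
        return None
    return parse_version(local_version) >= parse_version(required)


if __name__ == "__main__":
    sample = '<topology min_hwloc_version="2.1.0"><object type="Machine"/></topology>'
    print(can_parse(sample, "2.0.1"))
    print(can_parse(sample, "2.1.0"))
```

Returning `None` for an untagged string keeps the "unknown" case distinct from a definite mismatch, which matches the situation described above where the receiver currently has no way to tell which version produced the string.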
Background information
In some newer OS installations (e.g. Ubuntu 20.04), the system hwloc appears to be incompatible with the internal hwloc of Open MPI 4.1.1 (2.0.1). The particular error we are seeing is related to the topology XML generated by hwloc-info. When using hwloc-info 2.1.0 on an Amazon m6g.xlarge instance (the same appears to be true of other Arm instances), part of the generated XML includes the "Die" specifier. This string cannot be ingested by the hwloc inside Open MPI 4.1.1. [See: https://github.com/open-mpi/hwloc/commit/3e8a1c8b19173f1aa0a584ce738a916c35f0e99c].
What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)
4.1.1
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
git clone -b v4.1.x
If you are building/installing from a git clone, please copy-n-paste the output from git submodule status.
Please describe the system on which you are running
Details of the problem
Please describe, in detail, the problem that you are having, including the behavior you expect to see, the actual behavior that you are seeing, steps to reproduce the problem, etc. It is most helpful if you can attach a small program that a developer can use to reproduce your problem.
The problem is in parsing the XML topology file that comes out of hwloc-info. Below I include the results on an older version of Ubuntu (18.04) and a newer version (20.04) on an m6g.xlarge (it's not specific to this instance type, but to Arm and the "Die" specifier).
Ubuntu 18.04/Arm [This Works == NOT an issue]
ubuntu@ip-172-31-10-128:/usr/bin$ ./hwloc-info
depth 0: 1 Machine (type #1)
depth 1: 1 Package (type #3)
depth 2: 1 L3Cache (type #4)
depth 3: 4 L2Cache (type #4)
depth 4: 4 L1dCache (type #4)
depth 5: 4 L1iCache (type #4)
depth 6: 4 Core (type #5)
depth 7: 4 PU (type #6)
Special depth -3: 1 Bridge (type #9)
Special depth -4: 3 PCI Device (type #10)
Special depth -5: 1 OS Device (type #11)
ubuntu@ip-172-31-10-128:/usr/bin$ ./hwloc-info --version
hwloc-info 1.11.9
Ubuntu 20.04/Arm [Breaks == Is an issue]
ubuntu@ip-172-31-8-157:/usr/bin$ ./hwloc-info
depth 0: 1 Machine (type #0)
depth 1: 1 Package (type #1)
depth 2: 1 L3Cache (type #6)
depth 3: 4 Die (type #19) <=================
depth 4: 4 L2Cache (type #5)
depth 5: 4 L1dCache (type #4)
depth 6: 4 L1iCache (type #9)
depth 7: 4 Core (type #2)
depth 8: 4 PU (type #3)
Special depth -3: 1 NUMANode (type #13)
Special depth -4: 1 Bridge (type #14)
Special depth -5: 3 PCIDev (type #15)
Special depth -6: 3 OSDev (type #16)
ubuntu@ip-172-31-8-157:/usr/bin$ ./hwloc-info --version
hwloc-info 2.1.0
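As a rough way to spot this kind of mismatch before handing a topology string to an older reader, the following Python sketch lists object types that the reader does not recognize. It assumes the exported XML represents each object as an `<object type="...">` element; the `KNOWN_TYPES` set is only an illustrative list taken from the type names in the output above (minus "Die"), not the real list from any hwloc release, and `unknown_object_types` is a made-up helper name.

```python
import xml.etree.ElementTree as ET

# ILLUSTRATIVE set only: type names copied from the hwloc-info output above,
# with "Die" left out to stand in for an older reader.
KNOWN_TYPES = {
    "Machine", "Package", "L3Cache", "L2Cache", "L1dCache", "L1iCache",
    "Core", "PU", "NUMANode", "Bridge", "PCIDev", "OSDev",
}


def unknown_object_types(xml_string, known_types=KNOWN_TYPES):
    """Return the sorted object types in the XML that are not in known_types."""
    root = ET.fromstring(xml_string)
    # root.iter("object") walks nested <object> elements at every depth.
    found = {obj.get("type") for obj in root.iter("object")}
    found.discard(None)
    return sorted(found - set(known_types))


if __name__ == "__main__":
    sample = (
        '<topology><object type="Machine"><object type="Package">'
        '<object type="Die"><object type="Core"><object type="PU"/>'
        "</object></object></object></object></topology>"
    )
    print(unknown_object_types(sample))
```

A non-empty result would indicate that the string contains levels the older reader cannot interpret, which is the situation reported here for the "Die" level on Ubuntu 20.04.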