hwloc symbol conflicts #6768

Open
markalle opened this issue Jun 19, 2019 · 9 comments
@markalle
Contributor

I want to open a topic for discussion about how we "should" solve this problem, and hash out some agreement on our approach before making a PR. There may also need to be some cross-product coordination on the solution.


summary of problem:

The issue is multiple libraries linking against different versions of libhwloc.so, so that conflicting hwloc symbols from the different versions end up in the same namespace.

The first place we hit this bug was using Mellanox hcoll, although in principle the same conflict ought to happen if an end user has their app linked against a libhwloc. Anyway, right now Mellanox ships a libhcoll.so that contains some form of statically built hwloc, so nm libhcoll.so includes

00000000000c9620 T hwloc_bitmap_alloc
00000000000c96d0 T hwloc_bitmap_alloc_full
...etc

and I think it's currently built against one of the hwloc 1.x versions.

If we build OMPI using --with-hwloc/--with-hwloc-libdir and a libhwloc.so from hwloc 2.x, all the symbols pile into the same namespace and step on each other.

Currently a build of OMPI using --with-hwloc will have libhwloc.so as a direct dependency all over the place (in libmpi.so etc.), so the hwloc symbols aren't exclusively loaded via dlopen()/dlsym(). Even if they were, I think the current MCA loads still use RTLD_GLOBAL, so I don't think we're currently very close to being able to expect OMPI's libhwloc.so to be dlopen()ed into a private namespace.


how to reproduce:

  • build hwloc2 (configure --prefix=; make; make install)
  • build OMPI master using --with-hwloc= --with-hwloc-libdir= pointing
    at the above hwloc2, and --with-hcoll= pointing at Mellanox's hcoll
    library (I'm using MLNX_OFED_LINUX-4.3-1.0.1.0 at the moment, but I
    suspect other versions could reproduce as well)
  • run over 2 hosts with enough -np to touch two+ sockets (a concrete
    sketch follows below)
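
Roughly, the steps look like this (the install paths and host names are
placeholders for illustration, not the exact values I used):

    # build an external hwloc 2.x
    cd hwloc-2.x && ./configure --prefix=/opt/hwloc2 && make && make install

    # build OMPI master against that hwloc, plus Mellanox's hcoll
    cd ompi && ./configure --with-hwloc=/opt/hwloc2 \
        --with-hwloc-libdir=/opt/hwloc2/lib \
        --with-hcoll=/opt/mellanox/hcoll && make && make install

    # run across 2 hosts with enough ranks to span two+ sockets
    mpirun -np 16 --host host1:8,host2:8 ./my_app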

For me a bunch of ranks will print

[sbgp_basesmsocket_component.c:392:hmca_map_to_logical_socket_id_hwloc] BASESMSOCKET SBGP: BASESMSOCKET: HWLOC failed to initialize for some reason


almost solution:

HWLOC has an option

--with-hwloc-symbol-prefix="ompi_"

which is a good start toward avoiding conflicts.
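
For example (a sketch: the install prefix is a placeholder, and the nm
output is illustrative rather than copied from a real build):

    # build hwloc with name-shifted symbols
    ./configure --prefix=/opt/hwloc2-ompi --with-hwloc-symbol-prefix="ompi_"
    make && make install

    # the exported symbols now carry the prefix
    $ nm /opt/hwloc2-ompi/lib/libhwloc.so | grep bitmap_alloc
    00000000000c9620 T ompi_hwloc_bitmap_alloc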

And OMPI has a corresponding option

with_hwloc_symbol_prefix="ompi_"

I think these get us 90% of the way to a solution, but I'm still concerned about the library name, e.g. "libhwloc.so.15". If OMPI built and used a libhwloc.so.15 whose symbol prefix was switched to ompi_hwloc_*, there's nothing keeping the user from also having their own vanilla "libhwloc.so.15", built with regular hwloc_* symbols, as part of their app.
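
Concretely, the concern is that both builds would carry the same soname, and
the dynamic loader keys libraries by soname (the readelf output below is
illustrative):

    $ readelf -d /opt/hwloc2-ompi/lib/libhwloc.so | grep SONAME
     0x000000000000000e (SONAME)  Library soname: [libhwloc.so.15]
    $ readelf -d /usr/lib64/libhwloc.so | grep SONAME
     0x000000000000000e (SONAME)  Library soname: [libhwloc.so.15]

so whichever "libhwloc.so.15" gets loaded first wins, and the other copy is
never loaded at all.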

So I think we also need to rename the libhwloc used by OMPI.


proposal:

Add another option to the HWLOC build system to rename its library, eg

  --with-hwloc-library-name="hwloc_ompi"

And add a corresponding option to OMPI to use it, eg

with_hwloc_library_name="hwloc_ompi"

which would cause the various link lines that currently use -lhwloc to use -lhwloc_ompi instead.
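
To be clear, --with-hwloc-library-name does not exist in either build system
today; hypothetically, usage might look like this (the flag and the paths
below are proposed/placeholder, not real):

    # hypothetical hwloc build: renamed library + name-shifted symbols
    ./configure --prefix=/opt/hwloc2-ompi \
        --with-hwloc-symbol-prefix="ompi_" \
        --with-hwloc-library-name="hwloc_ompi"    # proposed, not yet real
    # would install libhwloc_ompi.so instead of libhwloc.so

    # hypothetical OMPI build consuming it
    ./configure --with-hwloc=/opt/hwloc2-ompi \
        --with-hwloc-libdir=/opt/hwloc2-ompi/lib \
        with_hwloc_library_name="hwloc_ompi"      # proposed, not yet real
    # link lines would then use -lhwloc_ompi instead of -lhwloc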

And I think Mellanox ought to use the existing

    --with-hwloc-symbol-prefix="mlnx_"

or something like that; otherwise the libhwloc they build into libhcoll.so has the potential to conflict with a user's app even if we make the above changes to OMPI.

Before I make an OMPI PR and an hwloc PR, I want to see if people agree this is the right approach. In our Spectrum MPI build of OMPI we already started doing this, and it's working there.

I could imagine an alternate approach where nothing in OMPI is linked against libhwloc.so directly; instead we'd rely on dlopen()/dlsym() from an MCA component, opened with RTLD_LOCAL, so we'd have considerable segregation, and then maybe everybody could actually use "hwloc_*" symbols without conflict. But I lean toward changing the symbol prefix and the library name instead, as the simpler solution.

@jsquyres
Member

It sounds like the Mellanox hcoll component is not doing the right thing -- they should be using --with-hwloc-symbol-prefix, especially if they are shipping a binary. They should probably fix that.

Outside of that, though, I'm confused as to what the actual problem is.

  1. If Open MPI compiles with its internal hwloc, we already name-shift the symbols (note: to OPAL, not OMPI), and there's no explicit libhwloc dependency on any of Open MPI's libraries because we slurp / ingest the hwloc library into libopen_pal.
  2. If you compile Open MPI against an external hwloc, then -- by design -- we're using the symbols from that external hwloc. It's not meant to be a custom hwloc for Open MPI -- it's usually a system-level hwloc (e.g., installed by a Linux distro). Open MPI uses libhwloc.so just like any other package on the system that uses libhwloc.so. (See the sketch below.)
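
A quick way to see the difference between these two cases (a sketch; the
paths, addresses, and the exact shifted symbol prefix are illustrative, not
taken from a real build):

    # case 1: internal hwloc, slurped into libopen_pal with shifted names
    $ nm libopen_pal.so | grep bitmap_alloc
    00000000001a2b30 T opal_hwloc_bitmap_alloc

    # case 2: external hwloc, an ordinary shared-library dependency
    $ ldd libmpi.so | grep hwloc
    libhwloc.so.15 => /usr/lib64/libhwloc.so.15 (0x...)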

I'm not quite clear on what the use case is for "I do not want to use the internal hwloc, but I'm ok using an external hwloc that is specific for Open MPI." Can you explain further?

@jsquyres
Member

I think @markalle is out for a little while, so I talked to @jjhursey about this.

He thinks that SMPI is in a case 3:

  • SMPI needs an external PMIX.
  • PMIX needs an external hwloc.
  • So SMPI builds an external hwloc, on the argument that they don't know
    whether the installation system will have hwloc installed, and even if
    it does, they don't know which version of hwloc will be there.
  • So PMIX uses this built-for-SMPI hwloc.
  • Open MPI can then also use this built-for-SMPI hwloc.
  • But if a user app uses hwloc and links in its own copy of hwloc, that's
    when you can have a problem.

Which is why @markalle is proposing an hwloc configure option to rename the hwloc library (which would also entail changing a bit inside PMIX's and likely OMPI's configury/build/mumble/mumble).

@jjhursey
Member

@gpaulsen Does that fit with your understanding of the rationale? That was what I remembered of the discussion.

@rhc54
Copy link
Contributor

rhc54 commented Jun 20, 2019

Ick - that strikes me as violating a bunch of build "rules". I would advise that we not support that model, as it is pretty much guaranteed to cause problems: you can't anticipate every corner case it might hit.

Why not just do the right thing? Require that they install hwloc - it's a simple dependency to add to the rpm or whatever packaging script you use. That solves the problem without creating potentially buggy changes.

@markalle
Contributor Author

markalle commented Jul 1, 2019

My concerns with the system-installed libhwloc.so.# are the restrictions it puts on apps, and the way it breaks in the current Mellanox environment.

Mellanox put conflicting hwloc_* symbols into their libhcoll.so, and if Mellanox made that mistake, I'm concerned that a user's app might do the same. We could call that an app bug, but even without the user explicitly building conflicting hwloc_* symbols into their app, they could use a libhwloc.so.X from hwloc 1.x while OMPI is using a libhwloc.so.Y from hwloc 2.x, and those symbols would conflict too. That's harder for me to call an app bug.

I don't mind the internal opal_hwloc_* solution. I'm not actually sure whether our PMIX build could follow suit and have both OMPI and PMIX using their own internal opal_hwloc_* or similar. That might work, but I don't really think it's a better design than having a libhwloc_opal.so providing opal_hwloc_* symbols.

I think symbol prefixing like that is only legitimate in two cases:

  1. building statically, or
  2. using a renamed shared library like libhwloc_opal.so

and if we have multiple products (OMPI, PMIX) that want access to prefixed symbols, then the shared library starts sounding more appealing to me.

@gpaulsen
Member

gpaulsen commented Jul 2, 2019

We will discuss this on next week's (July 9th) web-ex to try to make progress more quickly.

@artpol84
Contributor

artpol84 commented Jul 9, 2019

@vspetrov FYI

@jladd-mlnx
Member

@markalle @artpol84 we addressed this in HCOLL.

@jsquyres
Member

jsquyres commented Jul 9, 2019

@artpol84 @vspetrov @jladd-mlnx Is this in an hcoll release? Should this be added to https://www.open-mpi.org/faq/?
