Change defaults to prefer external libevent/hwloc #5395

Merged
merged 3 commits into open-mpi:master from the pr/prefer-externals branch
Jul 23, 2018

jsquyres
Member

@jsquyres jsquyres commented Jul 9, 2018

This is an internal commit I made on the way towards #5031. It is very much a WIP kind of thing; I honestly don't even remember what works / what doesn't work / what still needs to be done.

I'm pushing it to github in case someone else has time to advance this work.

@ibm-ompi

ibm-ompi commented Jul 9, 2018

The IBM CI (GNU Compiler) build failed! Please review the log, linked below.

Gist: https://gist.github.com/934424b4952d6c1d149e2c11c26b9df9

@ibm-ompi

ibm-ompi commented Jul 9, 2018

The IBM CI (XL Compiler) build failed! Please review the log, linked below.

Gist: https://gist.github.com/952001875ca2f1598321dd7b41edcb0d

@ibm-ompi

ibm-ompi commented Jul 9, 2018

The IBM CI (PGI Compiler) build failed! Please review the log, linked below.

Gist: https://gist.github.com/fe5652436261f3f0374ac153fc7a82a4

@ggouaillardet
Contributor

@jsquyres I fixed this PR so that it does not abort if event/external cannot be built and was not explicitly requested.

@ggouaillardet ggouaillardet force-pushed the pr/prefer-externals branch 2 times, most recently from 7d0aaea to 27e112a Compare July 11, 2018 02:27
@ggouaillardet
Contributor

@jsquyres as far as I am concerned, these commits can be squashed and the PR can be merged.

@ggouaillardet
Contributor

@jsquyres can you please review this PR?

If you cannot do that before the merge, may I suggest you squash and merge these commits, and we will fix/clean this up later if needed?

@jsquyres
Member Author

Yep -- am doing so now. Sorry for the giant delay. We'll get this in before the v4.0.x branch.

@jsquyres jsquyres force-pushed the pr/prefer-externals branch from 27e112a to bcebb07 Compare July 18, 2018 01:16
@jsquyres jsquyres changed the title WIP: Change defaults to prefer external libevent/hwloc Change defaults to prefer external libevent/hwloc Jul 18, 2018
@jsquyres
Member Author

@ggouaillardet I squashed, rebased, and pushed. Passes tests for me, and I'm heading offline. If it passes tests for you and CI, can you merge?

Thanks for the major assist!

@jsquyres
Member Author

jsquyres commented Jul 18, 2018

Hmm. Actually, this m4 doesn't implement all the logic described in #5031.

We argued over that logic long and hard and finally settled on the outline given in #5031. I'm not sure we want to merge this yet -- yes, it does the first step of preferring external, but it doesn't do any of the version checking, the fallback behavior, etc.

@ggouaillardet
Contributor

@jsquyres I added some extra tests so that, by default, we do not use the external component if its version is lower than that of the internal one.

@ggouaillardet
Contributor

:bot:mellanox:retest

@jsquyres
Member Author

bot:mellanox:retest

Member Author

@jsquyres jsquyres left a comment

I think this still needs some work.

[AC_MSG_RESULT([no])
AC_MSG_WARN([external hwloc version is less than internal version (2.0)])
AC_MSG_WARN([using internal hwloc])
opal_hwloc_external_support=no])])
Member Author

I'm not clear on why all this code moved up to the previous block. The code was previously split into two distinct parts:

  1. Do we have external hwloc at all.
  2. If we have it, does that external hwloc have all the things we need?

This is not a deal breaker -- it's just odd that you moved some of the "does external hwloc have the things we need?" checks (i.e., the version check) but left others in the lower block (i.e., the XML check).

[AC_LANG_PROGRAM([[#include <hwloc.h>]],
[[
#if HWLOC_API_VERSION < 0x00020000
#error "hwloc API version is less than 0x00020000"
Member Author

Is there a way to have this version by reference instead of by value? I.e., I imagine that this will be one more place to forget to update if/when we update the embedded version of hwloc.

Contributor

I do not know ... strictly speaking, this is not even the library version (e.g. 2.0.1) but the API version (e.g. 2.0.0).

@bgoglin is there any way to test the library version instead of the API version at compile time?
For example, libevent defines _EVENT_NUMERIC_VERSION.

Contributor

@ggouaillardet Don't you also need to test that $with_hwloc wasn't given without any path (and so has a yes value)? I thought that test -z just means it wasn't given at all.

Contributor

@rhc54 you are right, and I just pushed a commit that fixes that.
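
A minimal sketch of the distinction raised in this thread, written in plain shell since that is what the Autoconf macros expand to. The helper name `classify_with_arg` and its return labels are hypothetical and do not come from Open MPI's configure.m4; the sketch only assumes the standard Autoconf behavior that `--with-hwloc` without a value sets `$with_hwloc` to `yes` and that omitting the option leaves it empty.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI): classify the value that
# --with-hwloc[=ARG] leaves in $with_hwloc.
classify_with_arg() {
    case "$1" in
        "")  echo "unset" ;;     # option not given at all (test -z is true)
        yes) echo "bare" ;;      # --with-hwloc given without a value
        no)  echo "disabled" ;;  # --without-hwloc
        *)   echo "value" ;;     # a directory or a keyword such as "external"
    esac
}

# Only the "value" case carries something that could be used as a path.
for arg in "" yes no /testbin/hwloc; do
    printf '%-18s -> %s\n' "[$arg]" "$(classify_with_arg "$arg")"
done
```

Checking only `test -z` would treat `yes` as if it were a path, which is why the `yes` case is handled separately here.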

[AC_MSG_RESULT([yes])],
[AC_MSG_RESULT([no])
AC_MSG_ERROR([Cannot continue])])

Member Author

Per #5031, if external support is not available, we need to emit a message at the bottom of this configure.m4:

  1. If the user explicitly asked for external support, print why the external hwloc is not suitable and then abort.
  2. If the user did not explicitly ask for external support, print why the external hwloc is not suitable and then allow processing to continue (which will end up selecting the internal hwloc).
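
The two rules above can be sketched in shell roughly as follows. This is an illustration only, not the code that was merged: the function name, its arguments, and the message wording are all assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical sketch of the fallback rules; names are illustrative only.
#   $1: "yes" if the user explicitly asked for the external library, else "no"
#   $2: human-readable reason why the external library is unsuitable
# Prints the resulting action on stdout: "abort" or "internal".
handle_unsuitable_external() {
    explicit=$1
    reason=$2
    # In both cases, report why the external library was rejected.
    echo "WARNING: external hwloc is not suitable: $reason" >&2
    if test "$explicit" = "yes"; then
        echo "abort"      # explicit request that cannot be honored
    else
        echo "internal"   # not requested: continue with the internal copy
    fi
}

action=$(handle_unsuitable_external no "API version older than the internal copy")
echo "selected: $action"
```

In a real configure.m4 the "abort" branch would correspond to `AC_MSG_ERROR` and the "internal" branch to letting configure carry on after an `AC_MSG_WARN`.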

Contributor

This is fixed in the latest commit

AC_MSG_ERROR([Cannot continue])])])])],
[AC_MSG_RESULT([(default search paths)])])
AS_IF([test ! -z "$with_libevent_libdir" && test "$with_libevent_libdir" != "yes"],
[opal_event_libdir="$with_libevent_libdir"])
Member Author

Shouldn't all of the above be similar to hwloc? I.e., shouldn't the external/configure.m4 between hwloc and libevent be pretty similar in structure? The exact tests will be different, of course (for various libraries, functions, header files, versions, ...etc.), but the structure should be similar -- especially when checking for lib vs. lib64, external-vs-non-external, ...etc.

Contributor

At first glance, my answer is yes.

But this is a revamp that is not in the scope of this PR.
Should hwloc or libevent be the reference?

[[
#if _EVENT_NUMERIC_VERSION < 0x02001500
#error "libevent API version is less than 0x02001500"
#endif
Member Author

This needs another version check to ensure it is >= the internal version of libevent (just like hwloc).

Contributor

The purpose of this test is to ensure exactly that.
(and unlike in hwloc we do not explicitly require a minimal external version for libevent)
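
For readers unfamiliar with the constant in the hunk above, here is a small sketch that decodes it. It assumes libevent's documented numeric layout of one byte each for major, minor, and patch level (0xMMmmpp00); the function names are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Decode a libevent numeric version (assumed layout 0xMMmmpp00)
# into "major.minor.patch".
libevent_version_string() {
    v=$(( $1 ))
    echo "$(( (v >> 24) & 0xff )).$(( (v >> 16) & 0xff )).$(( (v >> 8) & 0xff ))"
}

# Hypothetical helper: succeeds when the external version ($1) is at least
# the required version ($2), mirroring the "#if ... < 0x02001500" test.
libevent_external_ok() {
    test $(( $1 )) -ge $(( $2 ))
}

libevent_version_string 0x02001500   # prints 2.0.21

if libevent_external_ok 0x02001500 0x02001500; then
    echo "external libevent is new enough"
fi
```

Because the threshold is the version of the internal copy, a single comparison serves as the "not older than internal" check.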

@jsquyres
Member Author

bot:mellanox:retest

@rhc54
Contributor

rhc54 commented Jul 20, 2018

Looks good to me - @jsquyres is on vacation. I'd go ahead and merge it - the revamp is definitely outside scope and IMO not worth the trouble.

@bgoglin
Contributor

bgoglin commented Jul 21, 2018

@ggouaillardet (replying to a comment that might have been lost between patch updates) There's hwloc_get_api_version() which returns the HWLOC_API_VERSION built inside the lib.

ggouaillardet and others added 3 commits July 23, 2018 09:20
Signed-off-by: Gilles Gouaillardet <[email protected]>
Signed-off-by: Jeff Squyres <[email protected]>
Switch from #-style to dnl-style.

Signed-off-by: Jeff Squyres <[email protected]>
@ggouaillardet
Contributor

@bgoglin I want to test the hwloc version at configure time, so the HWLOC_API_VERSION macro is a better fit than the hwloc_get_api_version() subroutine (which would require linking and running a test program, and that would not work when cross-compiling).

My main concern is that HWLOC_API_VERSION is an API version, not a library version.

For example, Open MPI master embeds hwloc v2.0.1, but HWLOC_API_VERSION=0x00020000.
With this PR, that means we would prefer an external hwloc 2.0.0 over the internal hwloc 2.0.1
(same API version, but some missing bug fixes), so I think we need another (set of) macros, such as
HWLOC_LIBRARY_VERSION or HWLOC_MAJOR_VERSION, HWLOC_MINOR_VERSION and HWLOC_RELEASE_VERSION.
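
The gap described above can be illustrated with a short shell sketch. It assumes hwloc's encoding of HWLOC_API_VERSION as (X<<16)+(Y<<8)+Z; the helper names and the two variables are hypothetical and are not part of hwloc or Open MPI.

```shell
#!/bin/sh
# Decode an hwloc API version, assumed to be (X<<16)+(Y<<8)+Z, into "X.Y.Z".
hwloc_api_version_string() {
    v=$(( $1 ))
    echo "$(( (v >> 16) & 0xff )).$(( (v >> 8) & 0xff )).$(( v & 0xff ))"
}

# Hypothetical helper mirroring "#if HWLOC_API_VERSION < 0x00020000":
# succeeds when the external API version ($1) is at least the required one ($2).
hwloc_api_at_least() {
    test $(( $1 )) -ge $(( $2 ))
}

internal_api=0x00020000   # reported by the embedded hwloc 2.0.1
external_api=0x00020000   # an external hwloc 2.0.0 reports the same value

if hwloc_api_at_least "$external_api" "$internal_api"; then
    echo "external accepted (API $(hwloc_api_version_string "$external_api"))"
fi
```

Both libraries report the same API version, so the comparison cannot tell the 2.0.0 release from 2.0.1; that is the case the proposed library-version macros would address.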

@ggouaillardet ggouaillardet merged commit 92d8941 into open-mpi:master Jul 23, 2018
@ashleypittman

ashleypittman commented Jul 23, 2018

We've just hit some build issues that I think are probably the result of this pull request. We want to test pmix, so we are using upstream pmix along with ompi, and therefore by necessity we are also using an external libevent. I think we're using the system-provided libevent rather than a specific version or build.

Our configure line is this:

./configure --with-platform=optimized --enable-orterun-prefix-by-default --prefix=/testbin/ompi --with-pmix=/testbin/pmix --disable-mpi-fortran --enable-contrib-no-build=vt --with-libevent=external --with-hwloc=/testbin/hwloc

And the build is then failing shortly after with this:

13:58:23 --- MCA component pmix:ext2x (m4 configuration macro)
13:58:23 checking for MCA component pmix:ext2x compile mode... dso
13:58:23 checking if external component is version 2.x... yes
13:58:23 configure: WARNING: EXTERNAL PMIX SUPPORT REQUIRES USE OF EXTERNAL LIBEVENT
13:58:23 configure: WARNING: LIBRARY. THIS LIBRARY MUST POINT TO THE SAME ONE USED
13:58:23 configure: WARNING: TO BUILD PMIX OR ELSE UNPREDICTABLE BEHAVIOR MAY RESULT
13:58:23 configure: error: PLEASE CORRECT THE CONFIGURE COMMAND LINE AND REBUILD

Please let me know if I should file this as a ticket directly.

@ggouaillardet
Contributor

Thanks for the report and sorry for the trouble.

I will have a look at it; this should hopefully be quick to fix.

@rhc54
Contributor

rhc54 commented Jul 23, 2018

Fix is in the oven
