hwloc2a/configure.m4: be more careful in with_cuda->enable_cuda #4257
Conversation
Be a little more deliberate about converting OMPI's --with-cuda CLI value to hwloc's --enable-cuda configure option. Also, unconditionally disable hwloc NVML support (because Open MPI is not currently using it). Signed-off-by: Jeff Squyres <[email protected]>
@jsquyres sorry I failed to reply to your question in #4251 ... To my knowledge, we don't use the CUDA part of hwloc in Open MPI. @bgoglin may be able to explain what --enable-cuda changes in hwloc, I personally don't know. Now, with that patch, we're no longer adding new dependencies nor creating mismatches between Open MPI and hwloc, so I guess it doesn't really matter.
@sjeaugey Got it. Is there a reason we're not using CUDA stuff from hwloc? I.e., is the hwloc CUDA information not useful? Or has no one just implemented anything that uses it? (this is a curiosity question)
It could be useful; we just never used it because it was hard to do so. The main reason is that we're using the same BTLs for CPU and GPU communication. The Open IB BTLs are chosen during MPI_Init based on their distance from the CPU. When the CUDA-aware code is initialized (during the first CUDA transfer), we would need to rediscover the BTLs and select those that are close to the GPU for CUDA transfers. But that would mean a separate BTL selection for CPU and GPU transfers. UCX may implement that better.
@sjeaugey @jsquyres Enabling CUDA in hwloc adds hwloc objects that correspond to GPUs, giving you their locality in the hwloc tree. Mostly useful for two cases:
(1) is likely done in the application, not in OMPI itself. Aside from this, I don't see many use cases for OMPI using a CUDA-aware hwloc. Moreover, there are also ways to get GPU affinity from hwloc without enabling CUDA support (using hwloc/cuda.h or hwloc/cudart.h).
Indeed, selecting a NIC/HCA close to the GPU is a good use case extending (2).
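For illustration only (this is not code from this PR): a minimal standalone sketch of the alternative @bgoglin mentions, using the inline helpers in hwloc/cuda.h to report a CUDA device's locality from an ordinary topology, without building hwloc itself with --enable-cuda. It assumes the hwloc and CUDA driver headers/libraries are available when compiling this test program.

```c
/* Hypothetical standalone example, not part of Open MPI: query the CPUs
 * close to CUDA device 0 via hwloc/cuda.h.  Only this test program needs
 * the CUDA driver headers/libs; hwloc does not need --enable-cuda. */
#include <stdio.h>
#include <stdlib.h>
#include <cuda.h>
#include <hwloc.h>
#include <hwloc/cuda.h>

int main(void)
{
    hwloc_topology_t topo;
    hwloc_cpuset_t set;
    CUdevice dev;

    cuInit(0);
    cuDeviceGet(&dev, 0);                 /* first CUDA device */

    hwloc_topology_init(&topo);
    hwloc_topology_load(topo);

    set = hwloc_bitmap_alloc();
    /* CPUs physically close to this GPU */
    if (hwloc_cuda_get_device_cpuset(topo, dev, set) == 0) {
        char *str;
        hwloc_bitmap_asprintf(&str, set);
        printf("CPUs near CUDA device 0: %s\n", str);
        free(str);
    }

    hwloc_bitmap_free(set);
    hwloc_topology_destroy(topo);
    return 0;
}
```

A NIC/HCA-near-GPU selection, as in (2), would build on the same idea by comparing that cpuset against the locality of the NICs' OS device objects in the topology.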
Ok. Based on these comments, it sounds like I shouldn't whack the logic I just added (and replace it with …)
+1
More on this. We need to set enable_cuda=no in hwloc in all cases. CUDA support in Open MPI needs the CUDA includes to build, but it actually uses the CUDA driver API and loads symbols dynamically (so, as I understand it, the library can work on any system, even those without CUDA). Enabling CUDA in hwloc creates a hard dependency on CUDA, so this is clearly a regression. Also, configure currently seems to fail to include the relevant -L, so master is broken. https://mtt.open-mpi.org/index.php?do_redir=2485
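To illustrate the distinction (a sketch, not Open MPI's actual implementation): code built against only the CUDA headers can resolve the driver API at run time, so the resulting library has no link-time dependency on libcuda, whereas hwloc's --enable-cuda links the CUDA libraries directly.

```c
/* Sketch of dynamically loading the CUDA driver API; error handling is
 * abbreviated and this is not the actual Open MPI code.  Build with -ldl. */
#include <stdio.h>
#include <dlfcn.h>

typedef int (*cuInit_fn)(unsigned int);   /* CUresult cuInit(unsigned int) */

int main(void)
{
    void *handle = dlopen("libcuda.so.1", RTLD_NOW);
    if (handle == NULL) {
        /* No CUDA driver installed: a library built this way can simply
         * fall back to its non-CUDA code paths. */
        fprintf(stderr, "CUDA driver not available: %s\n", dlerror());
        return 0;
    }

    cuInit_fn my_cuInit = (cuInit_fn) dlsym(handle, "cuInit");
    if (my_cuInit != NULL && my_cuInit(0) == 0 /* CUDA_SUCCESS */) {
        printf("CUDA driver initialized via dlopen/dlsym\n");
    }

    dlclose(handle);
    return 0;
}
```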
There is no usage of CUDA hwloc objects in the v2.0.x branch, and linking in CUDA can cause problems (per open-mpi#4257 (comment)). Partially cherry-picked from c341b53. Signed-off-by: Jeff Squyres <[email protected]>
There is no usage of CUDA hwloc objects in the v3.0.x branch, and linking in CUDA can cause problems (per open-mpi#4257 (comment)). Partially cherry-picked from c341b53. Signed-off-by: Jeff Squyres <[email protected]>
There is no usage of CUDA hwloc objects in the v2.x branch, and linking in CUDA can cause problems (per open-mpi#4257 (comment)). Partially cherry-picked from c341b53. Signed-off-by: Jeff Squyres <[email protected]>
There is no usage of CUDA hwloc objects in the v2.0.x branch, and linking in CUDA can cause problems (per open-mpi#4257 (comment)). Partially cherry-picked from c341b53. Signed-off-by: Jeff Squyres <[email protected]> (cherry picked from commit f28fcbe)
This is a followup to the original master PR #4245, and the subsequent comments from @ggouaillardet, @bgoglin, and @sjeaugey on #4251.