Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Detect OpenMPI Use #186

Merged
merged 11 commits into from
Apr 20, 2022
Merged

Detect OpenMPI Use #186

merged 11 commits into from
Apr 20, 2022

Conversation

MattToast
Copy link
Member

@MattToast MattToast commented Apr 6, 2022

This PR:

  • Moves all OpenMPI command line arguments to an _OpenMPISettings base class
  • Creates MpirunSettings, MpiexecSettings, OrterunSettings which inherit from the base settings
  • All three child classes, when instanced, will check the version statement from a found MPI executable
  • If the version statement does not match that of a supported OpenMPI implementation, an SSUnsupportedError error is raised
  • Adds tests to ensure that if an unsupported MPI version is found, appropriate action is taken to notify the user

Important consideration for reviewers:
While the original issue suggests raising an exception if an unsupported MPI executable is found, it may be more pragmatic to instead log a warning/error message.

This way if user was using MpirunSettings with a non-OpenMPI version of mpirun, they could continue using the new implementation with relatively few problems. This also means we would not break backward compatibility, while still providing users with enough information to explain why MpirunSettings may not be working as they expected.

Feedback on this potential implementation change in particular would be appreciated.

@MattToast MattToast requested review from Spartee and al-rigazzi April 6, 2022 19:49
@MattToast MattToast marked this pull request as ready for review April 6, 2022 20:40
Copy link
Contributor

@Spartee Spartee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still working through this one, but heres a couple thoughts:

  1. Intel MPI is a good test case here. It's version information returns
Intel(R) MPI Library for Linux* OS, Version 2019 Update 8 Build 20200624 (id: 4f16ad915)
Copyright 2003-2020, Intel Corporation.

intel mpirun is installed on osprey

  1. So with this approach, would we have a separate base class or just a separate class for the intel mpi settings. I'm guessing a separate base class since
[spartee@osprey 18:06:11 ~]$ mpirun --version
Intel(R) MPI Library for Linux* OS, Version 2019 Update 8 Build 20200624 (id: 4f16ad915)
Copyright 2003-2020, Intel Corporation.
[spartee@osprey 18:06:15 ~]$ mpiexec --version
Intel(R) MPI Library for Linux* OS, Version 2019 Update 8 Build 20200624 (id: 4f16ad915)
Copyright 2003-2020, Intel Corporation.

If this is the case, I think a ticket for implementing this for intel MPI might be in order.

  1. I like the idea of warnings instead of errors. I'm sure there is some case that we aren't expecting (like a site wrapping the mpirun command themselves, or something similar)

Lastly, we still need to make sure that all these can be detected in the create_run_settings function. I don't see any changes to that here so I'm wondering how that will work.

@codecov-commenter
Copy link

codecov-commenter commented Apr 12, 2022

Codecov Report

Merging #186 (b69076d) into develop (958877b) will increase coverage by 0.30%.
The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #186      +/-   ##
===========================================
+ Coverage    81.50%   81.80%   +0.30%     
===========================================
  Files           57       57              
  Lines         2968     2973       +5     
===========================================
+ Hits          2419     2432      +13     
+ Misses         549      541       -8     
Impacted Files Coverage Δ
smartsim/settings/settings.py 65.62% <ø> (ø)
smartsim/settings/__init__.py 100.00% <100.00%> (ø)
smartsim/experiment.py 80.98% <0.00%> (ø)
smartsim/settings/base.py 94.28% <0.00%> (+0.16%) ⬆️
smartsim/_core/utils/helpers.py 74.41% <0.00%> (+1.16%) ⬆️
smartsim/_core/control/jobmanager.py 94.03% <0.00%> (+1.57%) ⬆️
smartsim/_core/control/controller.py 82.71% <0.00%> (+1.74%) ⬆️

Copy link
Contributor

@Spartee Spartee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple things:

  • Does this work with create_run_settings? Most of your tests seem to manually instantiate the instance.
  • Orchetrator? It doesn't have to for the purposes of this PR, but lets make sure to put a ticket in the backlog if it doesn't
  • lastly, update the documentation

specifically:

Copy link
Collaborator

@al-rigazzi al-rigazzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @Spartee, since we are editing these files (and these classes), let's make sure we have tests for create_run_settings with different MPI flavors (which will be run if the selected launcher command is available).

@MattToast
Copy link
Member Author

Addressing review comments in no particular order:

  • I added rudimentary functionality to create_run_settings and have added test to ensure that it works as expected. This works for the time being but will likely need to be replaced if/when we start supporting multiple mpirun/mpiexec implementations.
  • I gave a quick update to the documentation to list the new run settings classes
  • As part of updating the documentation, I made sure to add docstrings the the new user facing classes
  • I have not created an intel-based MpirunSettings or MpiexecSettings, nor integrated the new settings classes with the orchetrator. I plan to open new tickets to address both of these tasks.

@MattToast MattToast requested a review from Spartee April 15, 2022 21:43
@al-rigazzi al-rigazzi self-requested a review April 19, 2022 17:57
Copy link
Contributor

@Spartee Spartee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Great work

@MattToast MattToast merged commit 38cb111 into CrayLabs:develop Apr 20, 2022
@MattToast MattToast deleted the ompi-rs branch April 25, 2022 19:22
al-rigazzi pushed a commit to al-rigazzi/SmartSim that referenced this pull request May 16, 2022
Adds new run settings classes for using different
OpenMPI executables (mpiexec, orterun). Because
the executables share runtime arguments with mpirun
much the functionality has be transfered to a base class
that all OpenMPI run settings inherit from. Adds tests
and docs updates to reflect changes.

[ committed by @MattToast  ]
[ reviewed by @Spartee @al-rigazzi ]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants