Integrate PalsMpiexecSettings into Experiment factory methods #343

Merged: 9 commits, Aug 25, 2023

Member

@MattToast MattToast commented Aug 18, 2023

Registers the PalsMpiexecSettings class in SmartSim's Experiment factory methods so that the correct _BaseMpiSettings class is returned for the launcher the user intends to use:

>>> pbs_exp = Experiment(launcher="pbs", ...)
>>> rs_1 = pbs_exp.create_run_settings(run_command="mpiexec", ...)
>>> type(rs_1)  # an OpenMPI-compliant RunSettings
<class 'smartsim.settings.mpiSettings.MpiexecSettings'>
>>> pals_exp = Experiment(launcher="pals", ...)
>>> rs_2 = pals_exp.create_run_settings(run_command="mpiexec", ...)
>>> type(rs_2)  # a PALS-compliant RunSettings
<class 'smartsim.settings.palsSettings.PalsMpiexecSettings'>
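
For readers unfamiliar with this kind of factory dispatch, here is a minimal, self-contained sketch of the idea. It is not SmartSim's implementation: the stand-in classes, the `_SETTINGS_REGISTRY` table, and this free-standing `create_run_settings` function are all assumptions made for illustration, and only the class names `MpiexecSettings` and `PalsMpiexecSettings` come from the description above.

```python
import typing as t


class RunSettings:
    """Hypothetical stand-in for a generic run settings class."""

    def __init__(self, exe: str, run_command: str = "") -> None:
        self.exe = exe
        self.run_command = run_command


class MpiexecSettings(RunSettings):
    """Stand-in for the OpenMPI-style ``mpiexec`` settings."""


class PalsMpiexecSettings(RunSettings):
    """Stand-in for the PALS-style ``mpiexec`` settings."""


# Hypothetical registry keyed on (launcher, run_command). The same run
# command can map to different settings classes depending on the launcher.
_SETTINGS_REGISTRY: t.Dict[t.Tuple[str, str], t.Type[RunSettings]] = {
    ("pbs", "mpiexec"): MpiexecSettings,
    ("pals", "mpiexec"): PalsMpiexecSettings,
}


def create_run_settings(launcher: str, run_command: str, exe: str) -> RunSettings:
    """Build the settings object registered for this launcher/command pair.

    Falls back to the generic RunSettings when no specific class is registered.
    """
    settings_cls = _SETTINGS_REGISTRY.get((launcher, run_command), RunSettings)
    return settings_cls(exe, run_command=run_command)


pbs_rs = create_run_settings("pbs", "mpiexec", "echo")
pals_rs = create_run_settings("pals", "mpiexec", "echo")
print(type(pbs_rs).__name__)
print(type(pals_rs).__name__)
```

Keying the lookup on the launcher as well as the run command is what lets `mpiexec` resolve to a different class under `pals` than under `pbs`.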

codecov bot commented Aug 18, 2023

Codecov Report

Merging #343 (c12c39f) into develop (f9e17f0) will increase coverage by 0.91%.
Report is 3 commits behind head on develop.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##           develop     #343      +/-   ##
===========================================
+ Coverage    87.31%   88.23%   +0.91%     
===========================================
  Files           59       59              
  Lines         3531     3552      +21     
===========================================
+ Hits          3083     3134      +51     
+ Misses         448      418      -30     

Files Changed                            Coverage Δ
smartsim/_core/control/controller.py     84.15% <ø> (ø)
smartsim/_core/launcher/launcher.py      100.00% <ø> (ø)
smartsim/database/orchestrator.py        83.91% <ø> (ø)
smartsim/settings/palsSettings.py        89.13% <ø> (+2.17%) ⬆️
smartsim/settings/settings.py            75.00% <100.00%> (+0.80%) ⬆️

... and 8 files with indirect coverage changes

@MattToast MattToast changed the title from "Integread PalsMpiexecSettings into Experiment factory methods" to "Integrated PalsMpiexecSettings into Experiment factory methods" Aug 18, 2023
@MattToast MattToast changed the title from "Integrated PalsMpiexecSettings into Experiment factory methods" to "Integrate PalsMpiexecSettings into Experiment factory methods" Aug 18, 2023
@MattToast MattToast requested review from al-rigazzi and ashao August 21, 2023 18:18
Collaborator

@ashao ashao left a comment

Thanks for getting this in!

Collaborator

@al-rigazzi al-rigazzi left a comment

LGTM, but I gotta ask if we ran the tests on any PALS machine because I'm the baaad guy

@@ -64,6 +64,7 @@
by_launcher: t.Dict[str, t.List[str]] = {
"slurm": ["srun", "mpirun", "mpiexec"],
"pbs": ["aprun", "mpirun", "mpiexec"],
"pals": ["mpiexec"],
Collaborator

Should we also support aprun here?

Collaborator

My suggestion is not yet, as I remember there were some issues. But we will have to in the future.

Member Author

I think I'm inclined to agree as I didn't specifically test against aprun. I'll be sure to throw in a ticket to track the future work needed!
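
As context for the thread above, here is a small sketch of how a mapping like `by_launcher` can be used to check whether a run command is supported for a given launcher. The three entries are copied from the diff hunk under review; the `is_supported` helper is hypothetical and is not claimed to exist in SmartSim, and the real table may list more launchers than the hunk shows.

```python
import typing as t

# Entries copied from the diff hunk under review. Only these three are
# visible there, so the real mapping may contain more launchers.
by_launcher: t.Dict[str, t.List[str]] = {
    "slurm": ["srun", "mpirun", "mpiexec"],
    "pbs": ["aprun", "mpirun", "mpiexec"],
    "pals": ["mpiexec"],
}


def is_supported(launcher: str, run_command: str) -> bool:
    """Hypothetical helper: True if the launcher lists the run command."""
    return run_command in by_launcher.get(launcher, [])


print(is_supported("pals", "mpiexec"))
print(is_supported("pals", "aprun"))
```

Under this sketch, supporting `aprun` on PALS later would only require appending it to the `"pals"` list, which is the follow-up work the thread defers.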

@MattToast
Member Author

@al-rigazzi I manually scaled down a couple of the on_wlm tests to run on an internal 2-node EX/PBS/PALS machine, and everything seemed to work as expected, but I was unable to run the test suite in full.

LMK if you think we should hold this off until we can do a more substantial test run!!

@MattToast MattToast self-assigned this Aug 23, 2023
@MattToast MattToast added the "area: settings" (Issues related to Batch or Run settings), "area: launcher" (Issues related to any of the launchers within SmartSim), and "area: workload manager" (Issues specific to workload managers) labels Aug 24, 2023
@MattToast MattToast merged commit f0d510d into CrayLabs:develop Aug 25, 2023
@MattToast MattToast deleted the nice-pals branch September 11, 2023 20:01
4 participants