-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simple Status Checking #655
Conversation
@@ -144,6 +145,10 @@ def start( | |||
res = _assert_schema_type(self._connector.send_request(req), DragonRunResponse) | |||
return LaunchedJobID(res.step_id) | |||
|
|||
def get_status(self, *launched_ids: LaunchedJobID) -> tuple[SmartSimStatus, ...]: | |||
infos = self._get_managed_step_update(list(launched_ids)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
infos plural intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
infos
, plural intentional! The return type of _get_managed_step_update
is list[StepInfo]
, so I pluralized as a reminder that this is a "collection of many StepInfo
instances".
If its unclear or confusing to read, I'm more than willing to rename to info_list
or similar!
smartsim/settings/dispatch.py
Outdated
@abc.abstractmethod | ||
def get_status( | ||
self, *launched_ids: LaunchedJobID | ||
) -> tuple[SmartSimStatus, ...]: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, this assumes with this call:
stat_1, stat_2, stat_3, ... = launcher.get_status(id_1, id_2, id_3, ...)
stat_1
is the status of the job with id id_1
, stat_2
is the status of the job with id id_2
, etc. and is behavior that is relied upon by the Experiment
.
Do we like this or is this too much of an "implied constraint" from the protocol?
Should the return type here actually be a Mapping[LaunchJobID, SmartSimStatus]
or maybe a Iterable[tuple[LaunchJobID, SmartSimStatus]]
? That way its more obvious to users looking to write and register their own launchers the intention of what this method should return.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like it might not be implied for all SS users that stat_1 and stat_2 will map to id_1 and id_2 but I might be wrong. I would rather get back a Mapping or Iterable -> Im wondering what will be easier for the user to inspect, maybe an iterable since you can use a for loop? But then again I can search via key in a Mapping - am I on to something here or is this so far off?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed offline to make this protocol method return a Mapping
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks awesome, just a couple of comments
smartsim/experiment.py
Outdated
:returns: A tuple of statuses with order respective of the order of the | ||
calling arguments. | ||
""" | ||
ids_ = set(ids) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would you mind commenting this code? Sorry I always ask that MERP
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolutely!
Sorry I always ask that MERP
Don't be sorry! If its not immediately obvious what's happening, that's a good sign comments are needed. I don't want to over comment, so I only tend to comment sections that reviewers ask for more details on!!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically I'm just offloading the work of figuring out what needs comments onto you, the reviewer, lol.
smartsim/settings/dispatch.py
Outdated
@abc.abstractmethod | ||
def get_status( | ||
self, *launched_ids: LaunchedJobID | ||
) -> tuple[SmartSimStatus, ...]: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like it might not be implied for all SS users that stat_1 and stat_2 will map to id_1 and id_2 but I might be wrong. I would rather get back a Mapping or Iterable -> Im wondering what will be easier for the user to inspect, maybe an iterable since you can use a for loop? But then again I can search via key in a Mapping - am I on to something here or is this so far off?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very small comments otherwise LGTM
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## smartsim-refactor #655 +/- ##
====================================================
Coverage ? 34.15%
====================================================
Files ? 109
Lines ? 6611
Branches ? 0
====================================================
Hits ? 2258
Misses ? 4353
Partials ? 0
|
Add ability for the
Experiment
to fetch the status a launched job that it started given aLaunchedJobID
. Teach theShellLauncher
andDragonLauncher
to get statuses of jobs they have launched.