Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return status and actual timestamp from all environments #671

Merged
merged 9 commits into from
Feb 13, 2024

Conversation

motus
Copy link
Member

@motus motus commented Feb 9, 2024

This is the first step to making trials asynchronous. In this PR we just make sure every Environment method that returns status also returns the timestamp of that status. In the next PR we will get these timestamps and status values from the remote environment.

@motus motus self-assigned this Feb 9, 2024
@motus motus added WIP Work in progress - do not merge yet mlos-bench labels Feb 9, 2024
@motus motus marked this pull request as ready for review February 12, 2024 20:43
@motus motus requested a review from a team as a code owner February 12, 2024 20:43
@motus motus added ready for review Ready for review and removed WIP Work in progress - do not merge yet labels Feb 12, 2024
@bpkroth
Copy link
Contributor

bpkroth commented Feb 12, 2024

Also in this PR, we store the statuses and timestamps in a new table that will later serve as a scheduling queue.

Not sure I agree with this, but maybe I just am not following where you're heading with it.

I think we need to store two separate things:

  1. in progress status events of the trial (e.g., how different environments status change at different times). I think that's valuable for debugging and other stateful reasons (e.g., which environments do we still need to call teardown or cleanup on).
  2. the overall status of a trial (e.g., pending, in progress, aborting, succeeded, failed, etc.). This could be used to implement a scheduling queue.

However, if implement this with a single table, then the latter requires messy aggregates over the former to find out the overall status of each current trial.

I think that should probably just be stored as a single column in the trial table and updated explicitly with logic inside mlos_bench.

Thoughts?

@motus motus changed the title Return status and actual timestamp from all environments, including remote Return status and actual timestamp from all environments Feb 12, 2024
@motus motus enabled auto-merge (squash) February 12, 2024 23:18
@motus
Copy link
Member Author

motus commented Feb 12, 2024

Thoughts?

I think we are on the same page. This is exactly how I want to implement this. I will remove all DB-related code from this PR and move it to the next one that actually implements the queue - i.e., with the corresponding read method, unit tests, etc.

Copy link
Contributor

@bpkroth bpkroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@motus motus merged commit 08dc436 into microsoft:main Feb 13, 2024
12 checks passed
@motus motus deleted the sergiym/run/queue branch February 14, 2024 22:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants