Allow pinning of the co-located database #306
Just some initial comments as you work on getting the CI/CD working
Codecov Report
Additional details and impacted files
@@            Coverage Diff             @@
##           develop     #306     +/-   ##
===========================================
+ Coverage    86.78%   87.12%   +0.33%
===========================================
  Files           59       59
  Lines         3482     3518     +36
===========================================
+ Hits          3022     3065     +43
+ Misses         460      453      -7
Gave some initial thoughts in a quick first pass while the cli was getting sorted out; lmk what you think!
- Refactor some of the tests to separate creation of experiment and db
- Pin colocated database cpus based on the custom_pinning option instead of two different inputs
Some more initial thoughts for you while we sort out the CI
Throw in more tests for fun
LGTM (pending CI and Docs)!! Thanks for all the hard work on this one!
"--loadmodule",
CONFIG.redisai
]
)
Ahh perfectly styled! :)
Questions were addressed offline and current implementation was carefully checked with Toast
This updates the strategy for adding a co-located deployment for simulation and database. Previously, the co-located database was always pinned to the last N logical processors on a machine. On a Slurm machine, user-specified bind settings were being overwritten, which left users little flexibility when trying to maximize performance. Additionally, `limit_app_cpus` was not working as intended because only the launcher command was being pinned and not the application launched by the launcher command.

The changes here provide two options, `limit_db_cpus` and `db_cpu_list`, to control the pinning of the co-located database. `limit_db_cpus` enables the pinning using `taskset`, and `db_cpu_list` is a user-provided string that specifies which processors the orchestrator should be bound to. If `db_cpu_list` is not provided, the database is automatically pinned to the first N logical processors.
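
For reviewers who want a concrete picture of the pinning rule described above, here is a minimal sketch of how the two options could be turned into a `taskset` prefix. The helper name `build_taskset_prefix` and the `db_cpus` argument (standing in for N) are hypothetical and chosen only for illustration; this is not the code in this PR. The only external detail relied on is that `taskset -c <list>` accepts a CPU list such as `0-3` or `0,2,4`.

```python
from typing import List, Optional


def build_taskset_prefix(
    limit_db_cpus: bool,
    db_cpus: int,
    db_cpu_list: Optional[str] = None,
) -> List[str]:
    """Return a command prefix that pins the database, or [] if pinning is off.

    Hypothetical helper: names and signature are assumptions for illustration.
    """
    if not limit_db_cpus:
        # Pinning disabled: launch the database without taskset.
        return []
    if db_cpus < 1:
        raise ValueError("db_cpus must be at least 1")
    if db_cpu_list is None:
        # No explicit list: default to the first N logical processors.
        db_cpu_list = "0" if db_cpus == 1 else f"0-{db_cpus - 1}"
    return ["taskset", "-c", db_cpu_list]


# The prefix would be placed in front of the database launch command.
print(build_taskset_prefix(True, 4))
print(build_taskset_prefix(True, 4, db_cpu_list="8,10,12,14"))
```

Building the prefix as a list rather than a shell string keeps the CPU list as a single argument, so a user-provided value containing commas or ranges is passed to `taskset` unchanged.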