Add support for resource matching service #386
15a11bc
to
61daec2
Compare
Codecov Report
@@ Coverage Diff @@
## master #386 +/- ##
=========================================
+ Coverage 75.88% 76% +0.12%
=========================================
Files 59 64 +5
Lines 9829 10175 +346
=========================================
+ Hits 7459 7734 +275
- Misses 2370 2441 +71
ea54bfa
to
f5b4f1f
Compare
@SteVwonder: Ok, I added a few test cases and adjusted minor things (e.g., changing |
Requested @tpatki as a reviewer as well.
f5b4f1f
to
e206289
Compare
Somehow I wonder if this is related to the fact that I have the containing directory specified in the dependency...
@dongahn: I'll test this tmrw in detail, just sending a note as a quick update.
Hmmm. I moved resource module build rules to its own directory (
Probably missing something more fundamental... |
Seems that I just needed to add |
5dfee78
to
ee59014
Compare
t/sharness.d/sched-sharness.sh
Outdated
FLUX_MODULE_PATH_PREPEND="${SHARNESS_BUILD_DIRECTORY}/sched/.libs"
FLUX_EXEC_PATH_PREPEND="${SHARNESS_BUILD_DIRECTORY}/sched"
FLUX_MODULE_PATH_PREPEND="${SHARNESS_BUILD_DIRECTORY}/sched/.libs:${SHARNESS_BUILD_DIRECTORY}/resource/modules/.libs"
FLUX_EXEC_PATH_PREPEND="${SHARNESS_BUILD_DIRECTORY}/sched:${SHARNESS_BUILD_DIRECTORY}/resource/modules"
Since your flux-resource utility is a script and not a built program, I think you will want to point to ${SHARNESS_TEST_SRCDIR}/../resource/modules here and not to the "build" directory.
Ah. That makes a lot of sense and will likely resolve the current issue. Thanks @grondo!
@dongahn, is the Does that make any sense? |
It is meant to be a utility for testing for now. In terms of first class scheduler UIs, what I would like to do is to get an end to end demonstration between the new exec system, R, this matching service and the upcoming scheduler loop service first before standardizing user facing interfaces. It is unclear if For now, I'd be happy to change the name to
ee59014
to
c8a4f57
Compare
t/sharness.d/sched-sharness.sh
Outdated
FLUX_MODULE_PATH_PREPEND="${SHARNESS_BUILD_DIRECTORY}/sched/.libs"
FLUX_EXEC_PATH_PREPEND="${SHARNESS_BUILD_DIRECTORY}/sched"
FLUX_MODULE_PATH_PREPEND="${SHARNESS_BUILD_DIRECTORY}/sched/.libs:${SHARNESS_BUILD_DIRECTORY}/resource/modules/.libs"
FLUX_EXEC_PATH_PREPEND="${SHARNESS_BUILD_DIRECTORY}/sched:${SHARNESS_BUILD_DIRECTORY}/../resource/modules"
For FLUX_PATH_PREPEND, try s/SHARNESS_BUILD_DIRECTORY/SHARNESS_TEST_SRCDIR/.
Also, if the flux-resource utility is just for testing, you could save some trouble by placing it under ./t (we have some test commands like this in flux-core).
7024402
to
0f3d8b0
Compare
Pick this up from flux-core. Fix a side-effect where eval turns the semicolon into a command separator.
Use the factory pattern to allow the upper layer to easily instantiate an object of different match policy class.
Factor out the jobinfo data structure and a utility function to a new source file as this will be used by both resource-query and the upcoming resource module. Adjust resource-query code as well. Update Makefile.am.
Name: resource. Use the resource scheduler infrastructure code to populate the resource graph database. Use the request handlers in place of the control loop in resource-query to receive various match requests: e.g., match allocate and allocate_orelse_reserve. Receive a jobspec directly from the RPC message and return the matched R in response. Note that this will likely change later, getting some handle from the user (e.g., schedule loop service) of this service to find the jobspec from KVS and put the matched R into KVS instead of sending it back in the response message.
Make a convenience library out of the resource scheduling infrastructure. This reduces the compilation time. Add rules for the resource module, building it against the convenience library. Update rules for resource-query to build against the same library.
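The factory pattern mentioned in the commit messages above can be sketched roughly as follows. This is an illustrative Python sketch, not the PR's implementation: the names MatchPolicy, LowIdFirst, HighIdFirst, and create_match_policy are hypothetical, and the "policy" here simply picks an id from a list.

```python
from abc import ABC, abstractmethod


class MatchPolicy(ABC):
    """Hypothetical base class: chooses one id among the candidates."""

    @abstractmethod
    def select(self, candidate_ids):
        ...


class LowIdFirst(MatchPolicy):
    def select(self, candidate_ids):
        return min(candidate_ids)


class HighIdFirst(MatchPolicy):
    def select(self, candidate_ids):
        return max(candidate_ids)


# Registry mapping a policy name to the class that implements it
# (names are assumptions made for this sketch).
_POLICIES = {"low": LowIdFirst, "high": HighIdFirst}


def create_match_policy(name):
    """Factory: the caller names a policy and never touches the classes."""
    try:
        return _POLICIES[name]()
    except KeyError:
        raise ValueError(f"unknown match policy: {name!r}") from None
```

With this shape, the upper layer only needs a policy name, for example `create_match_policy("low")`, and adding a policy means adding one class and one registry entry.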
6f2a716
to
01380e0
Compare
Thank you all for reviewing this PR. I decided to export the HOME environment variable in the new test scripts so that we don't hit this problem next time. Travis CI is green now. In addition, I believe I've also addressed all of the great review comments from @grondo, @SteVwonder and @trws. Please let me know if there is anything else you want me to address. From my perspective, this can go in.
One more minor change. Otherwise this looks ready to go in to me.
Add this command primarily for testing. Add several sub-commands matching the request handlers within the resource module. Use Python's argparse module with the Action class to register a handler for each sub-command. Support match allocate and allocate_orelse_reserve, info and cancel sub-commands.
Add makefile for flux-resource
Needed for both in-build testing and install testing.
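As a rough illustration of one handler per sub-command, here is a minimal argparse sketch. It is not the actual flux-resource script: it uses `set_defaults(func=...)` on each sub-parser rather than a custom Action subclass, and the argument layout and handler names (match_action, cancel_action, info_action) are assumptions.

```python
import argparse


def match_action(args):
    # Hypothetical handler: only reports what would be requested.
    return f"match {args.mode} {args.jobspec}"


def cancel_action(args):
    return f"cancel {args.jobid}"


def info_action(args):
    return f"info {args.jobid}"


def build_parser():
    parser = argparse.ArgumentParser(prog="flux-resource")
    subparsers = parser.add_subparsers(dest="command")
    subparsers.required = True

    match = subparsers.add_parser("match")
    match.add_argument("mode", choices=["allocate", "allocate_orelse_reserve"])
    match.add_argument("jobspec")
    match.set_defaults(func=match_action)

    cancel = subparsers.add_parser("cancel")
    cancel.add_argument("jobid", type=int)
    cancel.set_defaults(func=cancel_action)

    info = subparsers.add_parser("info")
    info.add_argument("jobid", type=int)
    info.set_defaults(func=info_action)
    return parser


def run(argv):
    """Parse argv and dispatch to the handler bound to the sub-command."""
    args = build_parser().parse_args(argv)
    return args.func(args)
```

For example, `run(["match", "allocate", "test001.yaml"])` dispatches to `match_action`; a real front end would send an RPC to the resource module there instead of returning a string.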
Test the correctness of the match(-allocate) handler within the resource module using flux-resource front-end command. Add data/resource/jobspecs/basics/bad.yaml, a malformed jobspec yaml file to test if flux-resource handles a Yaml exception gracefully. Use the original HOME environment variable as sharness interferes with Python's module search path. Python uses HOME to determine its local site-package path, and changing this leads to an incorrect search path. Update the Makefile rule.
Test the functionality of the match(-allocate_orelse_reserve) handler within the resource module using flux-resource front-end command. Use the original HOME environment variable as sharness interferes with Python's module search path. Python uses HOME to determine its local site-package path, and changing this leads to an incorrect search path. Update the Makefile rule.
Test the functionality of the cancel and info handlers within resource module using flux-resource front-end command. Use the original HOME environment variable as sharness interferes with Python's module search path. Python uses HOME to determine its local site-package path, and changing this leads to an incorrect search path. Update Makefile rule.
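The HOME dependency described in these commit messages can be observed with a small sketch like the one below. It assumes a POSIX system and a standard CPython install; the helper name user_site_for_home is made up for illustration, and the exact directory layout under HOME varies by platform.

```python
import os
import subprocess
import sys


def user_site_for_home(home):
    """Ask a fresh interpreter where its per-user site-packages lives."""
    env = dict(os.environ, HOME=home)
    # PYTHONUSERBASE would override the HOME-derived location, so drop it.
    env.pop("PYTHONUSERBASE", None)
    result = subprocess.run(
        [sys.executable, "-c", "import site; print(site.getusersitepackages())"],
        env=env,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()
```

Calling it with two different HOME values yields two different search locations, which is why a test harness that rewrites HOME can hide modules installed under the real user's home directory.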
01380e0
to
170f972
Compare
OK. Forced a push!
Thanks @dongahn. LGTM! Anyone else have any feedback? If there is no more feedback, I will push the button tomorrow.
Ping?
My bad. Merging now.
Thanks @SteVwonder!
This PR addresses all of the comments I had received on the posting of the previous experimental PR.
Refactor, augment, modify the resource scheduling infrastructure used by both the resource module and resource-query utility.
Add the resource matching service module that uses this infrastructure to populate the resource graph data store and expose four request handlers (match, cancel, info, and next_jobid). next_jobid is only needed for testing.
Add flux-resource.py as the front-end command that can interact with the resource module. This is essential to drive our tests when we don't have the upcoming schedule loop service.

Don't merge this yet as we have a few more steps before landing this. The proposed remaining steps are:
Add tests. They will focus on testing the handler and RPC logic as I prefer using resource-query for the general correctness and performance tests for the resource scheduling infrastructure (which I already have a bunch).
Squash bugs we found during adding tests.
Land this.
Merge in @SteVwonder's PR resource: support for hwloc ingestion #385 so that the resource module can be populated with hwloc data.
Add rc1 and rc3 support with the hwloc mode being default. (Another PR).

If you want to play with this:
copy t/data/resource/jobspecs/basics/test001.yaml and t/data/resource/grugs/tiny.graphml into a directory first.
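
To make the four request handlers named in this description (match, cancel, info, next_jobid) more concrete, here is a toy in-memory sketch of topic-based dispatch. It is not the resource module's real protocol: the class ToyResourceService, the payload fields, and the node-counting logic are all assumptions made for illustration.

```python
class ToyResourceService:
    """In-memory stand-in that counts free nodes; purely illustrative."""

    def __init__(self, free_nodes):
        self.free_nodes = free_nodes
        self.jobs = {}        # jobid -> number of nodes held
        self._next_jobid = 0

    def next_jobid(self, _payload):
        jobid = self._next_jobid
        self._next_jobid += 1
        return {"jobid": jobid}

    def match(self, payload):
        need = payload["nodes"]
        if need > self.free_nodes:
            return {"error": "no match"}
        self.free_nodes -= need
        self.jobs[payload["jobid"]] = need
        # "R" stands in for the matched resource set returned to the caller.
        return {"jobid": payload["jobid"], "R": {"nodes": need}}

    def cancel(self, payload):
        held = self.jobs.pop(payload["jobid"], 0)
        self.free_nodes += held
        return {"released": held}

    def info(self, payload):
        jobid = payload["jobid"]
        state = "allocated" if jobid in self.jobs else "unknown"
        return {"jobid": jobid, "state": state}

    def handle(self, topic, payload):
        """Dispatch a request to the handler registered for its topic."""
        handlers = {
            "match": self.match,
            "cancel": self.cancel,
            "info": self.info,
            "next_jobid": self.next_jobid,
        }
        if topic not in handlers:
            raise KeyError(f"no handler for topic {topic!r}")
        return handlers[topic](payload)
```

A test driver in this style would first ask for a job id, then send a match request, and finally check info and cancel against that id.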