Schema performance improvements #632
Conversation
Codecov Report
Attention: Patch coverage is

Additional details and impacted files

@@            Coverage Diff             @@
##           mli-feature     #632   +/- ##
==============================================
  Coverage          ?       76.61%
==============================================
  Files             ?          100
  Lines             ?         6905
  Branches          ?            0
==============================================
  Hits              ?         5290
  Misses            ?         1615
  Partials          ?            0
Just some general comments and questions, but overall these are great changes to the code.
msg_tensor = MessageHandler.build_tensor(
    tensor,

# TODO isn't this what output descriptors are for?
Can we resolve these two TODO comments? Do we need to make tickets, or can they be deleted?
That's actually a note so I remembered to ask this question for the group! We don't use OutputDescriptors anywhere. I think the hardcoded information here can come from the OutputDescriptors, so we know how the tensor needs to be reconstructed. I'll make a ticket for further discussion and remove these TODOs.
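A minimal sketch of the idea in the reply above: read the dtype and shape needed to rebuild a tensor from an output descriptor instead of hardcoding them. OutputDescriptor and rebuild_tensor here are hypothetical stand-ins for illustration, not the real schema classes from this PR.

from dataclasses import dataclass
from typing import List

import numpy as np


@dataclass
class OutputDescriptor:
    """Hypothetical descriptor carrying tensor reconstruction metadata."""
    dtype: str             # e.g. "float32"
    dimensions: List[int]  # e.g. [2, 3]


def rebuild_tensor(raw_bytes: bytes, desc: OutputDescriptor) -> np.ndarray:
    # The descriptor, not hardcoded constants, drives reconstruction.
    return np.frombuffer(raw_bytes, dtype=desc.dtype).reshape(desc.dimensions)


# Round-trip check: serialize a small tensor and rebuild it from metadata.
original = np.arange(6, dtype=np.float32).reshape(2, 3)
desc = OutputDescriptor(dtype="float32", dimensions=[2, 3])
assert np.array_equal(rebuild_tensor(original.tobytes(), desc), original)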
interm = time.perf_counter()  # timing
request = deserialize_message(
    request_bytes, self._comm_channel_type, self._device
)

if request.input_meta and tensor_list:
Maybe the logic from 248 to 264 (and the deserialize_message() content) would be better encapsulated in an unpack_request method. I think _on_iteration should have minimal manipulation of the request based on the serialization and communication specifics.
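A rough sketch of what the suggested unpack_request encapsulation could look like. Apart from deserialize_message, which appears in the diff above, every name here (InferenceRequest, input_meta, raw_inputs, WorkerManager) is a hypothetical stand-in, and the stub deserializer replaces the real one.

import time
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class InferenceRequest:
    """Hypothetical request type populated by the deserializer."""
    input_meta: List[Any] = field(default_factory=list)
    raw_inputs: List[bytes] = field(default_factory=list)


def deserialize_message(
    request_bytes: bytes, channel_type: Any, device: str
) -> InferenceRequest:
    """Stub standing in for the real deserializer from the diff."""
    return InferenceRequest(input_meta=[{"nbytes": len(request_bytes)}])


class WorkerManager:
    """Hypothetical host class for _on_iteration."""

    _comm_channel_type: Any = None
    _device: str = "cpu"

    def unpack_request(
        self, request_bytes: bytes, tensor_list: Optional[List[bytes]] = None
    ) -> InferenceRequest:
        # Pull the serialization/communication specifics out of
        # _on_iteration so it only ever sees a fully populated request.
        request = deserialize_message(
            request_bytes, self._comm_channel_type, self._device
        )
        if request.input_meta and tensor_list:
            request.raw_inputs = tensor_list
        return request


# Usage: _on_iteration would call this instead of inlining the logic.
req = WorkerManager().unpack_request(b"payload", [b"\x00" * 4])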
(Does this make it more difficult to do perf timing though?)
I completely agree, but maybe we wait to refactor _on_iteration until we're solid with performance timing? It might make it more difficult.
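If the refactor does land later, one way to keep the perf timing intact is a small context manager that records named phases around each call, so a timing survives even after the code it measures moves into a helper. The timings dict and the label scheme below are illustrative, not part of this PR.

import time
from contextlib import contextmanager
from typing import Dict, Iterator

timings: Dict[str, float] = {}


@contextmanager
def timed(label: str) -> Iterator[None]:
    # Record the elapsed wall time of the wrapped phase under `label`.
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start


# Usage: the deserialize phase stays measurable even after it moves
# into an unpack_request helper.
with timed("deserialize"):
    time.sleep(0.01)  # stand-in for unpack_request(...)
print(f"deserialize: {timings['deserialize']:.4f}s")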
LGTM!
Performance improvements needed to be made in order to reduce the number of copies we were making and the de/serialization time. Instead of building a Tensor and then adding it to a Request, the Request holds TensorDescriptors and the actual tensor data is sent after the request through the FLInterface.

I was able to delete a lot of TensorFlow and Torch tests that were separated out, now that build_tensor has been updated to build_tensor_descriptor.
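A minimal sketch of the pattern the description outlines: the request carries only lightweight TensorDescriptors, and the raw tensor bytes travel separately (a plain queue stands in for the FLInterface here). All class and field names are illustrative, not the PR's actual schema.

from dataclasses import dataclass
from queue import Queue
from typing import List

import numpy as np


@dataclass
class TensorDescriptor:
    dtype: str
    dimensions: List[int]


@dataclass
class Request:
    descriptors: List[TensorDescriptor]


def build_tensor_descriptor(tensor: np.ndarray) -> TensorDescriptor:
    # Capture only metadata; no copy of the tensor data is made here.
    return TensorDescriptor(dtype=str(tensor.dtype), dimensions=list(tensor.shape))


# Sender side: ship the small request first, then the raw bytes.
channel: Queue = Queue()
tensor = np.ones((2, 3), dtype=np.float32)
channel.put(Request(descriptors=[build_tensor_descriptor(tensor)]))
channel.put(tensor.tobytes())

# Receiver side: rebuild the tensor from the descriptor plus raw bytes.
request = channel.get()
raw = channel.get()
desc = request.descriptors[0]
restored = np.frombuffer(raw, dtype=desc.dtype).reshape(desc.dimensions)
assert np.array_equal(tensor, restored)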