Avoid unnecessary flushing during unary response processing #1373
Conversation
This reduces the number of packets sent when responding to a unary RPC from 3 to 1, which improves latency, presumably due to lower syscall and packet overhead. See grpc#1256
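To illustrate the mechanism, here is a minimal sketch in plain Go (not grpc-go's transport code; the respond helper and the use of net.Pipe are just for demonstration): buffering the header, message, and trailer writes and flushing once replaces three write syscalls, and typically three packets, with one.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net"
)

// respond batches three logical writes behind one bufio.Writer so a
// single Flush produces a single write syscall (and usually a single
// packet), instead of one flush per piece.
func respond(conn net.Conn, header, payload, trailer []byte) error {
	w := bufio.NewWriter(conn)
	for _, b := range [][]byte{header, payload, trailer} {
		if _, err := w.Write(b); err != nil { // buffered only; no syscall yet
			return err
		}
	}
	return w.Flush() // one write to the socket
}

func main() {
	client, server := net.Pipe()
	go func() {
		defer server.Close()
		respond(server, []byte("HEADERS "), []byte("DATA "), []byte("TRAILERS"))
	}()
	out, _ := io.ReadAll(client)
	fmt.Printf("%s\n", out)
}
```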
Thanks for the optimization! Note that @MakMukhi is also working on a slightly different change to make the batching work correctly (there's a subtle bug preventing it for now). I have one minor nit/discussion point in the detailed comments.
```diff
@@ -800,7 +800,7 @@ func (s *Server) processUnaryRPC(t transport.ServerTransport, stream *transport.
 	}
 	opts := &transport.Options{
 		Last:  true,
-		Delay: false,
+		Delay: true,
```
IMO "Delay" here should be interpreted by sendResponse as applying to the response as a whole -- header, payload, and trailer -- not each piece independently.
So, I believe this actually should still be "false", and it shouldn't be factored into the logic in sendResponse until the final step (where looking for other writers before flushing).
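A hypothetical sketch of that interpretation (the shape of sendResponse and the otherWritersPending helper are illustrative stand-ins, not grpc-go's actual internals): Delay is ignored while the pieces are written and only consulted at the final flush decision.

```go
package sketch

import "bufio"

// Hypothetical sketch: delay applies to the response as a whole.
// Individual pieces are always buffered without flushing; delay (plus
// whether other writers are pending) only matters at the final step.
func sendResponse(w *bufio.Writer, header, payload, trailer []byte, delay bool) error {
	for _, b := range [][]byte{header, payload, trailer} {
		if _, err := w.Write(b); err != nil { // never flush per piece
			return err
		}
	}
	// Final step: skip the flush only if the caller asked to delay and
	// another writer is about to flush anyway.
	if delay && otherWritersPending() {
		return nil
	}
	return w.Flush()
}

// otherWritersPending is a stand-in for the transport's writer
// accounting (e.g. what adjustNumWriters tracks in grpc-go).
func otherWritersPending() bool { return false }
```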
I agree. It seems like the calls to adjustNumWriter and the associated flushing are at too low a level. For unary RPCs, the number of writers should be incremented before calling sendResponse and decremented after the WriteStatus calls. The flushing logic currently in sendResponse is a bit difficult to reason about given the interaction with flow control, and it isn't obvious to me how to pull it up to a higher level. Perhaps this is the subtle bug you're referring to.
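For what it's worth, a rough compile-clean sketch of that hoisting (the serverTransport interface, the AdjustNumWriters return value, and replyUnary are all hypothetical stand-ins, not grpc-go's actual internals):

```go
package sketch

// Hypothetical, trimmed-down stand-in for grpc-go's internal transport,
// just to show the shape of the hoisted accounting.
type serverTransport interface {
	AdjustNumWriters(delta int) int // hypothetical: returns the new writer count
	Flush() error                   // hypothetical: flush the shared framer
}

// replyUnary registers as a writer once for the whole unary response:
// sendResponse and writeStatus only buffer frames, and the single
// flush happens when the last writer leaves.
func replyUnary(t serverTransport, sendResponse, writeStatus func() error) error {
	t.AdjustNumWriters(+1)
	defer func() {
		if t.AdjustNumWriters(-1) == 0 {
			t.Flush() // last writer out flushes the batched frames
		}
	}()
	if err := sendResponse(); err != nil {
		return err
	}
	return writeStatus()
}
```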
I'm happy to leave this work in your hands. Consider this PR more of an indication of the importance of this work: the unnecessary flushes introduce a significant amount of latency!
```diff
@@ -893,11 +898,11 @@ func (t *http2Server) Write(s *Stream, data []byte, opts *Options) (err error) {
 		// Reset ping strikes when sending data since this might cause
 		// the peer to send ping.
 		atomic.StoreUint32(&t.resetPingStrikes, 1)
-		if err := t.framer.writeData(forceFlush, s.id, false, p); err != nil {
+		if err := t.framer.writeData(forceFlush && !opts.Delay, s.id, false, p); err != nil {
```
I might be wrong, but in the scenario where the client's inbound flow-control window is smaller than the server's response message, IIUC this change could cause a deadlock: the server runs out of flow-control window while the client has not yet received anything.
This looks similar to the change in #973, which ran into that issue.
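To make the failure mode concrete, here is a toy simulation (made-up sizes; not gRPC code): unflushed bytes still consume the sender's quota, but the receiver can only return quota, via WINDOW_UPDATE, for bytes it has actually received, so deferring every flush until the whole message is buffered wedges as soon as the message exceeds the window.

```go
// Toy model of the flow-control deadlock.
// The sender may have at most `window` unacknowledged bytes outstanding,
// and the receiver only returns quota for bytes that were flushed to it.
package main

import "fmt"

func main() {
	const window = 64 * 1024   // client's inbound flow-control window
	const msgSize = 100 * 1024 // server's response message size

	buffered := 0 // written to the framer, NOT yet flushed to the wire
	sent := 0     // actually flushed; only these bytes can earn WINDOW_UPDATEs

	for sent+buffered < msgSize {
		quota := window - buffered - sent // unflushed bytes still consume quota
		if quota <= 0 {
			fmt.Println("deadlock: quota exhausted, nothing flushed,",
				"so the client can never send a WINDOW_UPDATE")
			return
		}
		n := msgSize - sent - buffered
		if n > quota {
			n = quota
		}
		buffered += n
		// If Delay suppresses per-frame flushing, we never execute
		//   sent += buffered; buffered = 0 // flush; client returns quota
		// before re-checking quota, so the second iteration wedges.
	}
	fmt.Println("message fully buffered:", sent+buffered, "bytes")
}
```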
@apolcyn I agree that this could lead to deadlock. Perhaps it would be sufficient to flush the buffered data whenever Write needs to loop. Also, I'm curious what the forceFlush logic is doing, given that there is another check a few lines below that flushes the data if we were the last writer.
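Something like the following shape, perhaps (hypothetical names and signatures; the real Write loop is considerably more involved): batching stays on the happy path, but buffered frames are pushed out before blocking on flow control, so the peer can actually receive data and return quota.

```go
package sketch

// Hypothetical shape of "flush whenever Write needs to loop".
// waitQuota is assumed to block until some send quota is available and
// then return how many bytes may be sent.
func writeLoop(data []byte, waitQuota func() int, writeFrame func([]byte), flush func()) {
	first := true
	for len(data) > 0 {
		if !first {
			// Write needs to loop, i.e. we may block on flow control:
			// flush what we've buffered so the peer can return quota.
			flush()
		}
		first = false
		n := waitQuota() // blocks until quota is available
		if n > len(data) {
			n = len(data)
		}
		writeFrame(data[:n]) // buffered; flushed by a later iteration or the caller
		data = data[n:]
	}
}
```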
Hey,
@MakMukhi Yep, go for it.
Cool, closing this PR for now then. Will have something out possibly by end of next week.