fix(storage): fix stream termination in MRD. #11432
shubham-diwakar commented Jan 10, 2025 (edited)
- Call CloseSend() before releasing resources.
- Drain inbound responses from the stream.
storage/grpc_client.go
Outdated
if err := mr.stream.CloseSend(); err != nil {
	return err
}
Should we also drain the stream after the CloseSend() (as we do in drainInboundStream(), receiving from the stream until we get a non-nil error) to make sure its resources are released? See grpc/grpc-go@365770f
Thanks for the best practice, applied this.
Using stream.Recv() only to determine the error.
Actually, I am not sure we will get any output here if we call stream.Recv() after CloseSend(). What if all the responses were already consumed by the streamReceiver goroutine?
There are, however, some cases where we close the stream while requests are still queued; I have drained the responses there.
Let me know your thoughts.
I think Recv() continually returns the err (io.EOF or something else) once the stream is done? This should be easy to check with a toy program.
However, multiple concurrent calls to Recv() are not allowed, so if streamReceiver goroutine may be calling Recv(), you have to be careful not to call Recv() elsewhere until streamReceiver is done. I think it's probably easiest for all Recv() calls to live on that one goroutine.
It's maybe easiest to call CloseSend() on the same goroutine which calls Send()? Then you have one goroutine for Send/CloseSend, and another for Recv, and user code can cancel the context then call Close() to trigger a cancellation.
Thanks, Chris, for the suggestion.
That simplifies the structure too.
Now one goroutine makes all the Recv() calls and another makes all the Send()/CloseSend() calls.
OK, I added some integration tests for context cancellation and abrupt close. For an abrupt close we just close the client-server connection. In the normal scenario we close the client-server connection, drain responses if we have an active range, and then tear down resources, which I believe should stop the cancelled errors from being returned on the server side.
mr.closeManager <- true
mr.closeReceiver <- true
These seem to be buffered channels, so it's not guaranteed that the receivers will have read the message by the time execution resumes. Even if these were unbuffered channels, the sender would be unblocked as soon as the receiver gets the message. The sender and receiver would then race to lock the mutex, and the sender could win. This could result in the context being cancelled before the stream is drained.
Made sure that we receive an EOF error or any permanent error from the server on the inbound stream (Recv()), and on the outbound stream made sure to call CloseSend() before calling cancel(). Seeing OK responses now on the server-side dashboards. Ref: https://screenshot.googleplex.com/8qAqijdXnohNeDi.
Your PR has conflicts that you need to resolve before merge-on-green can automerge.
This required some modifications on merge with the commit that changed callback semantics to bytes written.
This reverts commit 3d4e62f.