-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
codec/proto: reuse of marshal byte buffers #3167
Conversation
13e18cb
to
506dba6
Compare
Well this turned out to be much more complicated than I originally thought:
Previously, I returned the buffer back to the codec layer as soon as the transport layer (read: loopy) was done with the buffer, but that's not a good with retries because the buffered slice of retry funcs will retain references to the payload, which may actually be the plain encoded data if no compression is used. Therefore, we may return the buffer back to the encoder only after the entire RPC is concluded, so we need to store all the buffers to return. |
506dba6
to
b1a96e0
Compare
b1a96e0
to
853ea8c
Compare
853ea8c
to
3c03235
Compare
611e345
to
6b06f6e
Compare
9839ca3
to
47a9877
Compare
Performance benchmarks can be found below. Obviously, a 8 KiB request/response is tailored to showcase this improvement as this is where codec buffer reuse shines, but I've run other benchmarks too (like 1-byte requests and responses) and there's no discernable impact on performance. We do not allow reuse of buffers when stat handlers or binlogs are turned on. This is because those two may need access to the data and payload even after the data has been written to the wire. In such cases, we never return the data back to the pool. A buffer reuse threshold of 1 KiB was determined after several experiments. There's diminished returns when buffer reuse is enabled for smaller messages (actually, a negative impact). unary-networkMode_none-bufConn_false-keepalive_false-benchTime_40s-trace_false-latency_0s-kbps_0-MTU_0-maxConcurrentCalls_6-reqSize_8192B-respSize_8192B-compressor_off-channelz_false-preloader_false Title Before After Percentage TotalOps 839638 906223 7.93% SendOps 0 0 NaN% RecvOps 0 0 NaN% Bytes/op 103788.29 80592.47 -22.35% Allocs/op 183.33 189.30 3.27% ReqT/op 1375662899.20 1484755763.20 7.93% RespT/op 1375662899.20 1484755763.20 7.93% 50th-Lat 238.746µs 225.019µs -5.75% 90th-Lat 514.253µs 456.439µs -11.24% 99th-Lat 711.083µs 702.466µs -1.21% Avg-Lat 285.45µs 264.456µs -7.35%
47a9877
to
10ad7f1
Compare
8d620dd
to
ea18dad
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few more comments otherwise LGTM. Note that vet is failing (go linter) because of some comments.
@dfawley thanks for the review, I've addressed all comments. CI should pass this time. PTAL; if you'd like to merge this in the future, let me know and I'll squash everything into two logical commits -- one for the leaktest change, one for the actual reuse change. |
FYI: travis is showing a data race that looks like it's caused by this PR. Can you take a look? |
5b7f6ec
to
3f387e8
Compare
I literally can't reproduce this locally. I've run the failing test 8000 times (yes, 8000) with the race detector turned on and every single one test passed. And now Travis is green too. I don't want to say it's a memory corruption bug, but... |
travis? 🤷♂️ |
Using a |
This reverts commit 6426751.
Performance benchmarks can be found below. Obviously, a 8 KiB
request/response is tailored to showcase this improvement as this is
where codec buffer reuse shines, but I've run other benchmarks too (like
1-byte requests and responses) and there's no discernable impact on
performance.
We do not allow reuse of buffers when stat handlers or binlogs are
turned on. This is because those two may need access to the data and
payload even after the data has been written to the wire. In such cases,
we never return the data back to the pool.
A buffer reuse threshold of 1 KiB was determined after several
experiments. There's diminished returns when buffer reuse is enabled for
smaller messages (actually, a negative impact).
Closes #2816