run Layer::get_value_reconstruct_data in spawn_blocking #4498
Conversation
not ideal but an easy fix
I would prefer to merge and observe the performance difference in prod/staging at some point.
1016 tests run: 974 passed, 0 failed, 42 skipped (full report). The comment gets automatically updated with the latest test results. 9afb589 at 2023-06-23T15:00:04.219Z
Looks good to me. Minor nits. Note there is a run_benchmarks label, so you can run benchmarks before merging; I haven't used it, but some people have. Two comments point to possible optimizations, feel free to close them (we can get back to them later if needed).
…ne-get/get-value-reconstruct-data-spawn-blocking
This PR concludes the "async Layer::get_value_reconstruct_data" project.

The problem we're solving is that, before this patch, we'd execute Layer::get_value_reconstruct_data on the tokio executor threads. This function is IO- and/or CPU-intensive. The IO goes through VirtualFile / std::fs; hence it's blocking. This results in unfairness towards other tokio tasks, especially under (disk) load.

Some context can be found at #4154, where I suspect (but can't prove) that load spikes from logical size calculation cause heavy eviction skew. Sadly, we don't have tokio runtime/scheduler metrics to quantify the unfairness. But generally, we know that blocking the executor threads on std::fs IO is bad. So, let's make this change and watch out for severe perf regressions in staging & during rollout.
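For illustration only, here is a hypothetical snippet (not the pageserver's actual code) of the kind of pattern the patch removes: a synchronous std::fs read issued from an async fn holds the tokio worker thread for the full duration of the IO whenever the future is polled on the runtime.

```rust
use std::io::Read;

// Hypothetical illustration of the anti-pattern this PR removes: when this
// future is polled on the tokio runtime, the std::fs read occupies the worker
// thread until the filesystem call returns, stalling other tasks queued there.
async fn read_on_executor(path: std::path::PathBuf) -> std::io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    std::fs::File::open(path)?.read_to_end(&mut buf)?; // blocks the executor thread
    Ok(buf)
}
```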
Changes
- Rename Layer::get_value_reconstruct_data to Layer::get_value_reconstruct_data_blocking.
- Layer::get_value_reconstruct_data is now an async_trait method that runs get_value_reconstruct_data_blocking inside spawn_blocking (a sketch of this wrapper follows the list).
- spawn_blocking requires a 'static lifetime for the captured variables; hence I had to change the data flow to move the ValueReconstructState into and back out of get_value_reconstruct_data instead of passing a reference. It's a small struct, so I don't expect a big performance penalty.
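The following is a minimal sketch of that wrapper pattern, not the actual pageserver code: SomeLayer, the struct fields, and the dummy record are invented for illustration, the real method lives on an async_trait trait but is simplified here to an inherent async fn, and tokio (rt-multi-thread, macros) plus anyhow are assumed as dependencies. What it shows is the ownership dance described in the last bullet: the state is moved into the spawn_blocking closure and handed back out through the return value.

```rust
use std::sync::Arc;
use tokio::task::spawn_blocking;

// Hypothetical stand-in; the real ValueReconstructState carries WAL records etc.
#[derive(Default, Debug)]
struct ValueReconstructState {
    records: Vec<Vec<u8>>,
}

struct SomeLayer;

impl SomeLayer {
    // The pre-existing synchronous implementation: does blocking VirtualFile / std::fs IO.
    fn get_value_reconstruct_data_blocking(
        &self,
        mut state: ValueReconstructState,
    ) -> anyhow::Result<ValueReconstructState> {
        // ...blocking reads would happen here; a dummy record stands in for them...
        state.records.push(b"demo".to_vec());
        Ok(state)
    }

    // The async wrapper: spawn_blocking needs a 'static closure, so both the
    // layer handle (Arc) and the state are moved in, and the state is returned
    // back out instead of being mutated through a reference.
    async fn get_value_reconstruct_data(
        self: Arc<Self>,
        state: ValueReconstructState,
    ) -> anyhow::Result<ValueReconstructState> {
        spawn_blocking(move || self.get_value_reconstruct_data_blocking(state)).await?
    }
}

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let layer = Arc::new(SomeLayer);
    let state = layer
        .get_value_reconstruct_data(ValueReconstructState::default())
        .await?;
    println!("reconstructed {} record(s)", state.records.len());
    Ok(())
}
```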
Performance
Fundamentally, the code changes cause the following performance-relevant changes:
- Each get_value_reconstruct_data call now makes a short-lived allocation, because async_trait is just sugar for boxed futures under the hood.
- spawn_blocking adds some latency because it needs to move the work to a thread pool.
- spawn_blocking plus the existing synchronous code inside is probably more efficient than switching all of the synchronous code to tokio::fs, because each tokio::fs call does spawn_blocking under the hood (see the sketch after this list).
- The spawn_blocking thread pool is much larger than the async executor thread pool. Hence, as long as the disks can keep up, which they should according to AWS specs, we will be able to deliver higher get_value_reconstruct_data throughput.

Slightly higher latency under regular load is acceptable given the throughput gains and the expected better fairness during disk load peaks, such as the logical size calculation peaks uncovered in #4154.
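To make the tokio::fs point concrete, here is a hedged, hypothetical comparison (not pageserver code; the function names and the batching are invented for illustration): one spawn_blocking hop around a batch of synchronous std::fs reads versus per-file tokio::fs calls, each of which is internally dispatched to the blocking pool.

```rust
use std::io::Read;
use std::path::PathBuf;

// Approach A: a single spawn_blocking call wrapping a batch of synchronous
// std::fs reads -- one executor-to-blocking-pool hop for the whole batch.
async fn read_batch_one_hop(paths: Vec<PathBuf>) -> std::io::Result<Vec<Vec<u8>>> {
    tokio::task::spawn_blocking(move || {
        paths
            .iter()
            .map(|p| {
                let mut buf = Vec::new();
                std::fs::File::open(p)?.read_to_end(&mut buf)?;
                Ok(buf)
            })
            .collect::<std::io::Result<Vec<Vec<u8>>>>()
    })
    .await
    .expect("blocking task panicked")
}

// Approach B: tokio::fs per file. Each tokio::fs operation is itself backed by
// spawn_blocking, so this pays the hop (and its latency) once per call.
async fn read_batch_tokio_fs(paths: Vec<PathBuf>) -> std::io::Result<Vec<Vec<u8>>> {
    let mut out = Vec::with_capacity(paths.len());
    for p in paths {
        out.push(tokio::fs::read(p).await?);
    }
    Ok(out)
}
```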
Full Stack Of Preliminary PRs
This PR builds on top of the following preliminary PRs:
- Tenant::state.send_modify … #4291, which I thought we'd need in my original plan, where we would have needed to convert Tenant::timelines into an async locking primitive (make Tenant::timelines a tokio::sync::RwLock, #4333). In reviews, we walked away from that, but these cleanups were still quite useful.
- … Send by using tokio::sync::Mutex internally, #4477
- … try_write, #4485