-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add no_sync write_lp param for fast writes #25902
Conversation
self.wal.write_ops(ops).await?; | ||
if no_sync { | ||
let wal = Arc::clone(&self.wal); | ||
tokio::spawn(async move { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't require a spawned task here. There's already a function on the WAL called buffer_op_unconfirmed
(see https://github.com/influxdata/influxdb/blob/main/influxdb3_wal/src/lib.rs#L70). Just update this to instead take a vec and you'll be able to do this without a spawned task. It puts it into a mutex'd vec that gets used when the flush happens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For additional context, this was described in the original issue as well: #25597 (comment)
b1dfdc3
to
9645fd9
Compare
b6b35e2
to
8324f56
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one question, otherwise LGTM
influxdb3_wal/src/object_store.rs
Outdated
@@ -905,419 +904,6 @@ mod tests { | |||
use std::any::Any; | |||
use tokio::sync::oneshot::Receiver; | |||
|
|||
#[tokio::test(flavor = "multi_thread", worker_threads = 2)] | |||
async fn write_flush_delete_and_load() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why remove this test? If covered elsewhere, probably better to have it gone as it's likely to be brittle given it's down in the weeds of creating the contents manually. Would definitely be better to have a refactored one that uses test helpers that build LP into batches, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't intentionally remove this test. In fact I don't even remember doing so. I'll put that back in. It should not have been removed
In cases where a user does not need the guarantees that data is persisted to the WAL on write and needs faster ingest speed then the no_sync param added in this commit are what they need. Rather than waiting on a sync to the WAL we write to the buffer without confirming that writes have made it to the WAL. The upside to this is that they can ingest data faster, but there is a small risk that between writing the data and it eventually being written to object storage, that the server crashes and it's irrevocably lost. Also if the write to the WAL fails, then at most the user will not get a failed response code they can handle. The data will still be in the buffer, but will not be durable until persisted as a parquet file in this case. However, in many cases that might be acceptable. This commit expands on what's possible so that the user can use InfluxDB Core the way that works best for their workload. Note that this feature is only added for the /api/v3/write_lp endpoint. The legacy endpoints for writing can take the parameter, but won't do anything with it at all. Closes #25597
8324f56
to
32f027a
Compare
In cases where a user does not need the guarantees that data is persisted to the WAL on write and needs faster ingest speed then the no_sync param added in this commit are what they need. Rather than waiting on a sync to the WAL a task to do so is spawned and the code continues executing to return a successful HTTP code to the user.
The upside to this is that they can ingest data faster, but there is a small risk that between writing the data and it eventually being written to object storage, that the server crashes and it's irrevocably lost. Also if the write to the WAL fails, then at most the user will get an error printed in the logs rather than a failed response code they can handle. The data will still be in the buffer, but will not be durable until persisted as a parquet file in this case.
However, in many cases that might be acceptable. This commit expands on what's possible so that the user can use InfluxDB Core the way that works best for their workload.
Note that this feature is only added for the /api/v3/write_lp endpoint. The legacy endpoints for writing can take the parameter, but won't do anything with it at all.
Closes #25597