Add write API endpoint to write data unsync'd #25597
Comments
To add a bit to this, I think a lot of the "plumbing" is set up to do this already. Right now, when we handle an incoming write, we use the code at influxdb/influxdb3_write/src/write_buffer/mod.rs (lines 251 to 264 in b7fd8e2).
That is what waits for WAL durability before responding to the caller. I think the key to this is to use the code at influxdb/influxdb3_wal/src/lib.rs (lines 63 to 64 in b7fd8e2).
That could be called first with the [...]
FWIW, I think a parameter on the existing write API would work well.
For cases where a user does not need the guarantee that data is persisted to the WAL on write and needs faster ingest, the no_sync param added in this commit is what they need. Rather than waiting on a sync to the WAL, a task to do so is spawned and the code continues executing to return a successful HTTP code to the user. The upside is that they can ingest data faster, but there is a small risk that the server crashes between the write being accepted and the data eventually being written to object storage, in which case the data is irrevocably lost. Also, if the write to the WAL fails, the user will at most get an error printed in the logs rather than a failed response code they can handle. The data will still be in the buffer, but in this case it will not be durable until it is persisted as a parquet file. However, in many cases that might be acceptable. This commit expands on what's possible so that the user can use InfluxDB Core in the way that works best for their workload. Note that this feature is only added for the /api/v3/write_lp endpoint. The legacy write endpoints accept the parameter but do nothing with it. Closes #25597
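As a rough client-side illustration of the parameter described above, the sketch below builds a request URL for the /api/v3/write_lp endpoint. Only the endpoint path and the no_sync parameter name come from the commit message; the `write_lp_url` helper, the `db` query parameter, and the `true` value are assumptions made for the example, not a confirmed description of the API.

```rust
/// Hypothetical helper: build a URL for the `/api/v3/write_lp` endpoint.
/// Assumptions (not confirmed by the commit message): the database is passed
/// as a `db` query parameter and the flag is spelled `no_sync=true`.
/// The database name is not URL-encoded here, to keep the sketch short.
fn write_lp_url(base: &str, db: &str, no_sync: bool) -> String {
    let mut url = format!(
        "{}/api/v3/write_lp?db={}",
        base.trim_end_matches('/'),
        db
    );
    if no_sync {
        // Ask the server to respond without waiting for WAL durability.
        url.push_str("&no_sync=true");
    }
    url
}
```

A caller would POST line protocol to the returned URL. With the flag set, a successful response would only mean the write was accepted into the buffer, not that it is durable.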
For cases where a user does not need the guarantee that data is persisted to the WAL on write and needs faster ingest, the no_sync param added in this commit is what they need. Rather than waiting on a sync to the WAL, we write to the buffer without confirming that writes have made it to the WAL. The upside is that they can ingest data faster, but there is a small risk that the server crashes between the write being accepted and the data eventually being written to object storage, in which case the data is irrevocably lost. Also, if the write to the WAL fails, the user will not get a failed response code they can handle. The data will still be in the buffer, but in this case it will not be durable until it is persisted as a parquet file. However, in many cases that might be acceptable. This commit expands on what's possible so that the user can use InfluxDB Core in the way that works best for their workload. Note that this feature is only added for the /api/v3/write_lp endpoint. The legacy write endpoints accept the parameter but do nothing with it. Closes #25597
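The trade-off described above can be sketched with a toy model. This is not the InfluxDB implementation: the `Wal` type, its methods, and this `write_lp` function are hypothetical stand-ins built only on the Rust standard library, with a channel playing the role of the durability notification.

```rust
use std::sync::mpsc;
use std::sync::Mutex;

/// Hypothetical stand-in for the WAL: ops are queued in memory and only
/// become "durable" when `flush` runs.
#[derive(Default)]
struct Wal {
    pending: Mutex<Vec<(String, Option<mpsc::Sender<()>>)>>,
    durable: Mutex<Vec<String>>,
}

impl Wal {
    /// Queue an op and return a receiver that fires once the op is flushed.
    fn buffer_and_notify(&self, op: &str) -> mpsc::Receiver<()> {
        let (tx, rx) = mpsc::channel();
        self.pending.lock().unwrap().push((op.to_string(), Some(tx)));
        rx
    }

    /// Queue an op with no way for the caller to observe durability.
    fn buffer_only(&self, op: &str) {
        self.pending.lock().unwrap().push((op.to_string(), None));
    }

    /// Simulated flush: move pending ops to durable storage, notify waiters.
    fn flush(&self) {
        let drained: Vec<_> = self.pending.lock().unwrap().drain(..).collect();
        for (op, notify) in drained {
            self.durable.lock().unwrap().push(op);
            if let Some(tx) = notify {
                let _ = tx.send(()); // the waiter may already be gone
            }
        }
    }
}

/// Validate the input, buffer it, and (unless `no_sync` is set) block until
/// a flush confirms durability.
fn write_lp(wal: &Wal, lp: &str, no_sync: bool) -> Result<(), String> {
    // Validation happens in both modes (a trivial check in this sketch).
    if lp.trim().is_empty() {
        return Err("empty line protocol".to_string());
    }
    if no_sync {
        // Return immediately; a later flush failure cannot reach the caller.
        wal.buffer_only(lp);
        return Ok(());
    }
    let rx = wal.buffer_and_notify(lp);
    rx.recv().map_err(|_| "WAL flush failed".to_string())
}
```

In this model a `no_sync` write returns `Ok` while the op is still in `pending`, which mirrors the window in which a crash would lose data. A synchronous write only returns after another thread has called `flush`.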
We should have a new write endpoint (or perhaps a parameter on the existing one?) that accepts writes, puts them into the WAL buffer, and returns success to the client before the WAL flush happens. It should still validate the write; it just won't wait for the WAL file to be persisted to object storage before returning.