Add write API endpoint to write data unsync'd #25597

Closed
pauldix opened this issue Nov 28, 2024 · 1 comment · Fixed by #25902

@pauldix
Member

pauldix commented Nov 28, 2024

We should have a new write endpoint (or perhaps a parameter on the existing one?) that accepts writes, puts them into the WAL buffer, and returns success to the client before the WAL flush happens. It should still validate the data; it just won't wait for the WAL file to be persisted to object storage before returning.

@pauldix pauldix added the v3 label Nov 28, 2024
@hiltontj
Contributor

To add a bit to this, I think a lot of the "plumbing" is set up to do this already.

Right now, when we handle an incoming write, we use the write_lp method on the write buffer, which validates the write into a WalOp::Write (and a WalOp::Catalog if necessary), then send it to the WAL with the Wal::write_ops method:

// if there were catalog updates, ensure they get persisted to the wal, so they're
// replayed on restart
let mut ops = Vec::with_capacity(2);
if let Some(catalog_batch) = result.catalog_updates {
    ops.push(WalOp::Catalog(catalog_batch));
}
ops.push(WalOp::Write(result.valid_data));
// write to the wal. Behind the scenes the ops get buffered in memory and once a second (or
// whatever the configured wal flush interval is set to) the buffer is flushed and all the
// data is persisted into a single wal file in the configured object store. Then the
// contents are sent to the configured notifier, which in this case is the queryable buffer.
// Thus, after this returns, the data is both durable and queryable.
self.wal.write_ops(ops).await?;

That write_ops call is what waits for WAL durability before responding to the caller.

I think the key to this is to use the Wal trait's buffer_op_unconfirmed method:

/// Buffer into a single larger operation in memory. Returns before the operation is persisted.
async fn buffer_op_unconfirmed(&self, op: WalOp) -> Result<(), Error>;

That could be called first with the WalOp::Catalogs, if there are any, then the WalOp::Writes to accomplish what is described above before returning a response.
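
As a rough illustration of that ordering, here is a minimal synchronous sketch. Everything in it is a hypothetical stand-in: `ToyWal`, `ValidatedWrite`, and `write_no_sync` are made up for the example, the `WalOp` variants carry plain strings instead of the real batch types, and the real `buffer_op_unconfirmed` is async and returns a `Result`. It only shows the catalog op being buffered before the write op, with nothing persisted until a later flush.

```rust
use std::sync::Mutex;

// Hypothetical stand-in: the real WalOp variants carry catalog and write
// batches, not strings.
#[derive(Debug, Clone, PartialEq)]
enum WalOp {
    Catalog(String),
    Write(String),
}

// Toy in-memory WAL. `buffered` holds ops awaiting a flush, `persisted`
// holds ops that a flush has already moved out of the buffer.
#[derive(Default)]
struct ToyWal {
    buffered: Mutex<Vec<WalOp>>,
    persisted: Mutex<Vec<WalOp>>,
}

impl ToyWal {
    // Mirrors the idea of buffer_op_unconfirmed: enqueue and return at once,
    // without waiting for the op to be persisted.
    fn buffer_op_unconfirmed(&self, op: WalOp) {
        self.buffered.lock().unwrap().push(op);
    }

    // Stands in for the periodic WAL flush; returns how many ops it moved.
    fn flush(&self) -> usize {
        let mut buffered = self.buffered.lock().unwrap();
        let count = buffered.len();
        self.persisted.lock().unwrap().extend(buffered.drain(..));
        count
    }
}

// Hypothetical result of validation, shaped loosely after the snippet above.
struct ValidatedWrite {
    catalog_updates: Option<String>,
    valid_data: String,
}

// Buffer the catalog op first (if any), then the write op, then return.
fn write_no_sync(wal: &ToyWal, result: ValidatedWrite) {
    if let Some(catalog_batch) = result.catalog_updates {
        wal.buffer_op_unconfirmed(WalOp::Catalog(catalog_batch));
    }
    wal.buffer_op_unconfirmed(WalOp::Write(result.valid_data));
}
```

After `write_no_sync` returns, both ops sit in `buffered` and `persisted` is still empty; only a later `flush` moves them, which is the window in which a crash would lose the data.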


FWIW, I think a parameter on the existing write API would work well.

@pauldix pauldix assigned mgattozzi and unassigned mgattozzi Dec 4, 2024
mgattozzi added a commit that referenced this issue Jan 22, 2025
If a user does not need the guarantee that data is persisted to the WAL
on write and needs faster ingest, the no_sync param added in this commit
is what they need. Rather than waiting on a sync to the WAL, a task to do
so is spawned and the code continues executing to return a successful
HTTP code to the user.

The upside is faster ingest, but there is a small risk that the server
crashes between accepting the data and writing it to object storage, in
which case the data is irrevocably lost. Also, if the write to the WAL
fails, the user will at most see an error printed in the logs rather
than a failed response code they can handle. The data will still be in
the buffer, but will not be durable until persisted as a parquet file in
this case.

However, in many cases that might be acceptable. This commit expands on
what's possible so that the user can use InfluxDB Core the way that
works best for their workload.

Note that this feature is only added for the /api/v3/write_lp endpoint.
The legacy endpoints for writing can take the parameter, but won't do
anything with it at all.

Closes #25597
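
The "spawn a task and return" behaviour described in this commit message can be sketched roughly as follows. This is not the commit's code: it uses a plain `std::thread` instead of an async task, and `persist_to_wal`, `handle_write_no_sync`, and the 204 status are assumptions made for illustration. The point is only that the caller gets its success code before the persist outcome is known, so a failure can surface in the logs and nowhere else.

```rust
use std::thread::{self, JoinHandle};

// Hypothetical stand-in for persisting a WAL file to object storage.
// It fails on empty input purely so the failure path can be exercised.
fn persist_to_wal(data: &str) -> Result<(), String> {
    if data.is_empty() {
        Err("nothing to persist".to_string())
    } else {
        Ok(())
    }
}

// Returns an HTTP-style status immediately; the persist runs in the
// background. The JoinHandle is returned only so the example can observe
// the outcome; a real handler would not hand it to the client.
fn handle_write_no_sync(data: String) -> (u16, JoinHandle<bool>) {
    let handle = thread::spawn(move || match persist_to_wal(&data) {
        Ok(()) => true,
        Err(e) => {
            // The client already has its response, so the log is the
            // only place this error shows up.
            eprintln!("wal persist failed: {e}");
            false
        }
    });
    // 204 is an assumed success code for this sketch.
    (204, handle)
}
```

Calling `handle_write_no_sync(String::new())` still yields the success status even though the background persist fails, which is the trade-off the message describes.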
@mgattozzi mgattozzi self-assigned this Jan 23, 2025
mgattozzi added a commit that referenced this issue Jan 24, 2025
If a user does not need the guarantee that data is persisted to the WAL
on write and needs faster ingest, the no_sync param added in this commit
is what they need. Rather than waiting on a sync to the WAL, we write to
the buffer without confirming that writes have made it to the WAL.

The upside is faster ingest, but there is a small risk that the server
crashes between accepting the data and writing it to object storage, in
which case the data is irrevocably lost. Also, if the write to the WAL
fails, the user will not get a failed response code they can handle. The
data will still be in the buffer, but will not be durable until
persisted as a parquet file in this case.

However, in many cases that might be acceptable. This commit expands on
what's possible so that the user can use InfluxDB Core the way that
works best for their workload.

Note that this feature is only added for the /api/v3/write_lp endpoint.
The legacy endpoints for writing can take the parameter, but won't do
anything with it at all.

Closes #25597
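
For a client, opting in might look like the sketch below, which only builds the request target. It assumes `no_sync` is passed as a boolean query parameter spelled `no_sync=true` and that the database is selected with a `db` query parameter; neither detail is spelled out in this thread, and `write_lp_target` is a hypothetical helper. It also assumes the database name needs no percent-encoding.

```rust
// Hypothetical helper: builds the path and query string for a line
// protocol write. The `db` parameter and the `no_sync=true` spelling are
// assumptions for illustration, and `db` is not percent-encoded here.
fn write_lp_target(db: &str, no_sync: bool) -> String {
    let mut target = format!("/api/v3/write_lp?db={db}");
    if no_sync {
        // Ask the server to respond without waiting for the WAL sync.
        target.push_str("&no_sync=true");
    }
    target
}
```

Leaving the flag off keeps the default behaviour of waiting for WAL durability, so a caller can choose per request whether the faster, less durable path is acceptable.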