Allow write through to parquet cache #25887

Open
hiltontj opened this issue Jan 21, 2025 · 0 comments · May be fixed by #25904

Problem statement

The parquet cache can currently be populated in only one way: via a GET request to the object store. See:

/// A request to fetch an item at the given `path` from an object store
///
/// Contains a notifier to notify the caller that registers the cache request when the item
/// has been cached successfully (or if the cache request failed in some way)
#[derive(Debug)]
pub struct CacheRequest {
    path: Path,
    notifier: oneshot::Sender<()>,
}

This fetch-based method of populating the cache is still needed, but it looks inefficient when used from the write buffer.

Currently, during the snapshot process, once a parquet file is persisted, we submit a cache request, which fetches the parquet data that was just written to the object store. We should be able to cache the written bytes directly instead of making this additional request to the object store.

Proposed solution

Expand the CacheRequest type into an enum with variants to support:

  • Fetch-based cache request (what it does currently)
  • Write-through cache request

The latter would accept the written bytes, by a mechanism still to be determined, and insert them into the cache under the given object store path.
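
To make the proposal concrete, here is a minimal sketch of what the enum could look like. It is not the actual implementation: the variant names `Fetch` and `WriteThrough`, the `data` field, and the `ToyCache` type are all hypothetical, and standard-library stand-ins replace the real types (`String` for the object store `Path`, `Vec<u8>` for the bytes, `std::sync::mpsc::Sender` for `oneshot::Sender`) so the example compiles on its own.

```rust
use std::collections::HashMap;
use std::sync::mpsc::Sender;

/// Stand-in for the object store `Path` type (assumption: a plain string key).
type Path = String;

/// Hypothetical enum form of `CacheRequest` with the two proposed variants.
#[derive(Debug)]
pub enum CacheRequest {
    /// Fetch the item at `path` from the object store (the current behavior).
    Fetch { path: Path, notifier: Sender<()> },
    /// Insert bytes that were just written for `path` directly into the cache.
    WriteThrough {
        path: Path,
        data: Vec<u8>,
        notifier: Sender<()>,
    },
}

/// Toy in-memory cache, used only to illustrate how the variants differ.
#[derive(Debug, Default)]
pub struct ToyCache {
    entries: HashMap<Path, Vec<u8>>,
    /// Number of simulated GET requests made against the object store.
    pub store_gets: usize,
}

impl ToyCache {
    /// Handle one request; `object_store` is a map standing in for the real store.
    pub fn handle(&mut self, request: CacheRequest, object_store: &HashMap<Path, Vec<u8>>) {
        match request {
            CacheRequest::Fetch { path, notifier } => {
                // Fetch-based: costs one round trip to the object store.
                self.store_gets += 1;
                if let Some(data) = object_store.get(&path) {
                    self.entries.insert(path, data.clone());
                }
                // Notify the caller whether or not the item was found.
                let _ = notifier.send(());
            }
            CacheRequest::WriteThrough { path, data, notifier } => {
                // Write-through: the bytes are already in hand, so no GET is needed.
                self.entries.insert(path, data);
                let _ = notifier.send(());
            }
        }
    }

    /// Look up cached bytes for `path`.
    pub fn get(&self, path: &str) -> Option<&[u8]> {
        self.entries.get(path).map(Vec::as_slice)
    }
}
```

In this sketch, the snapshot path would submit `CacheRequest::WriteThrough` with the bytes it just persisted, while other callers keep using `CacheRequest::Fetch`. Both variants carry a notifier so callers can wait on either kind of request in the same way.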
