spanconfigccl: full translation is O(Databases * Descriptors) #90655

Open
ajwerner opened this issue Oct 25, 2022 · 6 comments
Labels
C-performance Perf of queries or internals. Solution not expected to change functional behavior. T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)

Comments

@ajwerner
Contributor

ajwerner commented Oct 25, 2022

Describe the problem

When we go to translate a database, we fetch all the descriptors. We have to do this because we have no more efficient way to find the descriptors in the database. Fundamentally, we need to discover dropped descriptors. Dropped descriptors do not have namespace entries.

Additional context
Relates to #26476 and #73277

Jira issue: CRDB-20872

Epic CRDB-24134

@ajwerner ajwerner added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Oct 25, 2022
@blathers-crl blathers-crl bot added the T-sql-schema-deprecated Use T-sql-foundations instead label Oct 25, 2022
@ajwerner ajwerner changed the title spanconfigccl: full transaction is O(Databases * Descriptors) spanconfigccl: full translation is O(Databases * Descriptors) Oct 25, 2022
@ajwerner
Contributor Author

ajwerner commented Nov 1, 2022

I think the first thing I'd do to make this cheaper is to have a way to find all of the descriptors for a database (including dropped ones) in catkv by peeking into the descriptor proto and skipping it if it is not part of the database we're interested in. This would still scan every descriptor, but it would skip the expensive unmarshaling and validation for the ones we don't care about.

@ajwerner
Contributor Author

ajwerner commented Nov 1, 2022

Another approach is to cache the ID->parent ID mapping so that we can skip decoding the bytes altogether.

@ajwerner
Contributor Author

ajwerner commented Nov 1, 2022

Another important note is that we wouldn't do this full translation very often if we consulted a checkpoint on resume. Right now any time the job restarts, we do a full translate. #73694

@postamar postamar added C-performance Perf of queries or internals. Solution not expected to change functional behavior. and removed C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. labels Nov 10, 2022
@knz
Contributor

knz commented Dec 19, 2022

how do you feel about a special/hidden "trash" database?

@ajwerner
Contributor Author

Conceptually I'm not opposed. We'd need a new naming scheme given the space for name collisions. There are details to sort out also regarding how it pertains to schemas. Fundamentally such an approach is fine, it's just a non-trivial project.

@ajwerner
Contributor Author

Also, for better or for worse (probably somewhat for better?) the zone configs of a dropped table continue to mirror those of the parent database when dropping just a table or index or whatnot. We may not want to break that.

@exalate-issue-sync exalate-issue-sync bot added T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) and removed T-sql-schema-deprecated Use T-sql-foundations instead labels May 10, 2023
3 participants