rfc: add SHOW RANGES statements to sql_split_syntax #14366
Reviewed 1 of 1 files at r1. docs/RFCS/sql_split_syntax.md, line 188 at r1 (raw file):
Comments from Reviewable |
1e46398
to
6aaae84
Compare
docs/RFCS/sql_split_syntax.md, line 188 at r1 (raw file): Previously, knz (kena) wrote…
Added FROM. Though perhaps Added notes on interleaved tables (the semantics are fairly straightforward). I'll make sure to add some tests with those in the implementation. Comments from Reviewable |
Review status: 0 of 1 files reviewed at latest revision, 2 unresolved discussions, all commit checks successful. docs/RFCS/sql_split_syntax.md, line 197 at r2 (raw file):
How do we feel about Comments from Reviewable |
Review status: 0 of 1 files reviewed at latest revision, 2 unresolved discussions, all commit checks successful. docs/RFCS/sql_split_syntax.md, line 197 at r2 (raw file): Previously, mjibson (Matt Jibson) wrote…
I think there is a lot more complexity to doing something like that. For example, there will need to be a bunch of custom code to avoid scanning ALL the ranges when we're only interested in a table. I'm not against it - it might be what we want in the long term - but it is more involved. The proposed If you want to work out the details of your proposal (either as a PR to this RFC or in a new RFC), I would have no problem retiring Comments from Reviewable |
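A minimal sketch of the concern above: answering a per-table query without scanning all the ranges. It assumes a toy model in which range descriptors are kept sorted by start key and do not overlap; `RangeDesc`, `ranges_overlapping`, and the single-letter keys are hypothetical names for illustration, not CockroachDB APIs or its real key encoding.

```python
from typing import List, NamedTuple


class RangeDesc(NamedTuple):
    range_id: int
    start_key: bytes  # inclusive
    end_key: bytes    # exclusive


def ranges_overlapping(ranges: List[RangeDesc], span_start: bytes,
                       span_end: bytes) -> List[RangeDesc]:
    """Return the ranges that overlap [span_start, span_end).

    Assumes `ranges` is sorted by start_key and the ranges do not overlap,
    so their end keys are sorted as well.
    """
    # Binary search for the first range whose end_key is past span_start.
    lo, hi = 0, len(ranges)
    while lo < hi:
        mid = (lo + hi) // 2
        if ranges[mid].end_key <= span_start:
            lo = mid + 1
        else:
            hi = mid
    # Walk forward only while ranges still start inside the span.
    result = []
    for r in ranges[lo:]:
        if r.start_key >= span_end:
            break
        result.append(r)
    return result


all_ranges = [
    RangeDesc(1, b"a", b"c"),
    RangeDesc(2, b"c", b"f"),
    RangeDesc(3, b"f", b"k"),
    RangeDesc(4, b"k", b"z"),
]
# Hypothetical table whose keys fall in [d, g): only ranges 2 and 3 are visited.
print([r.range_id for r in ranges_overlapping(all_ranges, b"d", b"g")])
```

A dedicated statement knows the table span up front and can seek like this directly; a generic table would need to recover the same span from a filter expression, which is the extra work being discussed.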
Review status: 0 of 1 files reviewed at latest revision, 2 unresolved discussions, all commit checks successful. docs/RFCS/sql_split_syntax.md, line 188 at r1 (raw file): Previously, RaduBerinde wrote…
Why are the Comments from Reviewable |
docs/RFCS/sql_split_syntax.md, line 188 at r1 (raw file): Previously, petermattis (Peter Mattis) wrote…
Hm, good question. The one case I can think of where it helps is when you refer to an index by name (without the table). That works today with Comments from Reviewable |
Review status: 0 of 1 files reviewed at latest revision, 3 unresolved discussions, all commit checks successful.
docs/RFCS/sql_split_syntax.md, line 188 at r1 (raw file): Previously, RaduBerinde wrote…
I could also imagine
docs/RFCS/sql_split_syntax.md, line 197 at r2 (raw file): Previously, RaduBerinde wrote…
If it's easy to do this as a virtual table instead of a custom
docs/RFCS/sql_split_syntax.md, line 202 at r2 (raw file):
Any chance these could be a tuple of the decoded values instead of a pretty-printed string?
Review status: 0 of 1 files reviewed at latest revision, 3 unresolved discussions, all commit checks successful. docs/RFCS/sql_split_syntax.md, line 188 at r1 (raw file): Previously, bdarnell (Ben Darnell) wrote…
Does Comments from Reviewable |
Review status: 0 of 1 files reviewed at latest revision, 3 unresolved discussions, all commit checks successful. docs/RFCS/sql_split_syntax.md, line 197 at r2 (raw file): Previously, bdarnell (Ben Darnell) wrote…
+1 more for virtual table over SHOW Comments from Reviewable |
Review status: 0 of 1 files reviewed at latest revision, 5 unresolved discussions, all commit checks successful.
docs/RFCS/sql_split_syntax.md, line 202 at r2 (raw file): Previously, bdarnell (Ben Darnell) wrote…
if we move to a virtual table, then I guess these would need to be a SQL array (instead of a tuple)?
docs/RFCS/sql_split_syntax.md, line 202 at r2 (raw file):
if I remember correctly, the pretty printing is best effort (it's done without a table descriptor and so not all types can be decoded correctly)
docs/RFCS/sql_split_syntax.md, line 206 at r2 (raw file):
NULL if none and/or unknown? I think we should discuss the implementation here too - is the output produced using caches or not? We were discussing having two flavors (with/without caches). What do you think?
Looks like most people agree that we want a system table. I think that's fine, but will require fleshing out more details and more implementation work which I fear may be a distraction at this point from #13665 and #13666 (which was the motivating factor behind this). My preference would be to go ahead with the SHOW RANGES implementation temporarily so I can progress on the distsql issues, and keep this PR open while we figure out the details of the system table. Comments from Reviewable |
Review status: 0 of 1 files reviewed at latest revision, 5 unresolved discussions, all commit checks successful.
docs/RFCS/sql_split_syntax.md, line 197 at r2 (raw file): Previously, danhhz (Daniel Harrison) wrote…
@bdarnell note that in general, to show the Lease Holder (which is important for distsql planning), we have to issue a We can also add special code that detects a table filter and restricts the range to that. But that brings another question around table name qualification. With SHOW RANGES, we can qualify a bare table name with the current database. Having it be part of a filter expression (where presumably the table is just a string) wouldn't allow that, so we would have to always specify the database and the table if we want to avoid the full range scan.
docs/RFCS/sql_split_syntax.md, line 202 at r2 (raw file): Previously, bdarnell (Ben Darnell) wrote…
Can there be cases where the split key doesn't correspond to a complete value? Technically it could be some key prefix which doesn't really map to a valid value, though I don't know if that ever happens in practice. The other question - what would the resulting schema be (assuming we switch to a system table)? Would these values be part of a single column as a tuple?
docs/RFCS/sql_split_syntax.md, line 206 at r2 (raw file): Previously, andreimatei (Andrei Matei) wrote…
The version I am proposing would be without caches. The only reason to have the "with cache" flavor is to debug the cache itself (right?) so including it in the syntax will be subject to bikeshedding. One thing to consider is how would this work with a system table? Would we have two system tables? (
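To illustrate the table name qualification point raised in this round: a statement can fill in the session's current database for a bare table name, while a plain string inside a filter expression carries no such context. The sketch below is a toy model only; `qualify_table_name` is a hypothetical helper and it ignores quoting and other details of real SQL name resolution.

```python
from typing import Tuple


def qualify_table_name(name: str, current_database: str) -> Tuple[str, str]:
    """Return (database, table), using current_database for a bare name."""
    parts = name.split(".")
    if len(parts) == 1:
        # Bare table name: qualify it with the session's current database.
        return (current_database, parts[0])
    if len(parts) == 2:
        # Already qualified as database.table.
        return (parts[0], parts[1])
    raise ValueError(f"unexpected table name: {name!r}")


print(qualify_table_name("kv", "test"))
print(qualify_table_name("bank.accounts", "test"))
```

With a filter on a string column there is no hook like this, so the caller would have to spell out both the database and the table to let the implementation narrow the scan.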
Review status: 0 of 1 files reviewed at latest revision, 5 unresolved discussions, all commit checks successful. docs/RFCS/sql_split_syntax.md, line 202 at r2 (raw file): Previously, RaduBerinde wrote…
@andreimatei do we support ARRAY table columns? (I thought we didn't?) Which would also be a problem for the replicas list. On the other hand, maybe virtual tables don't need to have the same restrictions as real tables. Comments from Reviewable |
Maybe we should call this
Review status: 0 of 1 files reviewed at latest revision, 5 unresolved discussions, all commit checks successful.
docs/RFCS/sql_split_syntax.md, line 197 at r2 (raw file): Previously, RaduBerinde wrote…
Ah, I didn't realize there was an RPC per range to get the lease; I thought it was just a scan of the meta ranges. Maybe there should be two tables, one for stuff in the range descriptors and one for lease info, and you could join them. But that's probably too much for now.
docs/RFCS/sql_split_syntax.md, line 202 at r2 (raw file):
Yeah, good point. It's fine for this to be a pretty-printed string; I was just thinking that if it's easy it might be nice to have the real values.
Ok - so I will plan to go ahead with a temporary
Review status: 0 of 1 files reviewed at latest revision, 5 unresolved discussions, all commit checks successful.
docs/RFCS/sql_split_syntax.md, line 197 at r2 (raw file): Previously, bdarnell (Ben Darnell) wrote…
The join thing is a nice idea. It will require improving the infrastructure for system tables to only populate rows needed by filters though.
This syntax is temporary and it will be replaced with a more versatile system table (see discussion in cockroachdb#14366). This is useful for implementing/testing `TESTING_RELOCATE` before we are able to flesh out all the details of the system table proposals.
#14390 is out for the temporary TESTING_RANGES.
Review status: 0 of 1 files reviewed at latest revision, 5 unresolved discussions, all commit checks successful. docs/RFCS/sql_split_syntax.md, line 197 at r2 (raw file): Previously, RaduBerinde wrote…
I've been thinking about this more. We don't yet support joins by point-lookups (other than index joins) so the entire table with leases would need to be populated before the join. I will go with a single table for now, but I'll make sure that we omit the lease requests if we don't select for that column.
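A rough sketch of the "omit the lease requests if the column is not selected" idea, assuming the row generator is told which columns the query needs. The column names `range_id` and `lease_holder`, the function `generate_range_rows`, and the fake lookup are all hypothetical stand-ins, not the actual schema or implementation.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Set


def generate_range_rows(range_ids: Iterable[int],
                        needed_columns: Set[str],
                        lookup_lease_holder: Callable[[int], int]
                        ) -> Iterator[Dict[str, int]]:
    """Yield one row per range, issuing lease lookups only when needed."""
    want_lease = "lease_holder" in needed_columns
    for range_id in range_ids:
        row = {"range_id": range_id}
        if want_lease:
            # The expensive per-range request happens only on this path.
            row["lease_holder"] = lookup_lease_holder(range_id)
        yield row


lease_requests: List[int] = []


def fake_lookup(range_id: int) -> int:
    """Stand-in for the per-range request; records each call."""
    lease_requests.append(range_id)
    return range_id % 3 + 1


rows_without = list(generate_range_rows([10, 11], {"range_id"}, fake_lookup))
rows_with = list(
    generate_range_rows([10, 11], {"range_id", "lease_holder"}, fake_lookup))
print(rows_without, rows_with, lease_requests)
```

In this model the first query issues no lease requests at all, so a single table does not force the per-range cost on callers that only want descriptor information.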
6aaae84
to
cf30197
Compare
Updated the RFC with a proposed system table. Review status: 0 of 1 files reviewed at latest revision, 6 unresolved discussions, some commit checks pending. docs/RFCS/sql_split_syntax.md, line 210 at r3 (raw file):
@andreimatei - let me know what you are thinking for the range cache information schema (is a debug string sufficient?) Comments from Reviewable |
Review status: 0 of 1 files reviewed at latest revision, 7 unresolved discussions, all commit checks successful.
docs/RFCS/sql_split_syntax.md, line 206 at r2 (raw file): Previously, RaduBerinde wrote…
the reason to have the version using the cache is to debug the cache itself, but also to be able to predict (or perhaps retroactively understand) how a scan is planned by a particular node.
docs/RFCS/sql_split_syntax.md, line 210 at r3 (raw file): Previously, RaduBerinde wrote…
There are two caches to speak of - the leaseholder cache and the range descriptor cache. Ideally we could ask to use none, either, or both. For the leaseholder cache, I think a string field annotating a range coming from wherever it might have come (meta or range desc cache) is enough.
docs/RFCS/sql_split_syntax.md, line 212 at r3 (raw file):
I think it might be saner to just return these cols as NULL rather than magically showing or hiding them. This magic might have consequences in how we present the schema of this table and such.
Review status: 0 of 1 files reviewed at latest revision, 7 unresolved discussions, all commit checks successful.
docs/RFCS/sql_split_syntax.md, line 210 at r3 (raw file): Previously, andreimatei (Andrei Matei) wrote…
I see. I am hesitant to introduce a mechanism that "tweaks" a SELECT statement into returning different results for (what looks like) the same table. I think it would be cleaner to just have a separate table (
docs/RFCS/sql_split_syntax.md, line 212 at r3 (raw file): Previously, andreimatei (Andrei Matei) wrote…
The "magic" is the same thing we use for implicit primary keys. There's not much to it - it mostly restricts the set of columns when resolving I don't get the "return as NULL" suggestion; if values for them are required, we would populate them. Anyway, I don't feel strongly on making them hidden; it's fine if we require explicit SELECT renders if we want to avoid the overhead.
Review status: 0 of 1 files reviewed at latest revision, 7 unresolved discussions, all commit checks successful. docs/RFCS/sql_split_syntax.md, line 210 at r3 (raw file): Previously, RaduBerinde wrote…
What's the usecase for using just one of the caches? Don't we always use both in the span resolver? Comments from Reviewable |
docs/RFCS/sql_split_syntax.md, line 210 at r3 (raw file): Previously, RaduBerinde wrote…
Would having a Comments from Reviewable |
cf30197
to
09b26b1
Compare
Review status: 0 of 1 files reviewed at latest revision, 7 unresolved discussions. docs/RFCS/sql_split_syntax.md, line 210 at r3 (raw file): Previously, RaduBerinde wrote…
I added info about a second Comments from Reviewable |
09b26b1
to
4b54a81
Compare
docs/RFCS/sql_split_syntax.md, line 210 at r3 (raw file): Previously, RaduBerinde wrote…
@andreimatei weekly ping :) Comments from Reviewable |
Review status: 0 of 1 files reviewed at latest revision, 8 unresolved discussions, all commit checks successful.
docs/RFCS/sql_split_syntax.md, line 202 at r2 (raw file): Previously, bdarnell (Ben Darnell) wrote…
we could also maybe add a function that takes a rangeid and returns the split key as an array
docs/RFCS/sql_split_syntax.md, line 210 at r3 (raw file): Previously, RaduBerinde wrote…
LGTM!
docs/RFCS/sql_split_syntax.md, line 212 at r3 (raw file): Previously, RaduBerinde wrote…
whatever you think
docs/RFCS/sql_split_syntax.md, line 227 at r4 (raw file):
what do you mean by "split across multiple rows" exactly? We're going to "left inner-join" the range cache with the lease cache by range id, not do anything fancier, right?
Review status: 0 of 1 files reviewed at latest revision, 7 unresolved discussions, all commit checks successful.
docs/RFCS/sql_split_syntax.md, line 227 at r4 (raw file): Previously, andreimatei (Andrei Matei) wrote…
Ah, I thought both have ranges; the idea was that if one cache thinks there's a range from A to C and the other thinks there are two ranges from A to B and from B to C, there will be two rows (A->B, B->C) which both show the same info from the first cache. If it only has the range id, then yes you are correct, it's effectively a join by rangeID. I will update the wording.
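The "join by rangeID" conclusion can be sketched as follows. This is a toy model under stated assumptions: both caches are plain dictionaries keyed by range ID, the key strings are made up, and a range with no cached lease yields `None` (standing in for NULL), which is one reading of a left join rather than something the thread confirms.

```python
from typing import Dict, List, Optional, Tuple

Row = Tuple[int, str, str, Optional[int]]


def join_caches_by_range_id(range_cache: Dict[int, Tuple[str, str]],
                            lease_cache: Dict[int, int]) -> List[Row]:
    """Left-join range descriptors with cached lease holders by range ID."""
    rows: List[Row] = []
    for range_id in sorted(range_cache):
        start_key, end_key = range_cache[range_id]
        # Missing lease info becomes None instead of dropping the row.
        rows.append((range_id, start_key, end_key, lease_cache.get(range_id)))
    return rows


range_cache = {1: ("/a", "/b"), 2: ("/b", "/c")}
lease_cache = {1: 3}  # no cached lease holder for range 2
rows = join_caches_by_range_id(range_cache, lease_cache)
print(rows)
```

Because the lease cache is keyed only by range ID, there is exactly one output row per cached range descriptor and no splitting of rows across key boundaries.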
While writing tests for `TESTING_RELOCATE`, I found myself implementing a test-only function that returns information for all ranges in a table or index. It seems much more useful to provide it through a statement, and use logic test infrastructure for writing tests.
4b54a81
to
32a8b5c
Compare