rfc: add SHOW RANGES statements to sql_split_syntax #14366

Merged (1 commit) on Apr 13, 2017
54 changes: 53 additions & 1 deletion docs/RFCS/sql_split_syntax.md
@@ -94,6 +94,9 @@ order; for example, if there are many splits, it is advantageous to sort the
split points, split at the middle point, then recursively process the left and
right sides (in parallel).
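
The ordering described above can be sketched as follows. This is only an illustration, not the RFC's implementation: `split_all` and the `split_at` callback are hypothetical names, and the callback stands in for whatever actually issues one split. The sketch starts one thread per left half, whereas a real implementation would presumably bound concurrency and propagate errors.

```python
import threading


def split_all(split_points, split_at):
    """Split at the median of the sorted points, then recurse on both halves.

    `split_at` is a hypothetical callback that performs a single split.
    """

    def recurse(points):
        if not points:
            return
        mid = len(points) // 2
        # Split the current span at its median key before touching either half.
        split_at(points[mid])
        # Process the left half on a new thread and the right half on this one.
        left = threading.Thread(target=recurse, args=(points[:mid],))
        left.start()
        recurse(points[mid + 1:])
        left.join()

    # Sort (and de-duplicate) once up front so every slice is already ordered.
    recurse(sorted(set(split_points)))
```

Because each median is split before its halves are processed, the two halves always operate on disjoint ranges and need no coordination with each other.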

*Interleaved tables*: the command works as expected; the split will inherently
cause a corresponding split in the parent or child tables/indexes.

##### Return values #####

`ALTER TABLE/INDEX SPLIT AT` currently returns a row with two columns: the key
@@ -138,6 +141,9 @@ ALTER INDEX t@idx SCATTER (1) (2)

The statement returns only after the relocations are complete.

*Interleaved tables*: the command works as expected (the ranges may contain rows
for parent or child tables/indexes).

### 3. `ALTER TABLE/INDEX TESTING_RELOCATE` ###

The `TESTING_RELOCATE` statements can be used to relocate specific ranges to
@@ -178,7 +184,46 @@ ALTER TABLE t TESTING_RELOCATE SELECT ARRAY[1+i%2], i FROM GENERATE_SERIES(1, 10

The statement returns only after the relocations are complete.

# Drawbacks
*Interleaved tables*: the command works as expected (the ranges may contain rows
for parent or child tables/indexes).

### 4. `crdb_internal.ranges` and `ranges_cached` system tables ###

To facilitate testing the implementation of the new commands (and to allow a
user to verify what the commands did), we introduce a `crdb_internal.ranges`
system table that can be used to inspect all the ranges in the system, or only
the ranges of a given table.

The schema of the table is as follows:

Column         | Type       | Description
---------------|------------|------------------------------
`start_key`    | BYTES      | Range start key (raw)
`start_pretty` | STRING     | Range start key (pretty-printed)
`end_key`      | BYTES      | Range end key (raw)
`end_pretty`   | STRING     | Range end key (pretty-printed)
`database`     | STRING     | Database name (if range is part of a table)
`table`        | STRING     | Table name (if range is part of a table); for interleaved tables this is always the root table
`index`        | STRING     | Index name (if range is part of a non-primary index)
`replicas`     | ARRAY(INT) | Replica store IDs
`lease_holder` | INT        | Lease holder store ID

The last two columns could be hidden, so that they are returned only when a
`SELECT` names them explicitly.

Implementation notes:
- the system table infrastructure will be improved so that the row-producing
  function has access to filters; a query that specifies a `table` or `index`
  filter should be optimized to look only at the ranges for that table or
  index.
- the row-producing function should also have access to the set of needed
  columns; that way the more expensive lease holder determination can be
  omitted if the column is not needed.
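
A minimal sketch of the two notes above, assuming a hypothetical in-memory layout: `range_rows`, `ranges_by_table`, and `lookup_lease_holder` are illustrative names and not part of the proposal or of any existing API. The point is only that the row producer receives the filter and the needed columns, so it can skip unrelated ranges and avoid the lease holder lookup.

```python
from typing import Callable, Dict, Iterator, List, Optional, Sequence

# Hypothetical stand-in for range metadata: table name -> list of range dicts.
RangeMeta = Dict[str, object]


def range_rows(
    ranges_by_table: Dict[str, List[RangeMeta]],
    needed_columns: Sequence[str],
    lookup_lease_holder: Callable[[bytes], int],
    table_filter: Optional[str] = None,
) -> Iterator[Dict[str, object]]:
    # Filter pushdown: with a table filter, only that table's ranges are visited.
    if table_filter is not None:
        tables = [table_filter] if table_filter in ranges_by_table else []
    else:
        tables = list(ranges_by_table)
    for table in tables:
        for meta in ranges_by_table[table]:
            row = {"table": table, **meta}
            # Column pruning: the expensive lookup runs only if it is requested.
            if "lease_holder" in needed_columns:
                row["lease_holder"] = lookup_lease_holder(meta["start_key"])
            yield {col: row[col] for col in needed_columns}
```

For example, requesting only `table` and `start_key` with `table_filter="t"` never calls `lookup_lease_holder` and never touches the ranges of other tables.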

A second table, `crdb_internal.ranges_cached`, has the same schema, but it
returns data from the range and lease holder caches. Specifically: the ranges
and their `replicas` information are populated from the range cache; for each
range, if that range ID has an entry in the lease holder cache, `lease_holder`
is set according to that entry; otherwise it is NULL.
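
The population rule for `ranges_cached` can be sketched as below. The two dictionaries are hypothetical stand-ins for the range cache and the lease holder cache, both keyed by range ID; the real caches are not specified in this document, and `None` is used here to model SQL NULL.

```python
from typing import Dict, List


def cached_range_rows(
    range_cache: Dict[int, Dict[str, object]],
    lease_holder_cache: Dict[int, int],
) -> List[Dict[str, object]]:
    rows = []
    for range_id, desc in range_cache.items():
        rows.append({
            # Keys and replicas come straight from the range cache entry.
            "start_key": desc["start_key"],
            "end_key": desc["end_key"],
            "replicas": desc["replicas"],
            # None (NULL) when the lease holder cache has no entry for the range.
            "lease_holder": lease_holder_cache.get(range_id),
        })
    return rows
```

Nothing in this sketch consults anything beyond the two caches, which is what distinguishes this table from `crdb_internal.ranges`.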

# Alternatives

@@ -192,4 +237,11 @@ splits to happen sequentially; we cannot implement the algorithm mentioned above
that parallelizes the splits. One way around this would be to introduce `split_at`
as an *aggregation* function (akin to `sum`).

Alternatives considered for `crdb_internal.ranges`:
- a `SHOW RANGES FOR TABLE/INDEX` statement; the system table was deemed more
  useful.
- having multiple system tables (e.g. a separate one for lease holders) and
  using joins as necessary; this would require too many changes to ensure that
  we generate only the parts of the table that are needed.

# Unresolved questions