Skip to content

Commit

Permalink
Fix handling of consecutive sentinel values
Browse files Browse the repository at this point in the history
Summary:
For partial indexes, it is possible to have consecutive sentinel values, and the code to skip over sentinel values during iteration does not handle this correctly.

eg. If we have have the following data:
```
a b
1 1 (sentinel)
1 2 (sentinel)
1 2 1
1 2 2
```
`get_prefix_from_start` returns `(1, 2)` as the group we expect to see (since `(1, 1)` group is empty on the primary key). A seek to `a >=1` on the secondary key would land on `(1, 1)`, but we skip this key since we detect that it's a sentinel value. The cursor is then on `(1, 2)` which matches the expected prefix, and we start reading from the SK from that position. However, this is also a sentinel value, and will not decode correctly. This leads to these errors:

```
ERROR 1296 (HY000): Got error 505 'Found data corruption.' from ROCKSDB
```

The fix is to use a while loop to skip over sentinel values. There are two other places where we skip over sentinel values, but they are not problematic since they are used for iterating over a non-empty groups (so consecutive sentinel values are not possible). I decided not to add while loops defensively in the other places, so that we fail loudly if something unexpected happens.

Ideally, we should just seek directly to the first group that we got from the PK (which guarantees a non-empty group, similar to other cases), but this is a more involved change for later.

Differential Revision: D46352747
  • Loading branch information
Manuel Ung authored and inikep committed May 21, 2024
1 parent 3ffa022 commit ca00a4e
Show file tree
Hide file tree
Showing 4 changed files with 18 additions and 1 deletion.
1 change: 1 addition & 0 deletions mysql-test/suite/rocksdb/r/partial_index.result
Original file line number Diff line number Diff line change
Expand Up @@ -231,5 +231,6 @@ rocksdb_partial_index_rows_sorted 1
include/assert.inc [Check that materialized groups are zero.]
include/assert.inc [Check that materialized rows are zero.]
DROP TABLE t1, t2;
DELETE FROM sentinel where c2 <= 2;
DROP TABLE sentinel;
set optimizer_force_index_for_range = off;
9 changes: 9 additions & 0 deletions mysql-test/suite/rocksdb/t/partial_index.test
Original file line number Diff line number Diff line change
Expand Up @@ -77,6 +77,15 @@ SELECT variable_name, t2.variable_value - t1.variable_value AS diff FROM t1 JOIN

DROP TABLE t1, t2;

#
# Test consecutive sentinel values
#
DELETE FROM sentinel where c2 <= 2;

--let $query1= SELECT c1, c2 FROM sentinel FORCE INDEX(c1);
--let $query2= SELECT c1, c2 FROM sentinel FORCE INDEX(PRIMARY);
--source include/diff_queries.inc

DROP TABLE sentinel;

set optimizer_force_index_for_range = off;
4 changes: 4 additions & 0 deletions mysql-test/suite/rocksdb/t/partial_index_link.inc
Original file line number Diff line number Diff line change
Expand Up @@ -47,6 +47,10 @@ while ($i < 10) {
inc $i;
}

--let $query1= SELECT DISTINCT link_type FROM link_table FORCE INDEX (id1_type)
--let $query2= SELECT DISTINCT link_type FROM link_table FORCE INDEX (id1_type2)
--source include/diff_queries.inc

# Rebuild the table so that nothing is materialized anymore.
ALTER TABLE link_table ENGINE=ROCKSDB;

Expand Down
5 changes: 4 additions & 1 deletion storage/rocksdb/rdb_iterator.cc
Original file line number Diff line number Diff line change
Expand Up @@ -854,7 +854,10 @@ int Rdb_iterator_partial::seek(enum ha_rkey_function find_flag,
rc = Rdb_iterator_base::seek(find_flag, start_key, true, end_key,
read_current);

if (rc == 0 && Rdb_iterator_base::key().size() == m_cur_prefix_key_len) {
while (rc == 0 && Rdb_iterator_base::key().size() == m_cur_prefix_key_len) {
if (thd_killed(m_thd)) {
return HA_ERR_QUERY_INTERRUPTED;
}
rc = direction ? Rdb_iterator_base::next() : Rdb_iterator_base::prev();
}

Expand Down

0 comments on commit ca00a4e

Please sign in to comment.