Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: Incorrect result for LIKE query #44123

Closed
mrigger opened this issue Jan 18, 2020 · 5 comments · Fixed by #73978
Closed

sql: Incorrect result for LIKE query #44123

mrigger opened this issue Jan 18, 2020 · 5 comments · Fixed by #73978
Assignees
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. E-quick-win Likely to be a quick win for someone experienced. S-0-visible-logical-error Database stores inconsistent data in some cases, or queries return invalid results silently. T-sql-queries SQL Queries Team

Comments

@mrigger
Copy link

mrigger commented Jan 18, 2020

Consider the following test case:

CREATE TABLE t0(c0 STRING UNIQUE);
INSERT INTO t0 (c0) VALUES ('\a');
SELECT * FROM t0 WHERE c0 LIKE '\a'; -- unexpected: row is fetched

Unexpectedly, the query fetches a row. When removing the UNIQUE constraint, no row is fetched. That the predicate should evaluate to FALSE is also demonstrated by the following query:

SELECT c0 LIKE '\a' FROM t0; -- FALSE

I built CockroachDB from source (commit 55c0e012b9539a327640e42ceaaf556b7a0e7b10) on Ubuntu 19.04.

@mrigger
Copy link
Author

mrigger commented Jan 18, 2020

It might also be that

SELECT c0 LIKE '\a' FROM t0;

should yield TRUE, which is, for example, the behavior in SQLite3.

@maddyblue maddyblue changed the title Incorrect result for LIKE query sql: Incorrect result for LIKE query Jan 18, 2020
@maddyblue maddyblue added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Jan 18, 2020
@RaduBerinde
Copy link
Member

The bug is on the planning side, but we would need some guidance from someone familiar with the semantics (and the relevant code in sem/tree/eval.go, which seems pretty complex).

@RaduBerinde
Copy link
Member

RaduBerinde commented Jan 18, 2020

We don't handle escaping at all in the pattern when we create index constraints from a LIKE expression. There are many other bugs here, e.g. '\' is not LIKE '\%' but it would get returned with an index. Fortunately it should be easy to write a targeted fuzz test that cross-checks queries with and without using the index.

@solongordon solongordon removed their assignment Feb 4, 2020
@yuzefovich yuzefovich added the E-quick-win Likely to be a quick win for someone experienced. label May 12, 2021
@rytaft rytaft added the S-0-visible-logical-error Database stores inconsistent data in some cases, or queries return invalid results silently. label Jun 7, 2021
@cucaroach cucaroach self-assigned this Jun 14, 2021
@cucaroach
Copy link
Contributor

cucaroach commented Jun 15, 2021

Seems like there's more going on here that meets the eye. We always get the right answer if we use LIKE '\a' ESCAPE '', ie this forces the escaping to take place and things work. If we use the row engine and comment out the opt.LikeOp handling in index_constraints then things work (w/ and w/o UNIQUE). If I use the vector engine it only breaks when there's UNIQUE constraint.

So I think there's two issues:

  • when converting a LIKE right hand side to a filter we need to process escapes, since the execution engines will reduce cases like this to an equality, this needs to happen in index_constraint and in the non-optimized generic filter case. I think the logic we want is contained in tree.unescapePattern this needs to be added to colbuilder.planProjectionExpr and index_constraints

  • with the vectorized engine the UNIQUE constraint enables the index_constraints span optimization not sure why

@cucaroach cucaroach linked a pull request Jun 15, 2021 that will close this issue
cucaroach added a commit to cucaroach/cockroach that referenced this issue Jun 15, 2021
@cucaroach cucaroach linked a pull request Jun 15, 2021 that will close this issue
@jlinder jlinder added the T-sql-queries SQL Queries Team label Jun 16, 2021
@piotrb5e3
Copy link

We're still getting this on v21.1.12 and v21.2.2 (docker image distribution).

Removing the index or adding an explicit ESCAPE '\' helps. Our test example:

CREATE TABLE IF NOT EXISTS "prefix_search_test" ("id" serial NOT NULL PRIMARY KEY, "value" STRING NOT NULL);
-- CREATE UNIQUE INDEX "prefix_search_test_value_uniq" ON "prefix_search_test" ("value");  -- Uncommenting breaks results
INSERT INTO "prefix_search_test" ("value") VALUES ('foo_bar1');
SELECT * FROM "prefix_search_test" WHERE ("value"::text LIKE 'foo\_bar%');

cucaroach added a commit to cucaroach/cockroach that referenced this issue Dec 20, 2021
Fixes: cockroachdb#44123

Previously no attempt was made to properly handle escape ('\\') sequence
in like patterns being turned into constraints. Refactor code used to
process like at runtime to generate a regexp and use that to properly
handle index constraint generation.

Release note (sql change): Escape character processing was missing from
constraint span generation which resulted in incorrect results when
doing escaped like lookups.
cucaroach added a commit to cucaroach/cockroach that referenced this issue Dec 20, 2021
Fixes: cockroachdb#44123

Previously no attempt was made to properly handle escape ('\\') sequence
in like patterns being turned into constraints. Refactor code used to
process like at runtime to generate a regexp and use that to properly
handle index constraint generation.

Release note (sql change): Escape character processing was missing from
constraint span generation which resulted in incorrect results when
doing escaped like lookups.
craig bot pushed a commit that referenced this issue Dec 21, 2021
72992: util/log: fix redactability of logging tags r=abarganier a=knz

Fixes #72905.

Some time in the v21.2 cycle, the log entry preparation logic was
refactored and a mistake was introduced: the logging tags were not any
more subject to the redaction logic. The result was that redaction
markers were missing in log tag values, and if a value had contained
unbalanced redaction markers in a value string (say, as part of a SQL
table key), it would have caused log file corruption and possibly a
confidential data leak.

This patch fixes that, by preparing the logging tags in the same way
as the main message for each entry.

Release note (cli change): A bug affecting the redactability of
logging tags in output log entries has been fixed. This bug had
been introduced in the v21.2 release.

73937: setting: introduce setting classes r=RaduBerinde a=RaduBerinde

This commit introduces the three setting classes in the RFC (#73349):
`SystemOnly`, `TenantReadOnly`, and `TenantWritable`. The `SystemOnly`
class replaces the existing `WithSystemOnly()`.

In this change we don't yet implement the advertised semantics. We
mechanically use `TenantWritable` for all settings except those that
were using `WithSystemOnly()` which use `SystemOnly`; this should not
change any existing behavior. The classes will be revisited in a
separate change, after we implement the semantics.

Release note: None

73978: opt: fix like escape processing for span constraints r=cucaroach a=cucaroach

Fixes: #44123

Previously no attempt was made to properly handle escape ('\\') sequence
in like patterns being turned into constraints. Refactor code used to
process like at runtime to generate a regexp and use that to properly
handle index constraint generation.

Release note (sql change): Escape character processing was missing from
constraint span generation which resulted in incorrect results when
doing escaped like lookups.

74102: sql: do not fetch virtual columns during backfill r=mgartner a=mgartner

Fixes #73372

Release note (bug fix): A bug has been fixed that caused internal errors
when altering the primary key of a table. The bug was only present if
the table had a partial index with a predicate that referenced a virtual
computed column. This bug was present since virtual computed columns
were added in version 21.1.0.

74110: bazel: require setting `cockroach_cross=y` to opt into cross toolchains r=irfansharif a=rickystewart

With `--incompatible_enable_cc_toolchain_resolution` set in #73819, now
Bazel selects the appropriate toolchain for you. Bazel was selecting the
`cross_linux_toolchain` when building for the host platform on Linux,
resulting in link errors when trying to compile `stress` under `race`.
We update the toolchains to instead require opting into the cross
toolchains by defining `cockroach_cross=y`.

Closes #73997.

Release note: None

74111: bench/rttanalysis: allow roundtrips to be off by 1 r=ajwerner a=ajwerner

If we don't have a range, let the currently estimate be wrong by 1.
We mostly care about the ballpark and the growth rate. I'm sick of
these flakes.

Fixes #73884.

Release note: None

74152: ci: don't delete `test.json.txt` after processing r=tbg a=rickystewart

We've seen some issues where Bazel jobs are failing in `github-post`
(#73841, #74013) with the following output:

    found outstanding output. Considering last test failed:

It's hard to say what the problem is because these scripts haven't kept
the `test.json.txt` in `artifacts`. Here I remove the logic to clean up
the file so we can RC further instances of the problem.

Release note: None

Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: Radu Berinde <[email protected]>
Co-authored-by: Tommy Reilly <[email protected]>
Co-authored-by: Marcus Gartner <[email protected]>
Co-authored-by: Ricky Stewart <[email protected]>
Co-authored-by: Andrew Werner <[email protected]>
@craig craig bot closed this as completed in d25385b Dec 21, 2021
blathers-crl bot pushed a commit that referenced this issue Dec 23, 2021
Fixes: #44123

Previously no attempt was made to properly handle escape ('\\') sequence
in like patterns being turned into constraints. Refactor code used to
process like at runtime to generate a regexp and use that to properly
handle index constraint generation.

Release note (sql change): Escape character processing was missing from
constraint span generation which resulted in incorrect results when
doing escaped like lookups.
blathers-crl bot pushed a commit that referenced this issue Dec 23, 2021
Fixes: #44123

Previously no attempt was made to properly handle escape ('\\') sequence
in like patterns being turned into constraints. Refactor code used to
process like at runtime to generate a regexp and use that to properly
handle index constraint generation.

Release note (sql change): Escape character processing was missing from
constraint span generation which resulted in incorrect results when
doing escaped like lookups.
cucaroach added a commit that referenced this issue Jan 4, 2022
Fixes: #44123

Previously no attempt was made to properly handle escape ('\\') sequence
in like patterns being turned into constraints. Refactor code used to
process like at runtime to generate a regexp and use that to properly
handle index constraint generation.

Release note (sql change): Escape character processing was missing from
constraint span generation which resulted in incorrect results when
doing escaped like lookups.
cucaroach added a commit that referenced this issue Jan 4, 2022
Fixes: #44123

Previously no attempt was made to properly handle escape ('\\') sequence
in like patterns being turned into constraints. Refactor code used to
process like at runtime to generate a regexp and use that to properly
handle index constraint generation.

Release note (sql change): Escape character processing was missing from
constraint span generation which resulted in incorrect results when
doing escaped like lookups.

Release justification: visible logical error
cucaroach added a commit that referenced this issue Jan 4, 2022
Fixes: #44123

Previously no attempt was made to properly handle escape ('\\') sequence
in like patterns being turned into constraints. Refactor code used to
process like at runtime to generate a regexp and use that to properly
handle index constraint generation.

Release note (sql change): Escape character processing was missing from
constraint span generation which resulted in incorrect results when
doing escaped like lookups.

Release justification: visible logical error
gustasva pushed a commit to gustasva/cockroach that referenced this issue Jan 4, 2022
Fixes: cockroachdb#44123

Previously no attempt was made to properly handle escape ('\\') sequence
in like patterns being turned into constraints. Refactor code used to
process like at runtime to generate a regexp and use that to properly
handle index constraint generation.

Release note (sql change): Escape character processing was missing from
constraint span generation which resulted in incorrect results when
doing escaped like lookups.
@mgartner mgartner moved this to Done in SQL Queries Jul 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. E-quick-win Likely to be a quick win for someone experienced. S-0-visible-logical-error Database stores inconsistent data in some cases, or queries return invalid results silently. T-sql-queries SQL Queries Team
Projects
Archived in project
9 participants