Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: fix like escape processing for span constraints #73978

Merged
merged 1 commit into from
Dec 21, 2021

Conversation

cucaroach
Copy link
Contributor

@cucaroach cucaroach commented Dec 17, 2021

Fixes: #44123

Previously no attempt was made to properly handle escape ('\') sequence
in like patterns being turned into constraints. Refactor code used to
process like at runtime to generate a regexp and use that to properly
handle index constraint generation.

Release note (sql change): Escape character processing was missing from
constraint span generation which resulted in incorrect results when
doing escaped like lookups.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@cucaroach cucaroach marked this pull request as ready for review December 18, 2021 11:59
@cucaroach cucaroach requested a review from a team as a code owner December 18, 2021 11:59
Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 6 of 6 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @cucaroach and @RaduBerinde)


pkg/sql/opt/idxconstraint/index_constraints.go, line 283 at r1 (raw file):

	case opt.LikeOp:
		if s, ok := tree.AsDString(datum); ok {
			// Normalize the like pattern to a RE

nit: spell out "regex" or "regular expression" and add a period.


pkg/sql/logictest/testdata/logic_test/select_index, line 257 at r1 (raw file):


statement ok
INSERT INTO str VALUES (1, 'A'), (4, 'AB'), (2, 'ABC'), (5, 'ABCD'), (3, 'ABCDEZ'), (9, 'ABD'), (10, '\CBA'), (11, 'A%'), (12, 'CAB.*')

nit: insert a value like 'CABD' so that the test below with v LIKE 'CAB.*' shows that only the row with 'CAB.*' is returned.


pkg/sql/opt/idxconstraint/testdata/strings, line 6 at r1 (raw file):

[/'ABC' - /'ABC']

# a backslash that isn't escaping anything is just removed from pattern

nit: treat comments as a sentence here and below with capital first letter and period at the end.


pkg/sql/opt/idxconstraint/testdata/strings, line 8 at r1 (raw file):

# a backslash that isn't escaping anything is just removed from pattern
index-constraints vars=(a string) index=a
a LIKE '\aABC%'

Can you add a test like a LIKE '\aABC_'? I think remaining filters are required in this case.

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @cucaroach)


pkg/sql/opt/memo/constraint_builder.go, line 192 at r1 (raw file):

				c := cb.makeStringPrefixSpan(col, prefix)
				// If it ends in .* its tight.
				return c, strings.HasSuffix(pattern, ".*")

I don't understand this logic. Won't something like AB_C% end in .* but prefix is just AB ?


pkg/sql/opt/idxconstraint/testdata/strings, line 59 at r1 (raw file):

# backslash terminator is okay
index-constraints vars=(a string) index=a
a LIKE 'ABC\'

This should result in an error, no? It does currently:

[email protected]:26257/defaultdb> SELECT 'ABC' LIKE 'ABC\';
ERROR: LIKE regexp compilation failed: LIKE pattern must not end with escape character

If we convert to index constraints and execute, we might return an ABC row with no errors.

@cucaroach cucaroach force-pushed the like-constraint-fix branch 2 times, most recently from 6313ceb to 1f63d36 Compare December 20, 2021 23:30
Copy link
Contributor Author

@cucaroach cucaroach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @RaduBerinde)


pkg/sql/opt/idxconstraint/index_constraints.go, line 283 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

nit: spell out "regex" or "regular expression" and add a period.

👍


pkg/sql/opt/memo/constraint_builder.go, line 192 at r1 (raw file):

Previously, RaduBerinde wrote…

I don't understand this logic. Won't something like AB_C% end in .* but prefix is just AB ?

Nice catch, what I want is something stricter, I went with prefix+".*" == pattern but not sure it there's a better way that avoids string construction. Also added a test for this case.


pkg/sql/logictest/testdata/logic_test/select_index, line 257 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

nit: insert a value like 'CABD' so that the test below with v LIKE 'CAB.*' shows that only the row with 'CAB.*' is returned.

👍


pkg/sql/opt/idxconstraint/testdata/strings, line 6 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

nit: treat comments as a sentence here and below with capital first letter and period at the end.

👍


pkg/sql/opt/idxconstraint/testdata/strings, line 8 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Can you add a test like a LIKE '\aABC_'? I think remaining filters are required in this case.

👍


pkg/sql/opt/idxconstraint/testdata/strings, line 59 at r1 (raw file):

Previously, RaduBerinde wrote…

This should result in an error, no? It does currently:

[email protected]:26257/defaultdb> SELECT 'ABC' LIKE 'ABC\';
ERROR: LIKE regexp compilation failed: LIKE pattern must not end with escape character

If we convert to index constraints and execute, we might return an ABC row with no errors.

Nice catch, fixed and added a test for this.

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Nice job!

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @cucaroach, @mgartner, and @RaduBerinde)


pkg/sql/opt/memo/constraint_builder.go, line 192 at r1 (raw file):

Previously, cucaroach (Tommy Reilly) wrote…

Nice catch, what I want is something stricter, I went with prefix+".*" == pattern but not sure it there's a better way that avoids string construction. Also added a test for this case.

strings.HasSuffix(pattern, ".*") && strings.TrimSuffix(pattern, ".*") == prefix


pkg/sql/opt/memo/constraint_builder.go, line 191 at r2 (raw file):

					}
					c := cb.makeStringPrefixSpan(col, prefix)
					// If pattern is simply prefix + .* its tight.

[nit] it's (or better yet "the span is tight"


pkg/sql/opt/idxconstraint/testdata/strings, line 71 at r2 (raw file):

[/'ABC_Z' - /'ABC_Z']

# Escape terminator is okay for index constraint purposes.

should say "Invalid pattern does not generate index constraints"

@cucaroach cucaroach force-pushed the like-constraint-fix branch 3 times, most recently from 572da7f to 6b8bbfc Compare December 21, 2021 12:36
Fixes: cockroachdb#44123

Previously no attempt was made to properly handle escape ('\\') sequence
in like patterns being turned into constraints. Refactor code used to
process like at runtime to generate a regexp and use that to properly
handle index constraint generation.

Release note (sql change): Escape character processing was missing from
constraint span generation which resulted in incorrect results when
doing escaped like lookups.
Copy link
Contributor Author

@cucaroach cucaroach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @RaduBerinde)


pkg/sql/opt/memo/constraint_builder.go, line 191 at r2 (raw file):

Previously, RaduBerinde wrote…

[nit] it's (or better yet "the span is tight"

👍

Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 5 files at r2, 3 of 3 files at r3, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @RaduBerinde)

@cucaroach
Copy link
Contributor Author

bors r+
Thanks for the reviews!

@craig craig bot merged commit 779f08b into cockroachdb:master Dec 21, 2021
@craig
Copy link
Contributor

craig bot commented Dec 21, 2021

Build succeeded:

@rafiss
Copy link
Collaborator

rafiss commented Dec 21, 2021

Thanks for fixing this! Is this eligible for a backport?

@cucaroach
Copy link
Contributor Author

I would wager yes since its a correctness issue but I'm not sure, @jordanlewis?

@jordanlewis
Copy link
Member

Yes, I agree we should backport.

@rafiss
Copy link
Collaborator

rafiss commented Dec 23, 2021

blathers backport 21.1 21.2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql: Incorrect result for LIKE query
6 participants