-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
opt: add rule to normalize Like to Range #53536
Conversation
It is easier to calculate stats for a `Range` expression than for `Like` expressions. This PR adds a rule that converts a `Like` operator with tight constraints to a `Range` operator. Fixes cockroachdb#52153 Release note: None Release justification: This change is small and is unlikely to break anything.
Isaac Pugh seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
Thank you for contributing to CockroachDB. Please ensure you have followed the guidelines for creating a PR. My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution! This code is nice and clean and well-documented.
I'm not sure this change is fixing any problem right now, though. The stats that are visible haven't changed (I added comments below in the test files where the stats are already shown). You could see what the stats look like in the other files by adding the format=show-stats
directive next to opt
, or add some more tests in the memo/testdata/stats
directory. My guess is that you won't see any change in the calculated stats for the other cases either.
This is because if you look at the test files that changed, all of the constraints calculated stayed the same. Since the statisticsBuilder
uses the constraints to estimate stats, it doesn't seem like this change is actually making it easier for the statisticsBuilder
. The cases this rule targets, when there are already tight constraints, is the easiest case for the statisticsBuilder
. @RaduBerinde did you have something else in mind when you created the issue?
Reviewed 10 of 10 files at r1.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @ipugh)
pkg/sql/opt/memo/testdata/stats/select, line 320 at r1 (raw file):
select ├── columns: d_id:1(int!null) d_w_id:2(int!null) d_name:3(string!null) ├── stats: [rows=0.91, distinct(1)=0.91, null(1)=0, distinct(3)=0.91, null(3)=0, distinct(1,3)=0.91, null(1,3)=0]
stats didn't change here
pkg/sql/opt/memo/testdata/stats_quality/tpch/q20, line 109 at r1 (raw file):
│ │ ├── lookup columns are key │ │ ├── immutable │ │ ├── stats: [rows=124.917659, distinct(14)=124.917659, null(14)=0, distinct(15)=124.917659, null(15)=0]
stats didn't change here
I didn't know we create constraints in this case.. Yeah it doesn't sound like this rule helps, sorry about that. |
It is easier to calculate stats for a
Range
expression than forLike
expressions. This PR adds a rule that converts aLike
operatorwith tight constraints to a
Range
operator.Fixes #52153
Release note: None
Release justification: This change is small and is unlikely to break
anything.