-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: fix left semi and left anti virtual lookup joins #92713
Conversation
For context, #76697 is where we added the support for semi, anti, and left outer joins in the optimizer. Feedback on the release note would be especially helpful to make it more useful for the end users to understand if they are affected. Also I took the liberty and left a TODO for Marcus, I hope you don't mind 😃 |
idea for a release note:
I don't think we need too many details, since the release note in the docs will link to the PR for anyone who is interested in more. |
ceefca6
to
8d83fa9
Compare
This commit fixes the execution of the left semi and left anti virtual lookup joins. The bug was that we forgot to project away the looked up columns (coming from the "right" side) which could then lead to wrong columns being used higher up the tree. The bug was introduced during 22.1 release cycle where we added the optimizer support for generating plans that could contain left semi and left anti virtual lookup joins. This commit fixes that issue as well as the output columns of such joins (I'm not sure whether there is a user facing impact of having incorrect "output columns"). Additionally, this commit fixes the execution of these virtual lookup joins to correctly return the input row only once. Previously, for left anti joins we'd be producing an output row if there was a match (which is wrong), and for both left semi and left anti we would emit an output row every time there was a match (but this should be done only once). (Although I'm not sure whether it is possible for virtual indexes to result in multiple looked up rows.) Also, as a minor simplification this commit makes it so that the output rows are not added into the row container for left semi and left anti and the container is not instantiated at all. Release note (bug fix): CockroachDB previously could incorrectly evaluate queries that performed left semi and left anti "virtual lookup" joins on tables in `pg_catalog` or `information_schema`. These join types can be planned when a subquery is used inside of a filter condition. The bug was introduced in 22.1.0 and is now fixed.
8d83fa9
to
f58e248
Compare
Included the fix to #88096 as well as the release note as Rafi suggested (thanks!). RFAL. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While you're here, I just wrote a head-scratching query while trying to demo something to @dikshant. This query should fail with the below error, but instead it succeeds. Wanna add it to this logictest with the expected error?
ERROR: column "oid" does not exist
SQLSTATE: 42703
SELECT *
FROM pg_class
WHERE oid
IN (
SELECT oid
FROM pg_index
WHERE indrelid
IN (SELECT oid FROM pg_class WHERE relname = 'foo')
);
edit: the above query is weird and I'm wrong.
I'm missing something - why is an error expected? Seems like that query works fine on postgres. |
Oh gosh correlated subqueries are confusing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Nice fix!
Reviewed 3 of 4 files at r1, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @mgartner)
TFTR! bors r+ |
Build succeeded: |
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from f58e248 to blathers/backport-release-22.1-92713: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 22.1.x failed. See errors above. error creating merge commit from f58e248 to blathers/backport-release-22.2-92713: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 22.2.x failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
@@ -292,6 +292,7 @@ func (c *CustomFuncs) GenerateLookupJoins( | |||
// | |||
// It should be possible to support semi- and anti- joins. Left joins may be | |||
// possible with additional complexity. | |||
// TODO(mgartner): update this comment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack. Though this has nothing to do with lookup joins on virtual tables - this rule is for lookup joins involving virtual computed columns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed here: #93042
I don't think that PR is related - it has to do with virtual computed columns, not virtual indexes. So this bug might be present in versions older than 22.1. |
Thanks Marcus for the comments. Indeed, I think you're right - I tried reproductions from both issues on 21.2.17, and both resulted in incorrect output. Now I believe that the bugs date back to 20.2 (#48226 which introduced the support for virtual lookup joins) since the corresponding optimizer changes were merged in 2019 (#38625). Is it worth opening a docs issue to update the release note? |
@yuzefovich Thanks for trying the older version. Yes, let's open a docs issue to get the correct version that the bug was introduced in the release note. |
Filed cockroachdb/docs#15753 and cockroachdb/docs#15755. |
Epic: CRDB-23454
This commit fixes the execution of the left semi and left anti virtual
lookup joins. The bug was that we forgot to project away the looked up
columns (coming from the "right" side) which could then lead to wrong
columns being used higher up the tree. The bug was introduced during
22.1 release cycle where we added the optimizer support for generating
plans that could contain left semi and left anti virtual lookup joins.
This commit fixes that issue as well as the output columns of such joins
(I'm not sure whether there is a user facing impact of having incorrect
"output columns").
Additionally, this commit fixes the execution of these virtual lookup
joins to correctly return the input row only once. Previously, for left
anti joins we'd be producing an output row if there was a match (which
is wrong), and for both left semi and left anti we would emit an output
row every time there was a match (but this should be done only once).
(Although I'm not sure whether it is possible for virtual indexes to
result in multiple looked up rows.)
Also, as a minor simplification this commit makes it so that the output
rows are not added into the row container for left semi and left anti
and the container is not instantiated at all.
Fixes: #91012.
Fixes: #88096.
Release note (bug fix): CockroachDB previously could incorrectly
evaluate queries that performed left semi and left anti "virtual lookup"
joins on tables in
pg_catalog
orinformation_schema
. These join typescan be planned when a subquery is used inside of a filter condition. The
bug was introduced in 20.2.0 and is now fixed.