-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
release-22.2: sql: fix left semi and left anti virtual lookup joins #92880
release-22.2: sql: fix left semi and left anti virtual lookup joins #92880
Conversation
Thanks for opening a backport. Please check the backport criteria before merging:
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria are satisfied within.
Add a brief release justification to the body of your PR to justify this backport. Some other things to consider:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a couple of questions / suggestions for tests that you could consider adding in a follow-up PR (since this is a backport).
Reviewed 4 of 4 files at r1, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @msirek and @yuzefovich)
pkg/sql/virtual_table.go
line 342 at r1 (raw file):
return true, nil } case descpb.LeftSemiJoin:
Do you have any tests that exercise this case for semi join? (e.g., with your previous fix before the collab session, maybe a semi join would have returned multiple results?)
pkg/sql/logictest/testdata/logic_test/pg_catalog
line 4693 at r1 (raw file):
AND t.relname = 't91012' AND a.atttypid IN (SELECT oid FROM pg_type WHERE typname = ANY (ARRAY['int8']));
could we construct a similar case for anti join?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I opened up #92937 and will squash it into the backports.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @msirek and @rytaft)
pkg/sql/virtual_table.go
line 342 at r1 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Do you have any tests that exercise this case for semi join? (e.g., with your previous fix before the collab session, maybe a semi join would have returned multiple results?)
I doubt it. I spent some time trying to come up with cases where multiple matches are looked up in the virtual index but didn't succeed (the difficulty is that we have a few virtual indexes in general and most of them appear unique, then for those that aren't unique (like crdb_internal.jobs (status)
and crdb_internal.cluster_database_privileges (database_name)
) I couldn't come up with a query that would get left semi join).
pkg/sql/logictest/testdata/logic_test/pg_catalog
line 4693 at r1 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
could we construct a similar case for anti join?
Done in #92937.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @msirek)
pkg/sql/virtual_table.go
line 342 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
I doubt it. I spent some time trying to come up with cases where multiple matches are looked up in the virtual index but didn't succeed (the difficulty is that we have a few virtual indexes in general and most of them appear unique, then for those that aren't unique (like
crdb_internal.jobs (status)
andcrdb_internal.cluster_database_privileges (database_name)
) I couldn't come up with a query that would get left semi join).
Ok no worries
This commit fixes the execution of the left semi and left anti virtual lookup joins. The bug was that we forgot to project away the looked up columns (coming from the "right" side) which could then lead to wrong columns being used higher up the tree. The bug was introduced during 22.1 release cycle where we added the optimizer support for generating plans that could contain left semi and left anti virtual lookup joins. This commit fixes that issue as well as the output columns of such joins (I'm not sure whether there is a user facing impact of having incorrect "output columns"). Additionally, this commit fixes the execution of these virtual lookup joins to correctly return the input row only once. Previously, for left anti joins we'd be producing an output row if there was a match (which is wrong), and for both left semi and left anti we would emit an output row every time there was a match (but this should be done only once). (Although I'm not sure whether it is possible for virtual indexes to result in multiple looked up rows.) Also, as a minor simplification this commit makes it so that the output rows are not added into the row container for left semi and left anti and the container is not instantiated at all. Release note (bug fix): CockroachDB previously could incorrectly evaluate queries that performed left semi and left anti "virtual lookup" joins on tables in `pg_catalog` or `information_schema`. These join types can be planned when a subquery is used inside of a filter condition. The bug was introduced in 22.1.0 and is now fixed.
8903ae2
to
878c29c
Compare
Backport 1/1 commits from #92713.
/cc @cockroachdb/release
This commit fixes the execution of the left semi and left anti virtual
lookup joins. The bug was that we forgot to project away the looked up
columns (coming from the "right" side) which could then lead to wrong
columns being used higher up the tree. The bug was introduced during
22.1 release cycle where we added the optimizer support for generating
plans that could contain left semi and left anti virtual lookup joins.
This commit fixes that issue as well as the output columns of such joins
(I'm not sure whether there is a user facing impact of having incorrect
"output columns").
Additionally, this commit fixes the execution of these virtual lookup
joins to correctly return the input row only once. Previously, for left
anti joins we'd be producing an output row if there was a match (which
is wrong), and for both left semi and left anti we would emit an output
row every time there was a match (but this should be done only once).
(Although I'm not sure whether it is possible for virtual indexes to
result in multiple looked up rows.)
Also, as a minor simplification this commit makes it so that the output
rows are not added into the row container for left semi and left anti
and the container is not instantiated at all.
Fixes: #91012.
Fixes: #88096.
Release note (bug fix): CockroachDB previously could incorrectly
evaluate queries that performed left semi and left anti "virtual lookup"
joins on tables in
pg_catalog
orinformation_schema
. These join typescan be planned when a subquery is used inside of a filter condition. The
bug was introduced in 22.1.0 and is now fixed.
Release justification: bug fix.