-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
opt: inverted-index accelerate filters of the form j->0 @> '{"b": "c"} #96202
Conversation
d73363c
to
a7b8179
Compare
a7b8179
to
b3e0323
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was quick! 🙂
Reviewed 4 of 4 files at r1, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @Shivs11)
-- commits
line 20 at r1:
nit: also mention <@ (You've added support for both "contains" and "contained by")
pkg/sql/logictest/testdata/logic_test/inverted_index
line 846 at r1 (raw file):
SELECT j FROM f@i WHERE j->0->1 @> '{"a": {"b": []}}' ---- [[{"a": {"b": []}}]]
This doesn't look right, since there is no element at j->0->1
. Maybe you need to mark this expression as not tight?
pkg/sql/opt/xform/testdata/rules/select
line 2928 at r1 (raw file):
# Generate an inverted scan when the index of the fetch val operator # is an integer and there is a containment operator with a JSON object
nit: end comment with period (here and below)
pkg/sql/opt/xform/testdata/rules/select
line 2972 at r1 (raw file):
# Generate an inverted scan when right side of the equality is an array. opt expect=GenerateInvertedIndexScans SELECT k FROM b WHERE j->'a' = '["b"]'
Did you mean to get rid of this test case and the one below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the quick review as well!
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @rytaft)
pkg/sql/logictest/testdata/logic_test/inverted_index
line 846 at r1 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
This doesn't look right, since there is no element at
j->0->1
. Maybe you need to mark this expression as not tight?
Had a quick question about this - what does it mean by having no element at j->0->1
? In my understanding, the answer given by the query, [[{"a": {"b": []}}]]
, satisfies the filter of j->0->1 @> '{"a": {"b": []}}'
right?
pkg/sql/opt/xform/testdata/rules/select
line 2972 at r1 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Did you mean to get rid of this test case and the one below?
I thought these test cases would be testing filters of the form: j->0 @> '"a"' right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @rytaft)
pkg/sql/opt/xform/testdata/rules/select
line 2972 at r1 (raw file):
Previously, Shivs11 (Shivam) wrote…
I thought these test cases would be testing filters of the form: j->0 @> '"a"' right?
Ah, I seem to have misunderstood your question. I noticed that it has been removed, due to some reason.
Shall add them back since I think they are pretty important cases to consider. Thank you for the catch!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @Shivs11)
pkg/sql/logictest/testdata/logic_test/inverted_index
line 846 at r1 (raw file):
Previously, Shivs11 (Shivam) wrote…
Had a quick question about this - what does it mean by having no element at
j->0->1
? In my understanding, the answer given by the query,[[{"a": {"b": []}}]]
, satisfies the filter ofj->0->1 @> '{"a": {"b": []}}'
right?
I think it satisfies j->0->0 @> '{"a": {"b": []}}'
pkg/sql/opt/xform/testdata/rules/select
line 2972 at r1 (raw file):
Previously, Shivs11 (Shivam) wrote…
I thought these test cases would be testing filters of the form: j->0 @> '"a"' right?
The new cases you added are good, but you removed some old ones (this one was j->'a' = '["b"]'
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @rytaft)
pkg/sql/logictest/testdata/logic_test/inverted_index
line 846 at r1 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
I think it satisfies
j->0->0 @> '{"a": {"b": []}}'
Oh yeah, my apologies it is indeed j->0->0
and not j->0->1
. So in an ideal setting, the output of this query should be no rows right? The fact that this outputs an invalid row is concerning....
pkg/sql/opt/xform/testdata/rules/select
line 2928 at r1 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
nit: end comment with period (here and below)
Done!
pkg/sql/opt/xform/testdata/rules/select
line 2972 at r1 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
The new cases you added are good, but you removed some old ones (this one was
j->'a' = '["b"]'
)
Yep, I see the issue now. I have added them back and shall be pushing them right now. Thank you and sorry!
b3e0323
to
208fba2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @rytaft)
Previously, rytaft (Rebecca Taft) wrote…
nit: also mention <@ (You've added support for both "contains" and "contained by")
Done!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @rytaft)
a discussion (no related file):
I just realized that I have forgotten to add code which shall set the generated inverted_expr
to tight if keys are of the form of DInt
. Thank you @rytaft for giving me that hint when a test case failed in one of my earlier submissions. Shall be adding code to now account for that!
208fba2
to
7ce0bca
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just one minor nit below. Nice work!
Reviewed 4 of 4 files at r3, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @mgartner, @rytaft, and @Shivs11)
pkg/sql/logictest/testdata/logic_test/inverted_index
line 846 at r1 (raw file):
Previously, Shivs11 (Shivam) wrote…
Oh yeah, my apologies it is indeed
j->0->0
and notj->0->1
. So in an ideal setting, the output of this query should be no rows right? The fact that this outputs an invalid row is concerning....
Looks correct now
pkg/sql/opt/invertedidx/json_array_test.go
line 603 at r3 (raw file):
tight: false, unique: true, remainingFilters: `j->0 @> '[1,2]'`,
This is a duplicate test case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @rytaft)
pkg/sql/opt/invertedidx/json_array_test.go
line 603 at r3 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
This is a duplicate test case
Removed it in the latest push! Thank you!
7ce0bca
to
6a3c3e3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r4, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 4 files at r1, 3 of 4 files at r3, 1 of 1 files at r4, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @Shivs11)
pkg/sql/opt/invertedidx/json_array_test.go
line 566 at r4 (raw file):
}, { filters: "j->0 @> '[1,2]'",
nit: A single element array might be a good test, like `j->0 @> '[1]'.
bors r+ |
Build failed (retrying...): |
bors r- See failures in Essential CI. |
Canceled. |
Previously, the optimizer did not plan inverted index scans for filters having an integer as the index for the fetch value in a filter alongside the "contains" or the "contained by" operator. To address this, we now build JSON arrays from fetch value expressions with integer indexes. From these JSON arrays, inverted spans are built for constraining scans over inverted indexes. With these changes chains of both integer and string fetch value operators are now supported alongside the "contains" and the "contained by" operators. (e.g., j->0 @> '{"b": "c"}' and j->0 <@ '{"b": "c"}'). Epic: CRDB-3301 Fixes: cockroachdb#94667 Release note (performance improvement): The optimizer now plans inverted index scans for queries that filter by JSON fetch value operators (->) with integer indices alongside the "contains" or the "contained by" operators, e.g, json_col->0 @> '{"b": "c"}' or json_col->0 <@ '{"b": "c"}'
6a3c3e3
to
3ed8616
Compare
bors r+ |
95622: backupccl,storage: add logs around manifest handling and ExportRequest pagination r=stevendanna a=adityamaru backupccl: add logging to backup manifest handling Release note: None storage: log the ExportRequest pagination reason Release note: None Epic: None 95865: cmd/roachtest: adapt disk-stall detection roachtest r=nicktrav,erikgrinaker a=jbowens Move the existing disk-stall/* roachtests under disk-stall/fuse/* (for the FUSE filesystem approach to stalling) and skip them for now. Currently, they're not capable of stalling the disk longer 50us (see #95886), which makes them unreliable at exercising stalls. Add two new roachtests, disk-stall/dmsetup and disk-stall/cgroup that use dmsetup and cgroup bandwidth restrctions respectively to reliably induce a write stall for an indefinite duration. Informs #94373. Epic: None Release note: None 95999: multitenant: add multitenant/shared-process/basic roachtest r=stevendanna a=msbutler This patch introduces a simple roachtest that runs in a shared-process tenant. This test imports a 500 tpcc workload (about 30 GB of replicated data), and runs the workload for 10 minutes. The test is run on a 4 node, 4vcpu cluster with local ssds. A future patch could complicate the test by running schema changes or other bulk operations. Fixes #95990 Release note: None 96115: schemachanger: Implement `DROP CONSTRAINT` in declarative schema changer r=Xiang-Gu a=Xiang-Gu This PR implements `ALTER TABLE t DROP CONSTRAINT cons_name` in declarative schema changer. Supported constraints include Checks, FK, and UniqueWithoutIndex. Dropping PK or Unique constraints will fall back to legacy schema changer, which in turn spits out an "not supported yet" error. Epic: None 96202: opt: inverted-index accelerate filters of the form j->0 @> '{"b": "c"} r=Shivs11 a=Shivs11 Previously, the optimizer did not plan inverted index scans for filters having an integer as the index for the fetch value in a filter alongside the "contains" or the "contained by" operator. To address this, we now build JSON arrays from fetch value expressions with integer indexes. From these JSON arrays, inverted spans are built for constraining scans over inverted indexes. With these changes chains of both integer and string fetch value operators are now supported alongside the "contains" and the "contained by" operators. (e.g., j->0 `@>` '{"b": "c"}' and j->0 <@ '{"b": "c"}'). Epic: [CRDB-3301](https://cockroachlabs.atlassian.net/browse/CRDB-3301) Fixes: #94667 Release note (performance improvement): The optimizer now plans inverted index scans for queries that filter by JSON fetch value operators (->) with integer indices alongside the "contains" or the "contained by" operators, e.g, json_col->0 `@>` '{"b": "c"}' or json_col->0 <@ '{"b": "c"}' 96235: sem/tree: add support for producing vectorized data from strings r=cucaroach a=cucaroach tree.ValueHandler exposes raw machine type hooks that are used by vec_handler to build coldata.Vec's. Epic: CRDB-18892 Informs: #91831 Release note: None 96328: udf: allow strict UDF with no arguments r=DrewKimball a=DrewKimball This patch fixes the case when a strict UDF (returns null on null input) has no arguments. Previously, attempting to call such a function would result in `ERROR: reflect: call of reflect.Value.Pointer on zero Value`. Fixes #96326 Release note: None 96366: release: skip nil GitHub events r=celiala a=rail Previously, we referenced `*event.Event`, but in some cases the event objects are `nil`. This PR skips the nil GitHub event objects. Epic: none Release note: None Co-authored-by: adityamaru <[email protected]> Co-authored-by: Jackson Owens <[email protected]> Co-authored-by: Michael Butler <[email protected]> Co-authored-by: Xiang Gu <[email protected]> Co-authored-by: Shivam Saraf <[email protected]> Co-authored-by: Tommy Reilly <[email protected]> Co-authored-by: Drew Kimball <[email protected]> Co-authored-by: Rail Aliiev <[email protected]>
Build failed (retrying...): |
Build succeeded: |
Previously, the optimizer did not plan inverted index scans for filters
having an integer as the index for the fetch value in a filter alongside
the "contains" or the "contained by" operator.
To address this, we now build JSON arrays from fetch value expressions
with integer indexes. From these JSON arrays, inverted spans are built
for constraining scans over inverted indexes. With these changes chains
of both integer and string fetch value operators are now supported
alongside the "contains" and the "contained by" operators.
(e.g., j->0 @> '{"b": "c"}' and j->0 <@ '{"b": "c"}').
Epic: CRDB-3301
Fixes: #94667
Release note (performance improvement): The optimizer now plans
inverted index scans for queries that filter by JSON fetch value
operators (->) with integer indices alongside the "contains" or
the "contained by" operators, e.g, json_col->0 @> '{"b": "c"}'
or json_col->0 <@ '{"b": "c"}'