-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
colexec: add support for Pow binary operator #50143
colexec: add support for Pow binary operator #50143
Conversation
Thank you for contributing to CockroachDB. Please ensure you have followed the guidelines for creating a PR. My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI. I have added a few people who may be able to assist in reviewing: 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this! I have some suggestions and some nits as well.
Reviewed 3 of 3 files at r1.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @Arun4rangan)
pkg/sql/colexec/execgen/supported_bin_cmp_ops.go, line 25 at r1 (raw file):
tree.Mult: "Mult", tree.Div: "Div", tree.Pow: "Pow",
nit: we're trying to keep this list in the same order as binary operators are defined in tree/expr.go
, so please put it between Mod
and Concat
.
pkg/sql/colexec/execgen/cmd/execgen/overloads_bin.go, line 141 at r1 (raw file):
// Simple arithmetic binary operators. for _, binOp := range []tree.BinaryOperator{tree.Plus, tree.Minus, tree.Mult, tree.Div, tree.Pow} {
nit: please add it in "Other arithmetic binary operators" section below. That section already has the same input/output types set up.
pkg/sql/colexec/execgen/cmd/execgen/overloads_bin.go, line 325 at r1 (raw file):
case tree.Mod: computeBinOp = fmt.Sprintf("%s(float64(%s), float64(%s))", binaryOpFloatMethod[binOp], leftElem, rightElem) case tree.Pow:
nit: since tree.Mod
and tree.Pow
cases have the same function, you could combine them, i.e.
case tree.Mod, tree.Pow:
computeBinOp ...
pkg/sql/colexec/execgen/cmd/execgen/overloads_bin.go, line 395 at r1 (raw file):
`)) case tree.Mult, tree.Pow:
I think we should introduce a separate case for tree.Pow
here.
pkg/sql/colexec/execgen/cmd/execgen/overloads_bin.go, line 442 at r1 (raw file):
t = template.Must(template.New("").Parse(` { result := Float64({{.Left}}) ** Float64({{.Right}})
Hm, I'm suprised it compiles (does it?).
Please take a look at IntPow
method of tree/eval.go
, we should follow what that method is doing with one difference being that instead of creating new apd.Decimal
s using apd.New
we can reuse already allocated temporary decimals from the overload helper. Take a look at tree.Div
case for an example.
pkg/sql/logictest/testdata/logic_test/vectorize_overloads, line 129 at r1 (raw file):
query T EXPLAIN (VEC) SELECT _float ** _float FROM many_types
We should also add a test or two that actually runs the query (i.e. without EXPLAIN (VEC)
).
0de3124
to
5205978
Compare
Thank you for updating your pull request. My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
a4e996d
to
a62c02a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 3 files at r2.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @Arun4rangan and @yuzefovich)
pkg/sql/colexec/execgen/cmd/execgen/overloads_bin.go, line 485 at r2 (raw file):
boundType := "{{.boundType}}" if boundType == "Int64" { return resultInt
Note that this code snippet is not a standalone function, rather it is inlined within another function in a loop, take a look at proj_const_left_ops.eg.go
file for an example of how generated code looks like. And because of this, we should not have return
statements in the templated code block.
b530e5d
to
8270903
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an issue with integer types that we need to address, but this is looking very good!
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Arun4rangan and @yuzefovich)
pkg/sql/colexec/execgen/cmd/execgen/overloads_bin.go, line 180 at r3 (raw file):
// Other arithmetic binary operators. for _, binOp := range []tree.BinaryOperator{tree.FloorDiv, tree.Mod, tree.Pow} {
As I commented below, we should be returning types.Int
for integers of all possible widths, so my earlier comment was incorrect - we should add another loop below this one that populates the output types as follows:
float, float -> types.Float
decimal, decimal -> types.Decimal
some_width_integer, possibly_another_width_integer -> types.Int
pkg/sql/colexec/execgen/cmd/execgen/overloads_bin.go, line 335 at r3 (raw file):
} buf := strings.Builder{}
nit: no need for this newline as well as for the one below.
pkg/sql/colexec/execgen/cmd/execgen/overloads_bin.go, line 483 at r3 (raw file):
colexecerror.ExpectedError(tree.ErrIntOutOfRange) } boundType := "{{.boundType}}"
In our old, row-by-row engine, there is no additional overflow check after checking whether an error is returned by Int64()
function above, so we should not introduce such check either. Here is some more info:
[email protected]:56750/defaultdb> create table t(i int2);
CREATE TABLE
Time: 2.162ms
[email protected]:56750/defaultdb> insert into t values (32767);
INSERT 1
Time: 3.379ms
[email protected]:56750/defaultdb> select i^2 from t;
?column?
--------------
1073676289
(1 row)
Time: 494µs
[email protected]:56750/defaultdb> select pg_typeof(i^2) from t;
pg_typeof
-------------
bigint
(1 row)
Time: 564µs
As we can see, we always return bigint
(which is basically int64
).
So we need to write a template that operates on integers of all three possible width, yet always return int64
. Then we can eliminate boundType
and convType
parameters from the map as well as this additional overflow check.
8270903
to
ea21f88
Compare
Thanks for the feedback, it's really helpful. I have another question:
Wouldn't we have a case int ^ (-int) result in decimal result? So shouldn't we result in Also regarding intPow in row-by-row engine : isn't there an overflow error returned from intPow function (https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/sem/tree/eval.go#L5018-L5030) |
ea21f88
to
f6ab386
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have last several nits, but it is pretty much ready to go.
Wouldn't we have a case int ^ (-int) result in decimal result?
2^-2 = 0.25
Yes, you're right, in general. Postgres returns double precision
for Pow
operation (i.e. float64
), but our row-by-row engine currently returns types.Int
(i.e. INT8
), and whenever we get a non-integer result, then we will get an error in tree/eval.go:5025
which we will "convert" into ErrIntOutOfRange
. So in the vectorized engine we should return the error as well.
If it's something that you might interested looking into, you could fix this deviation from Postgres and update row-by-row engine, then we will be able to update the vectorized engine as well (currently we're trying to make both engines behave exactly the same, even if the old one behaves incorrectly). I'll file an issue if we don't have already.
Also regarding intPow in row-by-row engine : isn't there an overflow error returned from intPow function
Yes, there is, but the overflow "check" is done by Int64
method of apd.Decimal
object - if an overflow occurs, then we will return ErrIntOutOfRange
error as I described above. However, we do not return an error if the original type was INT2
and we got an integer larger than MaxInt16
because the result is INT8
, so there is no overflow. I hope this makes sense.
Reviewed 1 of 3 files at r2, 1 of 1 files at r4.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @Arun4rangan)
pkg/sql/colexec/execgen/cmd/execgen/overloads_bin.go, line 191 at r4 (raw file):
} // Pow arithmetic binary operators.
nit: s/operators/operator/
.
pkg/sql/colexec/execgen/cmd/execgen/overloads_bin.go, line 197 at r4 (raw file):
binOpOutputTypes[binOp][typePair{types.DecimalFamily, anyWidth, types.DecimalFamily, anyWidth}] = types.Decimal for _, intWidth := range supportedWidthsByCanonicalTypeFamily[types.IntFamily] { for _, intWidth2 := range supportedWidthsByCanonicalTypeFamily[types.IntFamily] {
nit: "intWidth2" is somewhat unfortunate name, maybe something like "otherIntWidth" would be better?
pkg/sql/logictest/testdata/logic_test/vectorize_overloads, line 137 at r4 (raw file):
query T rowsort SELECT (_float^_float)::STRING FROM many_types
nit: why do we need ::STRING
part? I don't think it is necessary.
f6ab386
to
beceb46
Compare
Thanks for the insight, I would love to take a look! Yes, that does makes sense. So I left off only the error check we do with Also I have been converting tests from
However, if i don't convert the int to a string type I get the following error:
Not sure if it's something to do with the logic_test verification problem or something to do with |
beceb46
to
d09e7a9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the insight, I would love to take a look!
I asked internally about that issue, and we might want to keep INT ^ INT
returning an INT
as well for performance reasons. If we decide that the result should be DECIMAL
, I'll open an issue and cc you on it.
Try this diff to resolve the internal error:
diff --git a/pkg/sql/colexec/execplan.go b/pkg/sql/colexec/execplan.go
index 37ad161b95..0b4571cc52 100644
--- a/pkg/sql/colexec/execplan.go
+++ b/pkg/sql/colexec/execplan.go
@@ -1844,15 +1844,16 @@ func planProjectionExpr(
// actualOutputType tracks the logical type of the output column of the
// projection operator. See the comment below for more details.
actualOutputType := outputType
- if outputType.Identical(types.Int) {
+ if outputType.Identical(types.Int) && projOp != tree.Pow {
// Currently, SQL type system does not respect the width of integers
// when figuring out the type of the output of a projection expression
// (for example, INT2 + INT2 will be typed as INT8); however,
// vectorized operators do respect the width when both operands have
- // the same width. In order to go around this limitation, we explicitly
- // check whether output type is INT8, and if so, we override the output
- // physical types to be what the vectorized projection operators will
- // actually output.
+ // the same width (the only exception to this rule currently is Pow
+ // binary operator). In order to go around this limitation, we
+ // explicitly check whether output type is INT8, and if so, we override
+ // the output physical types to be what the vectorized projection
+ // operators will actually output.
//
// Note that in mixed-width scenarios (i.e. INT2 + INT4) the vectorized
// engine will output INT8, so no overriding is needed.
The explanation is that even though in registerBinOpOutputTypes
we set the result of INT2 ^ INT2
to be INT8
, we have the custom logic in execplan.go
that expects that INT2
operating with another INT2
returns INT2
and plans a cast to adapt to the row engine (which expects INT8
). I know this is quite confusing, so if it doesn't make sense, don't worry :)
Reviewed 2 of 2 files at r5.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @Arun4rangan)
pkg/sql/logictest/testdata/logic_test/vectorize_overloads, line 335 at r5 (raw file):
statement error integer out of range SELECT (_int2^_int2)::STRING FROM many_types
nit: ::STRING
is not necessary.
Actually, the problem with |
That's awesome! Glad I could be of helped. I will wait for your PR to be merged before moving forward with this PR. |
d09e7a9
to
eda1167
Compare
@yuzefovich I have rebased the branch with the master with your int64 changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! One last nit from me, but the code
Reviewed 2 of 2 files at r6.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @Arun4rangan)
pkg/sql/colexec/execgen/cmd/execgen/overloads_bin.go, line 173 at r6 (raw file):
// Pow arithmetic binary operator. for _, binOp := range []tree.BinaryOperator{tree.Pow} {
nit: I think with the change to int64
you can again simply include tree.Pow
in Other arithmetic binary operators
section along with tree.FloorDiv
and tree.Mod
since the output types are exactly the same for all three operations. Sorry about going back and forth on this.
This commit resolves: cockroachdb#49465 Previously, there was no support for Pow (^) in vectorized engine. This commit adds Pow operator. I added Pow to binaryOpMethods and registered inputs/outputs for this operator. Release note (sql change): Vectorized engine now support Pow operator.
eda1167
to
9db8298
Compare
Done! @yuzefovich thanks again for reviews! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution!
bors r=yuzefovich
Reviewed 1 of 1 files at r7.
Reviewable status:complete! 0 of 0 LGTMs obtained (and 1 stale)
Build succeeded |
This commit resolves: #49465
Previously, there was no support for Pow (**) in vectorized engine.
This commit adds Pow operator. I added Pow to binaryOpMethods
and registered inputs/outputs for this operator.
Release note (sql change): Vectorized engine now support Pow operator.