Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

colexec: add support for Pow binary operator #50143

Merged

Conversation

sooryranga
Copy link
Contributor

This commit resolves: #49465

Previously, there was no support for Pow (**) in vectorized engine.
This commit adds Pow operator. I added Pow to binaryOpMethods
and registered inputs/outputs for this operator.

Release note (sql change): Vectorized engine now support Pow operator.

@blathers-crl
Copy link

blathers-crl bot commented Jun 12, 2020

Thank you for contributing to CockroachDB. Please ensure you have followed the guidelines for creating a PR.

My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI.

I have added a few people who may be able to assist in reviewing:

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@blathers-crl blathers-crl bot added O-community Originated from the community X-blathers-triaged blathers was able to find an owner labels Jun 12, 2020
@blathers-crl blathers-crl bot requested a review from yuzefovich June 12, 2020 13:03
@sooryranga sooryranga marked this pull request as draft June 12, 2020 13:04
Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this! I have some suggestions and some nits as well.

Reviewed 3 of 3 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @Arun4rangan)


pkg/sql/colexec/execgen/supported_bin_cmp_ops.go, line 25 at r1 (raw file):

	tree.Mult:         "Mult",
	tree.Div:          "Div",
	tree.Pow:          "Pow",

nit: we're trying to keep this list in the same order as binary operators are defined in tree/expr.go, so please put it between Mod and Concat.


pkg/sql/colexec/execgen/cmd/execgen/overloads_bin.go, line 141 at r1 (raw file):

	// Simple arithmetic binary operators.
	for _, binOp := range []tree.BinaryOperator{tree.Plus, tree.Minus, tree.Mult, tree.Div, tree.Pow} {

nit: please add it in "Other arithmetic binary operators" section below. That section already has the same input/output types set up.


pkg/sql/colexec/execgen/cmd/execgen/overloads_bin.go, line 325 at r1 (raw file):

		case tree.Mod:
			computeBinOp = fmt.Sprintf("%s(float64(%s), float64(%s))", binaryOpFloatMethod[binOp], leftElem, rightElem)
		case tree.Pow:

nit: since tree.Mod and tree.Pow cases have the same function, you could combine them, i.e.

case tree.Mod, tree.Pow:
  computeBinOp ...

pkg/sql/colexec/execgen/cmd/execgen/overloads_bin.go, line 395 at r1 (raw file):

			`))

		case tree.Mult, tree.Pow:

I think we should introduce a separate case for tree.Pow here.


pkg/sql/colexec/execgen/cmd/execgen/overloads_bin.go, line 442 at r1 (raw file):

				t = template.Must(template.New("").Parse(`
					{
						result := Float64({{.Left}}) ** Float64({{.Right}})

Hm, I'm suprised it compiles (does it?).

Please take a look at IntPow method of tree/eval.go, we should follow what that method is doing with one difference being that instead of creating new apd.Decimals using apd.New we can reuse already allocated temporary decimals from the overload helper. Take a look at tree.Div case for an example.


pkg/sql/logictest/testdata/logic_test/vectorize_overloads, line 129 at r1 (raw file):


query T
EXPLAIN (VEC) SELECT _float ** _float FROM many_types

We should also add a test or two that actually runs the query (i.e. without EXPLAIN (VEC)).

@sooryranga sooryranga force-pushed the arun/add_support_for_pow_binary_operator branch from 0de3124 to 5205978 Compare June 13, 2020 12:51
@blathers-crl
Copy link

blathers-crl bot commented Jun 13, 2020

Thank you for updating your pull request.

My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@sooryranga sooryranga force-pushed the arun/add_support_for_pow_binary_operator branch 11 times, most recently from a4e996d to a62c02a Compare June 15, 2020 02:16
Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 3 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @Arun4rangan and @yuzefovich)


pkg/sql/colexec/execgen/cmd/execgen/overloads_bin.go, line 485 at r2 (raw file):

				boundType :=  "{{.boundType}}"
				if boundType == "Int64" {
					return resultInt

Note that this code snippet is not a standalone function, rather it is inlined within another function in a loop, take a look at proj_const_left_ops.eg.go file for an example of how generated code looks like. And because of this, we should not have return statements in the templated code block.

@sooryranga sooryranga force-pushed the arun/add_support_for_pow_binary_operator branch 2 times, most recently from b530e5d to 8270903 Compare June 16, 2020 01:25
Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an issue with integer types that we need to address, but this is looking very good!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @Arun4rangan and @yuzefovich)


pkg/sql/colexec/execgen/cmd/execgen/overloads_bin.go, line 180 at r3 (raw file):

	// Other arithmetic binary operators.
	for _, binOp := range []tree.BinaryOperator{tree.FloorDiv, tree.Mod, tree.Pow} {

As I commented below, we should be returning types.Int for integers of all possible widths, so my earlier comment was incorrect - we should add another loop below this one that populates the output types as follows:

float, float -> types.Float
decimal, decimal -> types.Decimal
some_width_integer, possibly_another_width_integer -> types.Int

pkg/sql/colexec/execgen/cmd/execgen/overloads_bin.go, line 335 at r3 (raw file):

		}
		buf := strings.Builder{}

nit: no need for this newline as well as for the one below.


pkg/sql/colexec/execgen/cmd/execgen/overloads_bin.go, line 483 at r3 (raw file):

					colexecerror.ExpectedError(tree.ErrIntOutOfRange)
				}
				boundType :=  "{{.boundType}}"

In our old, row-by-row engine, there is no additional overflow check after checking whether an error is returned by Int64() function above, so we should not introduce such check either. Here is some more info:

[email protected]:56750/defaultdb> create table t(i int2);
CREATE TABLE

Time: 2.162ms

[email protected]:56750/defaultdb> insert into t values (32767);
INSERT 1

Time: 3.379ms

[email protected]:56750/defaultdb> select i^2 from t;
   ?column?
--------------
  1073676289
(1 row)

Time: 494µs

[email protected]:56750/defaultdb> select pg_typeof(i^2) from t;
  pg_typeof
-------------
  bigint
(1 row)

Time: 564µs

As we can see, we always return bigint (which is basically int64).

So we need to write a template that operates on integers of all three possible width, yet always return int64. Then we can eliminate boundType and convType parameters from the map as well as this additional overflow check.

@sooryranga sooryranga force-pushed the arun/add_support_for_pow_binary_operator branch from 8270903 to ea21f88 Compare June 16, 2020 03:31
@sooryranga
Copy link
Contributor Author

sooryranga commented Jun 16, 2020

Thanks for the feedback, it's really helpful.

I have another question:

float, float -> types.Float
decimal, decimal -> types.Decimal
some_width_integer, possibly_another_width_integer -> types.Int

Wouldn't we have a case int ^ (-int) result in decimal result?
2^-2 = 0.25

So shouldn't we result in some_width_integer, possibly_another_width_integer -> types.Decimal ?

Also regarding intPow in row-by-row engine : isn't there an overflow error returned from intPow function (https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/sem/tree/eval.go#L5018-L5030)

@sooryranga sooryranga force-pushed the arun/add_support_for_pow_binary_operator branch from ea21f88 to f6ab386 Compare June 16, 2020 03:56
Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have last several nits, but it is pretty much ready to go.

Wouldn't we have a case int ^ (-int) result in decimal result? 2^-2 = 0.25

Yes, you're right, in general. Postgres returns double precision for Pow operation (i.e. float64), but our row-by-row engine currently returns types.Int (i.e. INT8), and whenever we get a non-integer result, then we will get an error in tree/eval.go:5025 which we will "convert" into ErrIntOutOfRange. So in the vectorized engine we should return the error as well.

If it's something that you might interested looking into, you could fix this deviation from Postgres and update row-by-row engine, then we will be able to update the vectorized engine as well (currently we're trying to make both engines behave exactly the same, even if the old one behaves incorrectly). I'll file an issue if we don't have already.

Also regarding intPow in row-by-row engine : isn't there an overflow error returned from intPow function

Yes, there is, but the overflow "check" is done by Int64 method of apd.Decimal object - if an overflow occurs, then we will return ErrIntOutOfRange error as I described above. However, we do not return an error if the original type was INT2 and we got an integer larger than MaxInt16 because the result is INT8, so there is no overflow. I hope this makes sense.

Reviewed 1 of 3 files at r2, 1 of 1 files at r4.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @Arun4rangan)


pkg/sql/colexec/execgen/cmd/execgen/overloads_bin.go, line 191 at r4 (raw file):

	}

	// Pow arithmetic binary operators.

nit: s/operators/operator/.


pkg/sql/colexec/execgen/cmd/execgen/overloads_bin.go, line 197 at r4 (raw file):

		binOpOutputTypes[binOp][typePair{types.DecimalFamily, anyWidth, types.DecimalFamily, anyWidth}] = types.Decimal
		for _, intWidth := range supportedWidthsByCanonicalTypeFamily[types.IntFamily] {
			for _, intWidth2 := range supportedWidthsByCanonicalTypeFamily[types.IntFamily] {

nit: "intWidth2" is somewhat unfortunate name, maybe something like "otherIntWidth" would be better?


pkg/sql/logictest/testdata/logic_test/vectorize_overloads, line 137 at r4 (raw file):


query T rowsort
SELECT (_float^_float)::STRING FROM many_types

nit: why do we need ::STRING part? I don't think it is necessary.

@sooryranga sooryranga force-pushed the arun/add_support_for_pow_binary_operator branch from f6ab386 to beceb46 Compare June 16, 2020 12:53
@sooryranga
Copy link
Contributor Author

Thanks for the insight, I would love to take a look!

Yes, that does makes sense. So I left off only the error check we do with adp.Int64() for now.

Also I have been converting tests from query T (string type) to query I (Integer type): I ran into the following problem:
When I try the following, it works as expected.

query T rowsort
SELECT (_int2^_int2)::STRING FROM many_types WHERE _int2 < 10
----
4

However, if i don't convert the int to a string type I get the following error:

query I rowsort
SELECT _int2^_int2 FROM many_types WHERE _int2 < 10
----
4

internal error: unexpected error from the vectorized engine: interface conversion: coldata.Column is []int16, not []int64

Not sure if it's something to do with the logic_test verification problem or something to do with overloads_bin.go problem. I saw #38937 , I was hoping you could give me more insight into this issue.

@sooryranga sooryranga force-pushed the arun/add_support_for_pow_binary_operator branch from beceb46 to d09e7a9 Compare June 16, 2020 13:03
Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the insight, I would love to take a look!

I asked internally about that issue, and we might want to keep INT ^ INT returning an INT as well for performance reasons. If we decide that the result should be DECIMAL, I'll open an issue and cc you on it.

Try this diff to resolve the internal error:

diff --git a/pkg/sql/colexec/execplan.go b/pkg/sql/colexec/execplan.go
index 37ad161b95..0b4571cc52 100644
--- a/pkg/sql/colexec/execplan.go
+++ b/pkg/sql/colexec/execplan.go
@@ -1844,15 +1844,16 @@ func planProjectionExpr(
        // actualOutputType tracks the logical type of the output column of the
        // projection operator. See the comment below for more details.
        actualOutputType := outputType
-       if outputType.Identical(types.Int) {
+       if outputType.Identical(types.Int) && projOp != tree.Pow {
                // Currently, SQL type system does not respect the width of integers
                // when figuring out the type of the output of a projection expression
                // (for example, INT2 + INT2 will be typed as INT8); however,
                // vectorized operators do respect the width when both operands have
-               // the same width. In order to go around this limitation, we explicitly
-               // check whether output type is INT8, and if so, we override the output
-               // physical types to be what the vectorized projection operators will
-               // actually output.
+               // the same width (the only exception to this rule currently is Pow
+               // binary operator). In order to go around this limitation, we
+               // explicitly check whether output type is INT8, and if so, we override
+               // the output physical types to be what the vectorized projection
+               // operators will actually output.
                //
                // Note that in mixed-width scenarios (i.e. INT2 + INT4) the vectorized
                // engine will output INT8, so no overriding is needed.

The explanation is that even though in registerBinOpOutputTypes we set the result of INT2 ^ INT2 to be INT8, we have the custom logic in execplan.go that expects that INT2 operating with another INT2 returns INT2 and plans a cast to adapt to the row engine (which expects INT8). I know this is quite confusing, so if it doesn't make sense, don't worry :)

Reviewed 2 of 2 files at r5.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @Arun4rangan)


pkg/sql/logictest/testdata/logic_test/vectorize_overloads, line 335 at r5 (raw file):


statement error integer out of range
SELECT (_int2^_int2)::STRING FROM many_types

nit: ::STRING is not necessary.

@yuzefovich
Copy link
Member

Actually, the problem with internal error: unexpected error from the vectorized engine: interface conversion: coldata.Column is []int16, not []int64 that you ran into prompted me to think more about this integer width type mismatch, so I opened a PR (#50300) which will unify the old and the new execution engines. Once that PR is merged, then you would rebase and everything should just work as is. Thanks for bringing this to my attention!

@sooryranga
Copy link
Contributor Author

sooryranga commented Jun 16, 2020

That's awesome! Glad I could be of helped. I will wait for your PR to be merged before moving forward with this PR.

@sooryranga sooryranga force-pushed the arun/add_support_for_pow_binary_operator branch from d09e7a9 to eda1167 Compare June 18, 2020 13:51
@sooryranga sooryranga marked this pull request as ready for review June 18, 2020 13:52
@sooryranga
Copy link
Contributor Author

@yuzefovich I have rebased the branch with the master with your int64 changes.

Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! One last nit from me, but the code :lgtm:

Reviewed 2 of 2 files at r6.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @Arun4rangan)


pkg/sql/colexec/execgen/cmd/execgen/overloads_bin.go, line 173 at r6 (raw file):

	// Pow arithmetic binary operator.
	for _, binOp := range []tree.BinaryOperator{tree.Pow} {

nit: I think with the change to int64 you can again simply include tree.Pow in Other arithmetic binary operators section along with tree.FloorDiv and tree.Mod since the output types are exactly the same for all three operations. Sorry about going back and forth on this.

This commit resolves: cockroachdb#49465

Previously, there was no support for Pow (^) in vectorized engine.
This commit adds Pow operator. I added Pow to binaryOpMethods
and registered inputs/outputs for this operator.

Release note (sql change): Vectorized engine now support Pow operator.
@sooryranga sooryranga force-pushed the arun/add_support_for_pow_binary_operator branch from eda1167 to 9db8298 Compare June 18, 2020 14:30
@sooryranga
Copy link
Contributor Author

Done!

@yuzefovich thanks again for reviews!

Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution!

bors r=yuzefovich

Reviewed 1 of 1 files at r7.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)

@craig
Copy link
Contributor

craig bot commented Jun 18, 2020

Build succeeded

@craig craig bot merged commit 78f0548 into cockroachdb:master Jun 18, 2020
@sooryranga sooryranga deleted the arun/add_support_for_pow_binary_operator branch August 4, 2023 00:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-community Originated from the community X-blathers-triaged blathers was able to find an owner
Projects
None yet
Development

Successfully merging this pull request may close these issues.

colexec: add support for Pow binary operator
3 participants