Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

exec: overflow handling for aggregates #38775

Closed
jordanlewis opened this issue Jul 9, 2019 · 5 comments · Fixed by #38967
Closed

exec: overflow handling for aggregates #38775

jordanlewis opened this issue Jul 9, 2019 · 5 comments · Fixed by #38967
Assignees
Labels
A-sql-vec SQL vectorized engine C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.

Comments

@jordanlewis
Copy link
Member

jordanlewis commented Jul 9, 2019

Aggregates in vectorized currently don't check for overflows. This is incorrect and needs to be fixed.

@rafiss rafiss added A-sql-vec SQL vectorized engine C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. labels Jul 11, 2019
@rafiss
Copy link
Collaborator

rafiss commented Jul 18, 2019

I'm wondering what we want to do with float overflow. Our current non-vectorized implementation already differs from Postgres.

Postgres:

rafiss@127:postgres> create table foo (b double precision primary key, c int);
CREATE TABLE

rafiss@127:postgres> insert into foo values(1.79769313486231570814527423737043567981e+308);
INSERT 0 1

rafiss@127:postgres> insert into foo values(1.79769313486231570814527423737043567981e+307);
INSERT 0 1

rafiss@127:postgres> select * from foo;
+-----------------------+--------+
| b                     | c      |
|-----------------------+--------|
| inf                   | <null> |
| 1.79769313486232e+307 | <null> |
+-----------------------+--------+
SELECT 2

rafiss@127:postgres> select avg(b) from foo;
2019-07-18 11:38:26.009 EDT [85826] ERROR:  value out of range: overflow
2019-07-18 11:38:26.009 EDT [85826] STATEMENT:  select avg(b) from foo
value out of range: overflow

CRDB 19.1

[email protected]:64470/defaultdb> create table foo (b float primary key);
CREATE TABLE

[email protected]:64561/defaultdb> insert into foo values(1.79769313486231570814527423737043567981e+308);
INSERT 1

[email protected]:64561/defaultdb> insert into foo values(1.79769313486231570814527423737043567981e+307);
INSERT 1

[email protected]:64561/defaultdb> select * from foo;
             b
+-------------------------+
  1.7976931348623158e+307
  1.7976931348623157e+308
(2 rows)

[email protected]:64561/defaultdb> select avg(b) from foo;
  avg
+------+
  +Inf
(1 row)

@jordanlewis
Copy link
Member Author

Are we promoting float to decimal in that latter case?

@rafiss
Copy link
Collaborator

rafiss commented Jul 18, 2019

No, we are not promoting in that case. (We also see the same behavior as that latter case in the vectorized engine as it is right now.)

@jordanlewis
Copy link
Member Author

I think this difference is probably acceptable at least for now - we already do it, plus it's pretty well-defined - it doesn't wrap around. YIL this is called "saturating" overflow behavior.

@maddyblue
Copy link
Contributor

Yes, I discovered our float difference while writing edge. I decided it's fine because the behavior is the same as you would get if you were adding two large floats together (infinity). Actually thinking this through more...there's a good argument that while sum can return infinity, avg shouldn't. Hmm. Well at least we have tests asserting the results now.

rafiss added a commit to rafiss/cockroach that referenced this issue Jul 19, 2019
The overflow checks are done as part of the code generation in
overloads.go. The checks are done inline, rather than calling the
functions in the arith package for performance reasons.

The checks are only done for integer math. float math is already
well-defined since overflow will result in +Inf and -Inf as necessary.

The operations that these checks are relevant for are the SUM_INT
aggregator and projection. In the future, AVG will also benefit from
these overflow checks.

This changes the error message produced by overflows in the
non-vectorized SUM_INT aggregator so that the messages are consistent.
This should be fine in terms of postgres-compatibility since SUM_INT is
unique to CRDB and eventually we will get rid of it anyway.

resolves cockroachdb#38775

Release note: None
rafiss added a commit to rafiss/cockroach that referenced this issue Jul 23, 2019
The overflow checks are done as part of the code generation in
overloads.go. The checks are done inline, rather than calling the
functions in the arith package for performance reasons.

The checks are only done for integer math. float math is already
well-defined since overflow will result in +Inf and -Inf as necessary.

The operations that these checks are relevant for are the SUM_INT
aggregator and projection. In the future, AVG will also benefit from
these overflow checks.

This changes the error message produced by overflows in the
non-vectorized SUM_INT aggregator so that the messages are consistent.
This should be fine in terms of postgres-compatibility since SUM_INT is
unique to CRDB and eventually we will get rid of it anyway.

resolves cockroachdb#38775

Release note: None
rafiss added a commit to rafiss/cockroach that referenced this issue Jul 23, 2019
The overflow checks are done as part of the code generation in
overloads.go. The checks are done inline, rather than calling the
functions in the arith package for performance reasons.

The checks are only done for integer math. float math is already
well-defined since overflow will result in +Inf and -Inf as necessary.

The operations that these checks are relevant for are the SUM_INT
aggregator and projection. In the future, AVG will also benefit from
these overflow checks.

This changes the error message produced by overflows in the
non-vectorized SUM_INT aggregator so that the messages are consistent.
This should be fine in terms of postgres-compatibility since SUM_INT is
unique to CRDB and eventually we will get rid of it anyway.

resolves cockroachdb#38775

Release note: None
craig bot pushed a commit that referenced this issue Jul 23, 2019
38967: exec: overflow handling for vectorized arithmetic r=rafiss a=rafiss

The overflow checks are done as part of the code generation in
overloads.go. The checks are done inline, rather than calling the
functions in the arith package for performance reasons.

The checks are only done for integer math. float math is already
well-defined since overflow will result in +Inf and -Inf as necessary.

The operations that these checks are relevant for are the SUM_INT
aggregator and projection. In the future, AVG will also benefit from
these overflow checks.

This changes the error message produced by overflows in the
non-vectorized SUM_INT aggregator so that the messages are consistent.
This should be fine in terms of postgres-compatibility since SUM_INT is
unique to CRDB and eventually we will get rid of it anyway.

resolves #38775

Release note: None

Co-authored-by: Rafi Shamim <[email protected]>
@craig craig bot closed this as completed in #38967 Jul 23, 2019
spaskob pushed a commit to spaskob/cockroach that referenced this issue Aug 1, 2019
The overflow checks are done as part of the code generation in
overloads.go. The checks are done inline, rather than calling the
functions in the arith package for performance reasons.

The checks are only done for integer math. float math is already
well-defined since overflow will result in +Inf and -Inf as necessary.

The operations that these checks are relevant for are the SUM_INT
aggregator and projection. In the future, AVG will also benefit from
these overflow checks.

This changes the error message produced by overflows in the
non-vectorized SUM_INT aggregator so that the messages are consistent.
This should be fine in terms of postgres-compatibility since SUM_INT is
unique to CRDB and eventually we will get rid of it anyway.

resolves cockroachdb#38775

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-vec SQL vectorized engine C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants