fix: div and rem by negative zero #960
Changes from 5 commits: 52b0df3, a55c1e1, aed2072, 184c22b, 1318da8, 77ca90a
@@ -91,13 +91,15 @@ Comet has a plan stability testing framework that can be used to test the stabil
 The plan stability testing framework is located in the `spark` module and can be run using the following command:

 ```sh
-./mvnw -pl spark -Dsuites="org.apache.spark.sql.comet.CometTPCDSV1_4_PlanStabilitySuite" test
+./mvnw -pl spark -Dsuites="org.apache.spark.sql.comet.CometTPCDSV1_4_PlanStabilitySuite" -nsu test
+./mvnw -pl spark -Dsuites="org.apache.spark.sql.comet.CometTPCDSV1_4_PlanStabilitySuite" -Pspark-3.5 -nsu test
+./mvnw -pl spark -Dsuites="org.apache.spark.sql.comet.CometTPCDSV1_4_PlanStabilitySuite" -Pspark-4.0 -nsu test
 ```

 and

 ```sh
-./mvnw -pl spark -Dsuites="org.apache.spark.sql.comet.CometTPCDSV2_7_PlanStabilitySuite" test
+./mvnw -pl spark -Dsuites="org.apache.spark.sql.comet.CometTPCDSV2_7_PlanStabilitySuite" -nsu test
+./mvnw -pl spark -Dsuites="org.apache.spark.sql.comet.CometTPCDSV2_7_PlanStabilitySuite" -Pspark-3.5 -nsu test
+./mvnw -pl spark -Dsuites="org.apache.spark.sql.comet.CometTPCDSV2_7_PlanStabilitySuite" -Pspark-4.0 -nsu test
 ```

> **Reviewer:** Since you added these steps to the testing commands, does it make sense to duplicate them below for the generation commands?
>
> **Author:** Thanks, updated.
@@ -25,7 +25,7 @@ import org.apache.spark.SparkConf
 import org.apache.spark.internal.Logging
 import org.apache.spark.network.util.ByteUnit
 import org.apache.spark.sql.{SparkSession, SparkSessionExtensions}
-import org.apache.spark.sql.catalyst.expressions.{EqualNullSafe, EqualTo, Expression, GreaterThan, GreaterThanOrEqual, KnownFloatingPointNormalized, LessThan, LessThanOrEqual, NamedExpression, PlanExpression}
+import org.apache.spark.sql.catalyst.expressions.{Divide, DoubleLiteral, EqualNullSafe, EqualTo, Expression, FloatLiteral, GreaterThan, GreaterThanOrEqual, KnownFloatingPointNormalized, LessThan, LessThanOrEqual, NamedExpression, PlanExpression, Remainder}
 import org.apache.spark.sql.catalyst.expressions.aggregate.{Final, Partial}
 import org.apache.spark.sql.catalyst.optimizer.NormalizeNaNAndZero
 import org.apache.spark.sql.catalyst.rules.Rule

@@ -873,12 +873,18 @@ class CometSparkSessionExtensions
         LessThan(normalizeNaNAndZero(left), normalizeNaNAndZero(right))
       case LessThanOrEqual(left, right) =>
         LessThanOrEqual(normalizeNaNAndZero(left), normalizeNaNAndZero(right))
+      case Divide(left, right, evalMode) =>
+        Divide(left, normalizeNaNAndZero(right), evalMode)
+      case Remainder(left, right, evalMode) =>
+        Remainder(left, normalizeNaNAndZero(right), evalMode)
     }
   }
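The new `Divide` and `Remainder` cases normalize only the divisor. A minimal standalone Scala sketch (plain JVM semantics, not Comet code) of why the sign of a zero divisor is observable even though `-0.0 == 0.0`:

```scala
object NegativeZeroDivisor extends App {
  // IEEE 754 ==: the two zeros are indistinguishable...
  assert(-0.0 == 0.0)
  // ...but division is not: the result's sign flips with the divisor's sign.
  assert(1.0 / 0.0 == Double.PositiveInfinity)
  assert(1.0 / -0.0 == Double.NegativeInfinity)
  // Remainder by any zero is NaN on the JVM (Spark returns null here outside ANSI mode).
  assert((5.0 % -0.0).isNaN)
  println("negative-zero divisor semantics confirmed")
}
```

Normalizing the divisor to `0.0` before the comparison-based null guard keeps the two zeros from producing diverging results between engines.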
 def normalizeNaNAndZero(expr: Expression): Expression = {
   expr match {
     case _: KnownFloatingPointNormalized => expr
+    case FloatLiteral(f) if !f.equals(-0.0f) => expr
+    case DoubleLiteral(d) if !d.equals(-0.0d) => expr
     case _ =>
       expr.dataType match {
         case _: FloatType | _: DoubleType =>

> **Reviewer:** Just clarifying: this is saying that as long as the literal isn't -0.0, we don't have to normalize it? And it doesn't look like there's any way to define a literal NaN, so we don't handle that case?
>
> **Author:** Exactly. We can create a literal NaN, but Spark …
@@ -2636,7 +2636,11 @@ object QueryPlanSerde extends Logging with ShimQueryPlanSerde with CometExprShim

 def nullIfWhenPrimitive(expression: Expression): Expression = if (isPrimitive(expression)) {
   val zero = Literal.default(expression.dataType)
-  If(EqualTo(expression, zero), Literal.create(null, expression.dataType), expression)
+  expression match {
+    case _: Literal if expression != zero => expression
+    case _ =>
+      If(EqualTo(expression, zero), Literal.create(null, expression.dataType), expression)
+  }
 } else {
   expression
 }

> **Reviewer:** Is this related?
>
> **Author:** Indirectly related; this avoids putting …
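A toy, self-contained model (hypothetical names, not Spark's Catalyst API) of the patched `nullIfWhenPrimitive` shape: a literal that is provably not the type's zero default skips the `If` guard entirely, while anything unknown at planning time keeps it:

```scala
object NullIfZeroSketch extends App {
  sealed trait Expr
  case class Literal(v: Double) extends Expr
  case class Column(name: String) extends Expr
  // Stands in for If(EqualTo(e, zero), Literal(null), e).
  case class NullIfZero(child: Expr) extends Expr

  val zero = Literal(0.0)

  def nullIfWhenPrimitive(e: Expr): Expr = e match {
    case _: Literal if e != zero => e // non-zero literal: the guard is dead code
    case _ => NullIfZero(e)          // value unknown at plan time: keep the guard
  }

  assert(nullIfWhenPrimitive(Literal(2.0)) == Literal(2.0))
  assert(nullIfWhenPrimitive(Literal(0.0)) == NullIfZero(Literal(0.0)))
  assert(nullIfWhenPrimitive(Column("x")) == NullIfZero(Column("x")))
  println("non-zero literal divisors bypass the null guard")
}
```

The payoff is a simpler plan: constant divisors no longer drag a conditional null check through serialization.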
> **Reviewer:** Why add `-nsu`?
>
> **Author:** It helps for faster turnaround.