Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimizer should have option to skip failing rules #2909

Merged
merged 15 commits into from
Jul 14, 2022

Conversation

andygrove
Copy link
Member

@andygrove andygrove commented Jul 14, 2022

Which issue does this PR close?

Closes #2908

Partial fix for #2907

Rationale for this change

I often see queries fail due to bugs in individual optimizer rules. I would rather just skip the buggy optimizations rather than have the query fail.

What changes are included in this PR?

Check result from each rule and skip or fail depending on the configuration.

Are there any user-facing changes?

No

@andygrove andygrove marked this pull request as draft July 14, 2022 14:33
@github-actions github-actions bot added logical-expr Logical plan and expressions optimizer Optimizer rules labels Jul 14, 2022
@codecov-commenter
Copy link

codecov-commenter commented Jul 14, 2022

Codecov Report

Merging #2909 (95f451e) into master (034678b) will decrease coverage by 0.00%.
The diff coverage is 89.47%.

❗ Current head 95f451e differs from pull request most recent head 58b29e0. Consider uploading reports for the commit 58b29e0 to get more accurate results

@@            Coverage Diff             @@
##           master    #2909      +/-   ##
==========================================
- Coverage   85.30%   85.30%   -0.01%     
==========================================
  Files         274      274              
  Lines       49234    49267      +33     
==========================================
+ Hits        41999    42027      +28     
- Misses       7235     7240       +5     
Impacted Files Coverage Δ
datafusion/core/src/config.rs 92.50% <ø> (ø)
datafusion/optimizer/src/optimizer.rs 82.69% <88.88%> (+7.69%) ⬆️
datafusion/core/src/execution/context.rs 78.85% <100.00%> (+0.02%) ⬆️
datafusion/core/src/physical_plan/metrics/value.rs 86.93% <0.00%> (-0.51%) ⬇️
datafusion/expr/src/logical_plan/plan.rs 76.96% <0.00%> (-0.18%) ⬇️
datafusion/physical-expr/src/expressions/binary.rs 95.12% <0.00%> (+0.08%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 034678b...58b29e0. Read the comment docs.

@andygrove andygrove marked this pull request as ready for review July 14, 2022 16:26
debug!("Optimized logical plan:\n{}\n", new_plan.display_indent());
}
Err(ref e) => {
if optimizer_config.skip_failing_rules {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the main change here - we now have the option to ignore optimization rules that fail

@andygrove
Copy link
Member Author

@jdye64 This is part 2 of the index out of bounds fix

@andygrove
Copy link
Member Author

@alamb Thanks for the review on #2900 ... this is the follow-on that leverages those new error checks and allows the optimizer to gracefully handle such failures

@github-actions github-actions bot added the core Core DataFusion crate label Jul 14, 2022
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that this improves the resilency in DataFusion to make plans even in the presence of bugs 👍

@@ -43,6 +43,10 @@ pub const OPT_COALESCE_BATCHES: &str = "datafusion.execution.coalesce_batches";
pub const OPT_COALESCE_TARGET_BATCH_SIZE: &str =
"datafusion.execution.coalesce_target_batch_size";

/// Configuration option "datafusion.optimizer.skip_failed_rules"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯 for it being a config setting

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I love this too! There are a few places I would want our product to fail in case of an error here but others I would not. so the configuration is great

@andygrove andygrove changed the title Optimizer should skip failing rules Optimizer should have option to skip failing rules Jul 14, 2022
@andygrove andygrove merged commit 8ad3df5 into apache:master Jul 14, 2022
@andygrove andygrove deleted the optimizer-error-handling branch July 14, 2022 20:04
@ursabot
Copy link

ursabot commented Jul 14, 2022

Benchmark runs are scheduled for baseline = 034678b and contender = 8ad3df5. 8ad3df5 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate logical-expr Logical plan and expressions optimizer Optimizer rules
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fail gracefully if optimization rule fails
5 participants