Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add test for Simplify redundant predicates #3915

Merged
merged 2 commits into from
Oct 22, 2022
Merged

Add test for Simplify redundant predicates #3915

merged 2 commits into from
Oct 22, 2022

Conversation

src255
Copy link
Contributor

@src255 src255 commented Oct 21, 2022

Write rules to simplify both a OR a and a AND a to a.

Hello! I wanted to help with issue #3895. This is my attempt, but I'm not sure if this is the desired comparison of the left and right fields of the BinaryExpr struct. I hope this helps and I would appreciate any feedback!

Which issue does this PR close?

Closes #3895 .

Write rules to simplify both `a OR a` and `a AND a` to a.
@github-actions github-actions bot added the optimizer Optimizer rules label Oct 21, 2022
@isidentical
Copy link
Contributor

Thank you for the PR @src255, I think it looks pretty good as is (an example comparison is available inside expr_contains). I think all you have to do next is add some tests (a very similiar example is test_simplify_optimized_plan). One thing to note is, I think we might have a AND a already so might make sense to check if the test fails first and gets fixed with your revision. But a OR a should be still good.

@src255
Copy link
Contributor Author

src255 commented Oct 21, 2022

Thanks for the feedback. Here is the new test I wrote with test_simplify_optimized_plan as a guide:

#[test]
fn test_simplify_optimized_plan_with_or() {
    let table_scan = test_table_scan();
    let plan = LogicalPlanBuilder::from(table_scan)
        .project(vec![col("a")])
        .unwrap()
        .filter(or(col("b").gt(lit(1)), col("b").gt(lit(1))))  // use `or` instead of `and`
        .unwrap()
        .build()
        .unwrap();

    assert_optimized_plan_eq(
        &plan,
        "\
        Filter: test.b > Int32(1)\
        \n  Projection: test.a\
        \n    TableScan: test",
    );
}

After running this test, I noticed that both test_simplify_optimized_plan_with_or and test_simplify_optimized_plan pass without my changes (OR and AND changes commented out)! Perhaps the optimizer is already making both simplifications:

  • a OR a --> a
  • a AND a --> a.

Also, I looked at the feedback from clippy and removed the needless references of left and right from the comparisons I wrote, but I'm not sure the changes I made are even necessary given the passing tests.

@isidentical
Copy link
Contributor

Thanks for checking it out. That seems to be true (the simplification below) 🤔 (which probably means somewhere in the #3859 we are missing something else, cc: @alamb @Ted-Jiang).

https://github.com/apache/arrow-datafusion/blob/e534c2536858cf18aac219c33b0bddef54c7f214/datafusion/optimizer/src/simplify_expressions.rs#L768-L779

Test simplification of `a OR a` --> `a` and remove unnecessary rules
introduced in the previous commit.
@alamb
Copy link
Contributor

alamb commented Oct 22, 2022

Thanks for the test and investigation @src255 and @isidentical -- clearly I got something wrong.

@alamb alamb merged commit 83b54f6 into apache:master Oct 22, 2022
@alamb alamb changed the title Simplify redundant predicates Add test for Simplify redundant predicates Oct 22, 2022
@src255 src255 deleted the issue_3895 branch October 28, 2022 00:09
Dandandan pushed a commit to yuuch/arrow-datafusion that referenced this pull request Nov 5, 2022
* Simplify redundant predicates

Write rules to simplify both `a OR a` and `a AND a` to a.

* Add test for redudant `or` expression

Test simplification of `a OR a` --> `a` and remove unnecessary rules
introduced in the previous commit.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
optimizer Optimizer rules
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Simplify a AND a and a OR a.
3 participants