Expose and document a simpler public API for simplify expressions #3719

Merged
merged 3 commits into from
Oct 7, 2022
70 changes: 70 additions & 0 deletions datafusion/optimizer/src/simplify_expressions.rs
@@ -950,6 +950,24 @@ macro_rules! assert_contains {
};
}

/// Apply simplification and constant propagation to an [`Expr`].
///
/// # Arguments
///
/// * `expr` - The logical expression to simplify
/// * `schema` - The DataFusion schema for the expr, used to resolve `Column` references
///   to qualified or unqualified fields by name.
/// * `props` - The execution properties for the query, passed to `SimplifyContext`
///   for use during simplification.
pub fn simplify_expr(
    expr: Expr,
    schema: DFSchemaRef,
Contributor Author

Since SimplifyContext::new needs a Vec<&Arc<DFSchema>>, I changed the input schema to Arc<DFSchema> to avoid a clone.

Contributor

I think this makes sense -- the SimplifyContext constructor is somewhat awkward.

Contributor

While working on #3741 I found that a signature of schema: &DFSchemaRef was actually more ergonomic. What do you think?

We have to rerun the CI anyway, given the GitHub nonsense earlier today.

Contributor Author

Addressed.

    props: &ExecutionProps,
) -> Result<Expr> {
    let info = SimplifyContext::new(vec![&schema], props);
    expr.simplify(&info)
}
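
The review thread above weighs taking the schema as an owned `Arc` against taking a reference to it. The sketch below illustrates that trade-off only. `Schema`, `SchemaRef`, `inspect_by_value`, and `inspect_by_ref` are hypothetical stand-ins invented for this example; they are not DataFusion types or functions.

```rust
use std::sync::Arc;

// Hypothetical stand-in for a schema type; not DataFusion's DFSchema.
struct Schema {
    fields: Vec<String>,
}

// Hypothetical alias mirroring the shape of an Arc-based schema handle.
type SchemaRef = Arc<Schema>;

// Takes the Arc by value: a caller that wants to keep its own handle
// has to clone the Arc first, which bumps the reference count.
// Returns (number of fields, strong count observed inside the call).
fn inspect_by_value(schema: SchemaRef) -> (usize, usize) {
    (schema.fields.len(), Arc::strong_count(&schema))
}

// Takes a reference to the Arc: the caller keeps its handle and no
// clone is needed, so the reference count is unchanged.
fn inspect_by_ref(schema: &SchemaRef) -> (usize, usize) {
    (schema.fields.len(), Arc::strong_count(schema))
}
```

With a single handle `schema`, calling `inspect_by_ref(&schema)` observes a strong count of 1, while `inspect_by_value(Arc::clone(&schema))` observes 2 for the duration of the call. Cloning an `Arc` is cheap, so the difference is mostly about call-site ergonomics, which is the point raised in the thread.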

#[cfg(test)]
mod tests {
    use super::*;
@@ -2553,4 +2571,56 @@ mod tests {

        assert_optimized_plan_eq(&plan, expected);
    }

    #[test]
    fn simplify_expr_for_constant_fold_test() {
        let schema = DFSchema::new_with_metadata(
            vec![DFField::new(None, "x", DataType::Int32, false)],
            HashMap::new(),
        )
        .unwrap();

        // x + (1 + 3)
        let expr = Expr::BinaryExpr {
            left: Box::new(col("x")),
            op: Operator::Plus,
            right: Box::new(Expr::BinaryExpr {
                left: Box::new(lit(1)),
                op: Operator::Plus,
                right: Box::new(lit(3)),
            }),
        };

        let props = ExecutionProps::new();
        let simplified_expr = simplify_expr(expr, Arc::new(schema), &props).unwrap();

        // x + 4
        let expected = Expr::BinaryExpr {
            left: Box::new(col("x")),
            op: Operator::Plus,
            right: Box::new(lit(4)),
        };
        assert_eq!(simplified_expr, expected);
    }

    #[test]
    fn simplify_expr_for_rewrite_test() {
        let schema = DFSchema::new_with_metadata(
            vec![DFField::new(None, "x", DataType::Int32, false)],
            HashMap::new(),
        )
        .unwrap();

        // x * 1
        let expr = Expr::BinaryExpr {
            left: Box::new(col("x")),
            op: Operator::Multiply,
            right: Box::new(lit(1)),
        };

        let props = ExecutionProps::new();
        let simplified_expr = simplify_expr(expr, Arc::new(schema), &props).unwrap();

        assert_eq!(simplified_expr, col("x"));
    }
}
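
The two tests above exercise constant folding (`x + (1 + 3)` becomes `x + 4`) and an identity rewrite (`x * 1` becomes `x`). To show the idea behind those two rules in isolation, here is a minimal, self-contained sketch on a toy expression tree. `ToyExpr`, `col`, `lit`, and `simplify` are hypothetical names defined only for this illustration; the sketch does not reproduce DataFusion's `Expr` type or its simplifier, which also relies on schema and type information.

```rust
// Toy expression tree; a hypothetical stand-in, not DataFusion's Expr.
#[derive(Debug, Clone, PartialEq)]
enum ToyExpr {
    Column(String),
    Literal(i64),
    Add(Box<ToyExpr>, Box<ToyExpr>),
    Mul(Box<ToyExpr>, Box<ToyExpr>),
}

// Convenience constructors for this sketch.
fn col(name: &str) -> ToyExpr {
    ToyExpr::Column(name.to_string())
}

fn lit(value: i64) -> ToyExpr {
    ToyExpr::Literal(value)
}

// Simplify bottom-up: children first, then the current node.
// Integer overflow is not handled in this sketch.
fn simplify(expr: ToyExpr) -> ToyExpr {
    match expr {
        ToyExpr::Add(l, r) => match (simplify(*l), simplify(*r)) {
            // Constant folding: both operands are literals.
            (ToyExpr::Literal(a), ToyExpr::Literal(b)) => ToyExpr::Literal(a + b),
            (l, r) => ToyExpr::Add(Box::new(l), Box::new(r)),
        },
        ToyExpr::Mul(l, r) => match (simplify(*l), simplify(*r)) {
            // Constant folding: both operands are literals.
            (ToyExpr::Literal(a), ToyExpr::Literal(b)) => ToyExpr::Literal(a * b),
            // Identity rewrite: e * 1 and 1 * e both reduce to e.
            (e, ToyExpr::Literal(1)) | (ToyExpr::Literal(1), e) => e,
            (l, r) => ToyExpr::Mul(Box::new(l), Box::new(r)),
        },
        // Columns and literals are already as simple as they get.
        leaf => leaf,
    }
}
```

Simplifying children before the parent is what lets the inner `1 + 3` fold to `4` before the outer addition is rebuilt as `x + 4`.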