-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: from_plan shouldn't use original schema #6595
Conversation
LogicalPlan::Projection(Projection { schema, .. }) => { | ||
Ok(LogicalPlan::Projection(Projection::try_new_with_schema( | ||
expr.to_vec(), | ||
Arc::new(inputs[0].clone()), | ||
schema.clone(), | ||
)?)) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here exist a bug, we shouldn't use original schema
, because it can be changed.
Original code hidden some BUG.
bd31450
to
0276537
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @jackwener -- this PR is definitely heading in the right direction. Fixing the projection schema is definitely exposing a bunch of bugs.
There are a few things that don't look right in the PR that I commented on.
Thanks again for trying to make things better. It is really apprecaited
@@ -384,8 +384,7 @@ impl DFSchema { | |||
let self_fields = self.fields().iter(); | |||
let other_fields = other.fields().iter(); | |||
self_fields.zip(other_fields).all(|(f1, f2)| { | |||
f1.qualifier() == f2.qualifier() | |||
&& f1.name() == f2.name() | |||
f1.qualified_name() == f2.qualified_name() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if it matters, but I believe that qualified_name
creates a new String
where the previous version avoids that allocation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About this, I have a new idea on weekend. we may need handle alias schema()
to specify the schema.
Because alias('t1.a')
, field
is qualifier: none, name: t1.a
, we hope field
is qualifier: t1, name: a
.
I will do it in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
related issue #6681
@@ -110,10 +111,10 @@ select array_position(['h', 'e', 'l', 'l', 'o'], 'l', 4), array_position([1, 2, | |||
4 5 2 | |||
|
|||
# array_positions scalar function | |||
query III | |||
query error DataFusion error: SQL error: ParserError\("Expected an SQL statement, found: caused"\) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these errors look not quite right -- I think there is something wrong with sqllogictest --complete
with multi-line errors 🤔
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @jackwener this looks like a nice improvement to me
However, it does seem to introduce a regression in some of the array functions recently introduced by @izveigor . The #6596 (comment) comment suggests to me we need to support NULL in array, so it is a known issue but it might be good to get @izveigor 's opinion
@@ -512,15 +512,22 @@ async fn test_regex_expressions() -> Result<()> { | |||
|
|||
#[tokio::test] | |||
async fn test_cast_expressions() -> Result<()> { | |||
test_expression!("CAST('0' AS INT)", "0"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we move it to slt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a good follow on PR
Seems like the hope to was to move it to slt as part of #6210 but wasn't completed 🤔
I created PR about the 'nulls' problem: #6662. So I think it can solve the regression. |
I prepare to merge this PR in tomorrow unless there are other comments I will continue doing more job. |
* revert: from_plan keep same schema Project in apache#6595 * revert: from_plan keep same schema Agg/Window in apache#6820 * revert type coercion * add comment
* revert: from_plan keep same schema Project in apache#6595 * revert: from_plan keep same schema Agg/Window in apache#6820 * revert type coercion * add comment
Signed-off-by: jayzhan211 <[email protected]>
* revert array.slt that changed by #6595 Signed-off-by: jayzhan211 <[email protected]> * add test for to string Signed-off-by: jayzhan211 <[email protected]> * first draft Signed-off-by: jayzhan211 <[email protected]> * cleanup Signed-off-by: jayzhan211 <[email protected]> --------- Signed-off-by: jayzhan211 <[email protected]>
Which issue does this PR close?
part of #6596.
Closes #6613
Rationale for this change
What changes are included in this PR?
In original code,
project
callfrom_plan
, it will keep original schema even if expression are different.It's wrong! Because different expression will have different schema. So I correct it.
Are these changes tested?
Are there any user-facing changes?