Minor: Avoid cloning as many Ident during SQL planning #4534
// Normalize an identifier to a lowercase string unless the identifier is quoted.
pub(crate) fn normalize_ident(id: &Ident) -> String {
    match id.quote_style {
        Some(_) => id.value.clone(),
Here the value is always cloned, which is unnecessary for most uses because the caller already has an owned String.
Force-pushed from d4f1c9c to 895b4cd.
@@ -446,7 +444,7 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {

         for cte in with.cte_tables {
             // A `WITH` block can't use the same name more than once
-            let cte_name = normalize_ident(&cte.alias.name);
+            let cte_name = normalize_ident(cte.alias.name.clone());
Previously normalize_ident always cloned. Now it only clones in a few places, and most of the time it can reuse the String from the sqlparser AST directly.
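A minimal sketch of the by-value signature described above. The `Ident` struct here is a simplified stand-in for the sqlparser type (only the two fields used), and the `None` arm is an assumption based on the function's doc comment, since the diff excerpt shows only the quoted arm.

```rust
/// Simplified stand-in for the sqlparser `Ident` type (assumption: only
/// the two fields that matter for normalization are modeled here).
#[derive(Debug, Clone)]
pub struct Ident {
    pub value: String,
    pub quote_style: Option<char>,
}

/// Normalize an identifier to lowercase unless it is quoted.
/// Taking `Ident` by value lets the quoted arm move the `String` out
/// instead of cloning it.
pub fn normalize_ident(id: Ident) -> String {
    match id.quote_style {
        // Quoted: hand back the owned String untouched, no copy.
        Some(_) => id.value,
        // Unquoted: lowercasing builds a new String (assumed behavior).
        None => id.value.to_ascii_lowercase(),
    }
}
```

A caller that still needs the `Ident` afterwards clones explicitly at the call site, as in the `cte.alias.name.clone()` line of the hunk above; a caller that is done with it passes it by value and pays nothing extra.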
@@ -661,7 +659,7 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
                     .iter()
                     .any(|x| x.option == ColumnOption::Null);
                 fields.push(Field::new(
-                    &normalize_ident(&column.name),
+                    &normalize_ident(column.name),
It is unfortunate to simply drop the String immediately, but Field::new requires a &str (it can't take the String). Filed upstream: https://github.com/apache/arrow-rs/pull/3288/files
Agreed. It looks like we can make an improvement here similar to the one made for SubqueryAlias::try_new().
@jackwener I wonder if you have time or interest to review this PR?
datafusion/sql/src/planner.rs
Outdated
        )
    }
}

fn apply_expr_alias(plan: LogicalPlan, idents: &Vec<Ident>) -> Result<LogicalPlan> {
This function took a reference, but the caller immediately drops the actual Vec. This PR passes the Vec by value so it can be reused.
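A rough sketch of the ownership change described above, assuming the caller has no further use for the vector. `alias_names` is a hypothetical helper, not the real `apply_expr_alias`, and `Ident` is a simplified stand-in for the sqlparser type.

```rust
/// Simplified stand-in for the sqlparser `Ident` type (assumption).
#[derive(Debug, Clone)]
pub struct Ident {
    pub value: String,
    pub quote_style: Option<char>,
}

/// By-value normalization: quoted identifiers keep their String as-is.
fn normalize_ident(id: Ident) -> String {
    match id.quote_style {
        Some(_) => id.value,
        None => id.value.to_ascii_lowercase(),
    }
}

/// Hypothetical helper: consumes the identifiers and returns alias names.
/// `into_iter()` on an owned Vec yields owned `Ident`s, so each String can
/// be moved rather than cloned.
pub fn alias_names(idents: Vec<Ident>) -> Vec<String> {
    idents.into_iter().map(normalize_ident).collect()
}
```

With a `&Vec<Ident>` parameter the same loop would only see `&Ident` and would have to clone every value, even though the caller was about to drop the vector anyway.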
-    pub fn try_new(plan: LogicalPlan, alias: &str) -> datafusion_common::Result<Self> {
+    pub fn try_new(
+        plan: LogicalPlan,
+        alias: impl Into<String>,
This change allows SubqueryAlias::try_new to take a String if the caller has one, or a &str that will be copied into a new String if needed.
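The `impl Into<String>` pattern can be sketched in isolation as follows. `AliasNode` is a hypothetical stand-in rather than the real `SubqueryAlias`, and the empty-alias check exists only to give the `Result` something to report; it is not taken from the PR.

```rust
/// Hypothetical stand-in for a plan node that owns its alias.
#[derive(Debug, PartialEq)]
pub struct AliasNode {
    pub alias: String,
}

impl AliasNode {
    /// Accepts anything convertible into a String.
    /// Passing a `String` moves it in; passing a `&str` allocates a copy.
    pub fn try_new(alias: impl Into<String>) -> Result<Self, String> {
        let alias = alias.into();
        // Illustrative validation only (assumption, not from the PR).
        if alias.is_empty() {
            return Err("alias must not be empty".to_string());
        }
        Ok(Self { alias })
    }
}
```

The design choice is that the allocation happens only on the `&str` path, so callers that already hold an owned `String` no longer pay for borrowing it and having the callee copy it again.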
Nice implementation 👍, it copies only when called with a &str.
I reviewed the changes carefully and they make sense to me.
Thanks @alamb.
BTW, I think we can also avoid clone in some method in
Thank you for the review @jackwener
Benchmark runs are scheduled for baseline = 4ecf3e7 and contender = 2457ce4. 2457ce4 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Draft as it builds on #4530
Which issue does this PR close?
N/A
Rationale for this change
I noticed a bunch of redundant copying while working on #4530 but wanted to keep that PR smaller
What changes are included in this PR?
Remove redundant cloning
Are these changes tested?
covered by existing tests
Are there any user-facing changes?