-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor SqlToRel::sql_expr_to_logical_expr_internal
to reduce stack size
#12384
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did the minimum needed to get the test passing, instead of refactoring each arm that doesn't already delegate to a function. Wouldn't be surprised if this crops up again, so have left a comment to serve as a reminder in the future.
Not sure of feasibility of doing something like #10023 to refactor this recursion to not be prone to issues like this in the future, but worth considering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -30,11 +30,9 @@ use datafusion_common_runtime::SpawnedTask; | |||
|
|||
const TEST_DIRECTORY: &str = "test_files/"; | |||
const PG_COMPAT_FILE_PREFIX: &str = "pg_compat_"; | |||
const STACK_SIZE: usize = 2 * 1024 * 1024 + 512 * 1024; // 2.5 MBs, the default 2 MBs is currently too small |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
Co-authored-by: Andrew Lamb <[email protected]>
@@ -174,6 +174,10 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> { | |||
schema: &DFSchema, | |||
planner_context: &mut PlannerContext, | |||
) -> Result<Expr> { | |||
// NOTE: This function is called recusively, so each match arm body should be as | |||
// small as possible to avoid stack overflows in debug builds. Follow the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh did it happen for debug builds only?
please clarify what do you mean small as possible
, small in terms of code, or small in terms of bytes which allocated for the stackframe? I feel we can put clarification into the comments to avoid potential confusion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both, I guess? I admit I'm not 100% of the mechanics at play here, I just eyeballed the larger arms which are likely to be causing this overflow issue from previous experience. I assume there's a general correlation between lines of code and bytes allocated for stackframe anyway 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wrote up the technical backstory for stack frame usage, many moons ago, here: #1047
I think this "stack overflow on debug builds" happens frequently enough it would be cool to write it up more formally on a blog or something
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm thanks @Jefffrey
Which issue does this PR close?
Closes #11499
Rationale for this change
Cause for stack overflow was familiar to #9962, so after some LLDB debugging was able to identify that
SqlToRel::sql_expr_to_logical_expr_internal
was being recursively called with a chunky frame size (appox 70000 bytes), and seemed to be the most frequent frame in the backtrace with the largest size. Applied fix similar to #9962 where large arms are split into functions to reduce the size of the main function.What changes are included in this PR?
Refactor
SqlToRel::sql_expr_to_logical_expr_internal
to split larger arms into separate functions.Are these changes tested?
Yes, existing sqllogictest (with stack size modifier removed so back to default 2MB)
Are there any user-facing changes?
No