-
Notifications
You must be signed in to change notification settings - Fork 758
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[EH] Fix pop enclosed within a block in DCE #6922
Conversation
WebAssembly#6400 fixed this case but that handled only when a `pop` is an immediate child of the current expression, but a `pop` can be nested deeper down. Fixes WebAssembly#6918.
src/passes/DeadCodeElimination.cpp
Outdated
@@ -110,7 +107,8 @@ struct DeadCodeElimination | |||
replaceCurrent(remainingChildren[0]); | |||
} else { | |||
replaceCurrent(builder.makeBlock(remainingChildren)); | |||
if (hasPop) { | |||
if (getModule()->features.hasExceptionHandling() && | |||
!FindAll<Pop>(curr).list.empty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a little worried about quadratic behavior here. Imagine
(block (result i32)
(block (result i32)
..
(block (result i32)
(use (pop))
(unreachable)
)
..
)
)
I think we'd do FindAll
on each of those blocks as we turn them unreachable, in post-order?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I was worried about that too... Do you have other suggestions to fix this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When the new EH (exnref) is more common, it would be nice if we are able to detect whether we are using the legacy EH and run this only for them...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, one option might be to just note if we ever see a Pop, and if we ever optimize a control flow structure. If so we can run EHUtils::handleBlockNestedPops
at the end.
We could also make that a little more precise by tracking if the Pop is actually in an optimized control flow structure, but I'm not sure how easy that would be, or if it would be worthwhile. Might be worth measuring.
About legacy EH, good point, we can also track if we ever see a legacy try-catch (or can a Pop only ever be there?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, one option might be to just note if we ever see a Pop, and if we ever optimize a control flow structure. If so we can run
EHUtils::handleBlockNestedPops
at the end.
Yeah I think we can make something like
Lines 477 to 484 in 9d3f8e5
// Whether this pass modifies Binaryen IR in ways that may require fixups for | |
// non-nullable locals to validate according to the wasm spec. If the pass | |
// adds locals not in that form, or moves code around in ways that might break | |
// that validation, this must return true. In that case the pass runner will | |
// automatically run the necessary fixups afterwards. | |
// | |
// For more details see the LocalStructuralDominance class. | |
virtual bool requiresNonNullableLocalFixups() { return true; } |
for EH fixups. (The way that non-local fixup is done conservative in way that unless we find a reason to not run the fixup, we run it) I actually tried this a while ago for the EH fixup but was kind of overwhelmed by the number of passes and dropped it, but if we make the criteria as simple as what you suggested, 1. Do we use
pop
2. Do we run control flow opts, it may not take that long time.
Also I think it is worthwhile to consider only try
-catch
(not catch_all
), given that pop
is only generated with catch
and catch
is a far less common that catch_all
. catch
is generated only when a user explicitly use try
-catch
clause in C++. I'm not sure about other languages but I guess it would be similar. But can this info be considered in a pass-level method like requiresNonNullableLocalFixups
?
We could also make that a little more precise by tracking if the Pop is actually in an optimized control flow structure, but I'm not sure how easy that would be, or if it would be worthwhile. Might be worth measuring.
How can this be done? Can this be done without quadratic time complexity?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The downside of requiresNonNullableLocalFixups()
is that when it is true then the pass runner does extra work after the pass, every time. So even if the pass does nothing, or doesn't change a local, it still happens. Those fixups are fast enough that it isn't too bad, but in general it's better for rarely-needed fixups to be done in the pass. At least when there's a good way for the pass to detect when they are needed, that is more efficient I think.
How can this be done? Can this be done without quadratic time complexity?
Using ControlFlowWalker
we have a stack of control flow structures, and then when we visit a Pop we can mark all of them as containing a nested Pop. If the number of pops is O(1)
then this would be O(N)
overall?
But if there can be many Pops, maybe it isn't worth it, and we can just run the fixups if we've ever seen a Pop + ever optimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can this be done? Can this be done without quadratic time complexity?
Using
ControlFlowWalker
we have a stack of control flow structures, and then when we visit a Pop we can mark all of them as containing a nested Pop. If the number of pops isO(1)
then this would beO(N)
overall?
Not sure if I understand. Do you mean we should create a ControlFlowWalker
for all control-flow changing operations (which some of them already use but not all do) and check the stack every time we make control flow changing modifications? Wouldn't it be O(height of control flow stack * # of expressions that need control flow changes)?
But if there can be many Pops, maybe it isn't worth it, and we can just run the fixups if we've ever seen a Pop + ever optimized.
Is there a convenient way to gather the list of control flow changing passes? Passes using ControlFlowWalker
can be a good indication but not all of them change control flow and also there are others that don't use it change control flow (like DCE).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean we should create a ControlFlowWalker for all control-flow changing operations (
Sorry, no, I meant that the main class could be changed from PostWalker to ControlFlowWalker. So we would update the stack of control flow structures as we go, constantly.
Is there a convenient way to gather the list of control flow changing passes?
Hmm, I don't think so. Each pass would need to be investigated specifically.
But I don't think we need a broader change here? Just fixing this pass should be enough. Unless I'm missing your point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I thought we were talking about the whole binaryen-level solution... (which may be necessary at some point) Yeah, fixing this pass can be simple I think. I just made it to run the fixup once we see a pop
in a function. PTAL.
@@ -110,11 +115,6 @@ struct DeadCodeElimination | |||
replaceCurrent(remainingChildren[0]); | |||
} else { | |||
replaceCurrent(builder.makeBlock(remainingChildren)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only danger with a Pop is the new block made here, so I suggest that we set a boolean addedBlock
to true in this location. And we can rename needEHFixups
to hasPop
. Then below, the check if (needEHFixups)
can be if (hasPop && addedBlock)
, which will be a little more precise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great!
#6400 fixed this case but that handled only when a
pop
is an immediate child of the current expression, but apop
can be nested deeper down.We conservatively run the EH fixup whenever we have a
pop
and createblock
s in the optimization. We considered usingFindAll<Pop>
to make it precise, but we decided the quadratic time plexity was not worth it.Fixes #6918.