Add shared basic block library #18497
I think we should do that.
Very nice work, great to see even more shared code 🎉
Thanks for the review and comments. I've addressed all but one.
I'll do that in a follow-up PR 👍
@@ -134,6 +134,10 @@ private module Implementation implements CfgShared::InputSig<Location> {
  SuccessorType getAMatchingSuccessorType(Completion c) { result = c.getAMatchingSuccessorType() }

  predicate isAbnormalExitType(SuccessorType t) { none() }

  int idOfAstNode(AstNode node) { none() }
Note that `getJoinBlockPredecessor` isn't implemented in the Actions basic block library, so this instantiation is actually fine. And we could move Actions to the shared BB library in a future PR.
@@ -64,6 +80,7 @@ module Make<InputSig Input> {
    BasicBlock getAPredecessor(SuccessorType t) { result.getASuccessor(t) = this }

    /** Gets the control flow node at a specific (zero-indexed) position in this basic block. */
    cached
It needs to be defined inside the `Cached` module, and then just referenced here.
It should be possible to reference them with the `BasicBlocks::` prefix?
Yes, that works. I didn't think of that 🙈
This still needs to be changed.
Is it right now? `bbIndex` and friends are in the cached module and `getNode` is uncached.
Sorry; I was not clear. What I meant was something like
/**
* Holds if `bbStart` is the first node in a basic block and `cfn` is the
* `i`th node in the same basic block.
*/
private predicate bbIndex(Node bbStart, Node cfn, int i) =
shortestDistances(startsBB/1, intraBBSucc/2)(bbStart, cfn, i)
cached
Node getNode(BasicBlock bb, int pos) { bbIndex(bb.getFirstNode(), result, pos) }
and then replace the call `bbIndex(this.getFirstNode(), result, pos)` with `getNode(this, pos)`.
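For readers unfamiliar with `shortestDistances`: the suggestion above amounts to numbering each node by its distance from the first node of its basic block, and then looking nodes up by that number. The Python sketch below only illustrates that idea; it is not the library's implementation, and `bb_index`, `get_node`, and the tiny graph are hypothetical names invented for this example.

```python
from collections import deque


def bb_index(starts, intra_succ):
    """For each block start, map every node in that block to its distance from the start.

    `starts` lists the first node of each basic block (hypothetical input) and
    `intra_succ` maps a node to its successors *within* the same block.
    """
    index = {}
    for start in starts:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in intra_succ.get(node, ()):
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        index[start] = dist
    return index


def get_node(index, bb_start, pos):
    """Return the node at zero-based position `pos` in the block starting at `bb_start`."""
    for node, i in index[bb_start].items():
        if i == pos:
            return node
    return None


# Assumed example: a -> b -> c forms one block; d and e each start their own block.
starts = ["a", "d", "e"]
intra_succ = {"a": ["b"], "b": ["c"]}
index = bb_index(starts, intra_succ)
```

Here `get_node` plays the role of the thin wrapper that callers use, while `bb_index` is the helper that stays private to the module.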
* arbitrary order.
*/
cached
JoinPredecessorBasicBlock getJoinBlockPredecessor(JoinBasicBlock jb, int i) {
It should be possible to reference them with the `BasicBlocks::` prefix?
predicate immediatelyControls(BasicBlock succ, SuccessorType s) {
  succ = this.getASuccessor(s) and
  forall(BasicBlock pred | pred = succ.getAPredecessor() and pred != this |
    succ.dominates(pred)
  )
}
In Java, this predicate is called `dominatingEdge` and it includes `bbIDominates(this, succ)` as an additional conjunct. I think that ought to be semantically identical, but it might offer a better join-order since immediate dominance joined with the successor relation yields a much stronger context for the `forall`.
Also, in this general library, it's probably worth it to expand the qldoc a bit on the relationship between the concept of controls and dominance. In short: controls equals edge dominance. See also the qldoc for `dominatingEdge` in Java.
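To make the `forall` condition concrete, here is a small Python sketch (an illustration under assumptions, not the QL implementation). It checks the same condition on a hypothetical `if` without an `else`; the function names and the graph are invented for this example, and successor types are left out for brevity.

```python
def dominators(succ, entry):
    """Iterative dataflow: dom[n] is the set of nodes that dominate n (n included)."""
    nodes = set(succ) | {m for ms in succ.values() for m in ms}
    preds = {n: set() for n in nodes}
    for n, ms in succ.items():
        for m in ms:
            preds[m].add(n)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom, preds


def immediately_controls(dom, preds, succ, block, target):
    """`target` is a successor of `block`, and every other predecessor of
    `target` is dominated by `target` (i.e. it is only reachable via a back edge)."""
    return target in succ.get(block, ()) and all(
        target in dom[p] for p in preds[target] if p != block
    )


# Assumed CFG: cond --true--> then --> join, and cond --false--> join.
succ = {"entry": ["cond"], "cond": ["then", "join"], "then": ["join"], "join": []}
dom, preds = dominators(succ, "entry")

controls_then = immediately_controls(dom, preds, succ, "cond", "then")
controls_join = immediately_controls(dom, preds, succ, "cond", "join")
cond_dominates_join = "cond" in dom["join"]
```

In this graph `cond` dominates `join`, yet it does not immediately control it, because `join` is also reachable through `then`.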
Also, the concept of dominating edge, aka `immediatelyControls`, might very well be applicable beyond `ConditionBasicBlock`s, so it should possibly be available as a top-level predicate between basic blocks.
In particular, this concept can be put into `BasicBlocks.qll` as it's independent of `.isCondition()`.
> I think that ought to be semantically identical
That seems true to me.
> it might offer a better join-order since immediate dominance joined with the successor relation yields a much stronger context for the `forall`.
How can I validate that? I tried looking at the evaluator logs for the predicate with and without that conjunct and didn't see any noticeable difference.
> Also, in this general library, it's probably worth it to expand the qldoc a bit on the relationship between the concept of controls and dominance.
I've written some stuff now for the `immediatelyControls` predicate. Let me know if there's more I should add.
> How can I validate that? I tried looking at the evaluator logs for the predicate with and without that conjunct and didn't see any noticeable difference
Look at the RA with tuple counts and compare the two versions.
Thanks, including the extra conjunct did indeed produce fewer tuples during the calculation 👍
Swift changes LGTM. I'd like to see DCA runs at some point before this is merged.
* The above implies that this block immediately dominates `succ`. But
* "controls" is a stronger notion than "dominates". It is not the case
* that any immediate successor that is immediately dominated by this block
* is also immediately controlled by this block. To see why, consider this
* example corresponding to an `if`-statement without an `else` block:
* ```
* ... --> cond --[true]--> ... --> if stmt
*          \                  /
*           ----[false]-------
* ```
* The basic block for `cond` immediately dominates the immediately
* succeeding basic block for the `if` statement. But the `if` statement
* is not immediately controlled by the `cond` basic block and the `false`
* edge since it is also possible to reach the `if` statement via a path
* through the `true` edge.
I found this hard to read and also missing some main points (also, the placement of `if stmt` in that graph is nonsensical to me). How about this:
* The concept of an edge `E` controlling a node `N` in a graph can also be
* described as _edge dominance_ in the sense that if `E` was split in two
* with an added node in the middle then "controlled by `E`" would be
* equivalent to dominance by that added node.
* Note that controls/edge-dominance is stronger than node dominance as
* it implies node dominance (by either endpoint), but the converse is not
* true, hence the need for this concept.
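The edge-splitting description can be tried out directly. The Python sketch below is an illustration under assumptions rather than anything from the library: it splits an edge with a fresh node and asks whether that node dominates the target. The helper names, the `"<mid>"` placeholder node, and the example graph are all hypothetical.

```python
def dominates(succ, entry, d, n):
    """`d` dominates `n` if `n` cannot be reached from `entry` once `d` is removed."""
    if d == n:
        return True
    seen = set()
    stack = [entry] if d != entry else []
    while stack:
        x = stack.pop()
        if x in seen or x == d:
            continue
        seen.add(x)
        stack.extend(succ.get(x, ()))
    return n not in seen


def split_edge(succ, a, b, mid):
    """Return a copy of the graph where the edge a -> b becomes a -> mid -> b."""
    new = {n: list(ms) for n, ms in succ.items()}
    new[a] = [mid if m == b else m for m in new[a]]
    new[mid] = [b]
    return new


def edge_controls(succ, entry, a, b, n, mid="<mid>"):
    """The edge a -> b controls `n` if the node inserted on that edge dominates `n`."""
    return dominates(split_edge(succ, a, b, mid), entry, mid, n)


# Assumed CFG for an `if` without `else`: cond -> then -> join and cond -> join.
succ = {"entry": ["cond"], "cond": ["then", "join"], "then": ["join"], "join": []}

true_edge_controls_then = edge_controls(succ, "entry", "cond", "then", "then")
false_edge_controls_join = edge_controls(succ, "entry", "cond", "join", "join")
cond_dominates_join = dominates(succ, "entry", "cond", "join")
```

The last two values show the one-way implication: `cond` dominates `join`, but neither outgoing edge of `cond` controls it.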
I feel that this explains the notion of "control" by introducing a synonym for it, "edge dominance", and explaining the synonym instead. Afterwards we have to write "controls/edge-dominance" because there are now two names for the same thing.
Instead, I've now tried to spell out the analogy/relation between dominance and control without introducing another term (it's neither standard nor used elsewhere in the library).
> the placement of `if stmt` in that graph is non-sensical to me
Sorry, it should have been `if` expression, not statement. I've fixed the example and moved it inside the body as internal documentation.
Also, as the "immediately" in `immediatelyControls` is very different from the "immediately" in `immediatelyDominates`, I've moved the controls vs. dominates explanation to the `controls` predicate, which corresponds more nicely with the `dominates`/`strictlyDominates` predicates.
Swift DCA results look good, but there's a 16.2% analysis time regression. That sounds like a lot, but we get a lot of wobble for metrics like this on the Swift DCA job (e.g. a 17.8% slowdown wobble a month ago). I did another run for another data point and got a 13.4% slowdown (and a random failure), which is a little better but far from reassuring. The C# run is showing no such pattern. Can you say anything about the expected performance impact of this work? (I suspect my concern amounts to nothing.)
The expectation is that performance should be the same or improved. Most of the code is identical to the previous code for Swift, and in the cases where it is not, it should perform better.
cached
predicate bbIndex(Node bbStart, Node cfn, int i) =
  shortestDistances(startsBB/1, intraBBSucc/2)(bbStart, cfn, i)
pragma[nomagic]
private predicate bbIndex(Node bbStart, Node cfn, int i) =
  shortestDistances(startsBB/1, intraBBSucc/2)(bbStart, cfn, i)

cached
Node getNode(BasicBlock bb, int pos) {
  bbIndex(bb.getFirstNode(), result, pos)
}
and then change the call in `BasicBlock.getNode` from `bbIndex(this.getFirstNode(), result, pos)` to `result = getNode(this, pos)`
* basic block and which strictly dominates `bb`.
*
* All basic blocks, except entry basic blocks, have a unique immediate
* dominator.
I've updated a bunch of the existing qldoc comments related to immediate dominance. I think this is more precise/correct.
The previous explanations stated that the immediately dominating node would be an immediate predecessor, which I don't think is correct. See for instance this example from Wikipedia where 2 immediately dominates 5, but is not a direct predecessor. Evaluating the predicates also gives immediate dominators that are not direct predecessors.
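The point can be checked on a small diamond-shaped graph. The Python sketch below is illustrative only; the graph is a hypothetical one chosen for this example (not necessarily the Wikipedia graph), and the helper names are invented.

```python
def dominates(succ, entry, d, n):
    """`d` dominates `n` if `n` cannot be reached from `entry` once `d` is removed."""
    if d == n:
        return True
    seen = set()
    stack = [entry] if d != entry else []
    while stack:
        x = stack.pop()
        if x in seen or x == d:
            continue
        seen.add(x)
        stack.extend(succ.get(x, ()))
    return n not in seen


def immediate_dominator(succ, entry, n):
    """The strict dominator of `n` that is dominated by all other strict dominators."""
    strict = [d for d in succ if d != n and dominates(succ, entry, d, n)]
    for d in strict:
        if all(dominates(succ, entry, other, d) for other in strict):
            return d
    return None  # the entry node has no immediate dominator


# Assumed diamond: 1 -> 2, 2 -> 3, 2 -> 4, 3 -> 5, 4 -> 5.
succ = {1: [2], 2: [3, 4], 3: [5], 4: [5], 5: []}

idom_of_5 = immediate_dominator(succ, 1, 5)
preds_of_5 = sorted(p for p, ms in succ.items() if 5 in ms)
```

Here the immediate dominator of node 5 is node 2, while the direct predecessors of 5 are nodes 3 and 4.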
Adds a shared basic block library in the `controlflow` pack and modifies the basic block implementation of Swift, Ruby, C#, and Rust to use it.
A few notes:
- There are two ways to use the basic block library: either through the new `codeql.controlflow.BasicBlock` or through the existing `codeql.controlflow.Cfg`. The former is suitable for languages that don't use the control flow graph library and the latter for those that do. In this PR all languages use the latter, but I've tried instantiating `codeql.controlflow.BasicBlock` for Go and that seems to work fine as well.
- The `BasicBlock` class for C# now has both the current `getASuccessorByType/1` method and a new `getASuccessor/1` method, which it inherits from the basic block library and which is the name used in Ruby, Rust, and Swift. We could consider deprecating `getASuccessorByType/1` in order to avoid having two methods that do the same thing and to increase consistency between language libraries.
- For the `BasicBlock` subclasses in `Cfg.qll` I've changed the current names a bit, such that they are all consistently of the form `${name}BasicBlock`. For instance, `JoinBlock` is now `JoinBasicBlock`. For the existing language-level basic block libraries I've kept the current names for backwards compatibility, so only Rust uses the new names.