Adding Bitand, Neg, lowbit tests, and lowbit optimizations #714
Looks great! Did you really want all of those to be benchmarks? Perhaps the Fenwick tree should be a benchmark, while your tests should go in tests/passing/small.
We do more testing on the tests, including generating snapshots for the small folder.
Will merge after successful nightly! Being careful right now because it seems like eggcc might have crashed the nightly last night...
;; PopcountIterations guarantees termination for non-zero values
;; lowbit(0) is undefined behavior
(rule (
Doesn't this technically violate weak linearity, since we have not provided an equivalent value for the other values of the then branch?
My understanding is that if you are computing something else in your loop, you will not be able to extract without the loop.
Yihong is right: we need to provide a loop that doesn't have the state edge in it as an alternative. See the existing state edge passthrough file for an example of how it's done for if statements.
If you are confused about why we have to do this, hopefully the paper clears it up! This is a good chance to read that section.
Also, move this rule to that file and ruleset
I have fixed this issue, but I feel it makes more sense for the rule to stay in this file.
Main concern is the state edge passthrough rule
Awesome work! Small note about TmpCtx, not blocking but would be nice. Waiting on nightly to merge this
(let newlpinputs (TupleRemoveAt lpinputs j))
(let newpred_outputs (TupleRemoveAt pred_outputs (+ j 1)))

(let newlpctx (DummyLoopContext newlpinputs newpred_outputs pred_outputs))
These days we've decided to use TmpCtx instead, then make sure to delete it at the end of the rule. It's safer in case you forget to include some important information in the dummy context.
Nightly looks great!
This PR adds two new operators, Bitand and Neg, some tests, and analysis and optimization in Egglog that rewrite the naive loop implementation of the lowbit function into the Hacker's Delight one-liner. This now outperforms llvm-O3-O3 by an order of magnitude on my machine because the closed form is an O(log n) speedup (n = 2 * 10^5).
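For readers unfamiliar with lowbit, here is a minimal Python sketch of the two forms the description contrasts. It is not the benchmark code from this PR: the function names are hypothetical, the shape of the naive loop (scanning upward from bit 0) is an assumption, and raising on 0 is just one way to handle the input that the rule comments call undefined.

```python
def lowbit_naive(x: int) -> int:
    """Return the lowest set bit of x by scanning upward from bit 0.

    Assumed shape of the naive loop; it runs once per trailing zero bit,
    so it takes O(log x) iterations in the worst case.
    """
    if x == 0:
        # lowbit(0) is undefined; without this guard the loop never terminates.
        raise ValueError("lowbit(0) is undefined")
    bit = 1
    while (x & bit) == 0:
        bit <<= 1
    return bit


def lowbit_closed_form(x: int) -> int:
    """Return the lowest set bit of x using the one-liner x & -x.

    Needs only a negation and a bitwise and, which is why the PR
    introduces the Neg and Bitand operators.
    """
    if x == 0:
        raise ValueError("lowbit(0) is undefined")
    return x & -x


if __name__ == "__main__":
    for value in (1, 6, 12, 40, 200_000):
        print(value, lowbit_naive(value), lowbit_closed_form(value))
```

Both functions agree on every positive input, but the closed form replaces the bit scan with two constant-time operations, which is the source of the O(log n) factor mentioned above.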
The RVSDGs before and after the optimization:
[Image: lowbit_naive_br-rvsdg-conversion]
[Image: lowbit_naive_br-rvsdg-optimize]