-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pow is bottleneck, use faster pow! #2
base: master
Are you sure you want to change the base?
Conversation
Since this algorithm doesn't seem to need to be precise, it works out pretty well:
Codecov Report
@@ Coverage Diff @@
## master #2 +/- ##
=========================================
+ Coverage 83.76% 86.4% +2.63%
=========================================
Files 3 3
Lines 117 125 +8
=========================================
+ Hits 98 108 +10
+ Misses 19 17 -2
Continue to review full report at Codecov.
|
Interestingly, the speed up is much less pronounced when using the full dataset (benchmarked on only a small portion)... Guess then the nearest neighbor becomes more expensive. |
Thanks for making this PR! I plan on doing some profiling to get a sense of how costly each component (including I took a look at the source of Another thing I'm wondering about is if the implementation of this function should reside in UMAP.jl itself. I haven't looked, but there might be a numerical methods-related package that is more appropriate (which we could then import from). |
|
Had a look at the R implementation of UMAP today, looks like they use the same approximate power here - enabled by a keyword arg. It seems acceptable to me to incorporate it in a similar way here. |
Since the algorithm doesn't seem to need much precision, it works out pretty well:
![image](https://user-images.githubusercontent.com/1010467/50910598-6cec7e00-142e-11e9-9fa8-cd2427ad9d3e.png)
It's 2.5x faster with
fast_pow
if the loss in precision isn't acceptable ( i think it's a pretty rough approximation), ironically removing
@fastmath
also speed things up, since then it uses llvm.pow, which seems to optimize better!