pow is bottleneck, use faster pow! #2

SimonDanisch · 2019-01-09T15:50:17Z

Since the algorithm doesn't seem to need much precision, it works out pretty well:

It's 2.5x faster with fast_pow.
If the loss in precision isn't acceptable (I think it's a pretty rough approximation), removing @fastmath ironically also speeds things up, since the code then uses llvm.pow, which seems to optimize better!
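For readers unfamiliar with this kind of trade-off, here is a minimal sketch of one well-known bit-level power approximation (commonly attributed to Martin Ankerl). It is not necessarily what this PR's fast_pow does; the name `approx_pow` and the choice of technique are assumptions made for illustration only.

```julia
# Rough approximation of a^b for positive Float64 `a`.
# Hypothetical illustration; NOT taken from this PR's fast_pow.
function approx_pow(a::Float64, b::Float64)
    # Upper 32 bits of the IEEE-754 representation (sign, exponent, top of mantissa).
    hi = (reinterpret(UInt64, a) >> 32) % Int32
    # Scale the offset from the bit pattern of 1.0 by b; this is a crude
    # exponential/logarithm approximation done in integer space.
    hi_new = unsafe_trunc(Int32, b * (hi - 1072632447) + 1072632447)
    # Rebuild a Float64 with the lower 32 bits zeroed.
    return reinterpret(Float64, UInt64(reinterpret(UInt32, hi_new)) << 32)
end

println("exact:  ", 4.0^0.5)
println("approx: ", approx_pow(4.0, 0.5))
```

With these inputs the approximation comes out at about 1.97 instead of 2.0, which gives a feel for how rough the result can be.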

codecov-io · 2019-01-09T15:55:10Z

Codecov Report

Merging #2 into master will increase coverage by 2.63%.
The diff coverage is 84.21%.

@@            Coverage Diff            @@
##           master      #2      +/-   ##
=========================================
+ Coverage   83.76%   86.4%   +2.63%     
=========================================
  Files           3       3              
  Lines         117     125       +8     
=========================================
+ Hits           98     108      +10     
+ Misses         19      17       -2

| Impacted Files | Coverage Δ |
|---|---|
| src/umap_.jl | `85.59% <84.21%> (+2.86%)` ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update eb740f6...7f2bfa7.

SimonDanisch · 2019-01-09T17:27:13Z

Interestingly, the speedup is much less pronounced when using the full dataset (I had benchmarked on only a small portion). I guess the nearest-neighbor search becomes the more expensive part then.

dillondaudert · 2019-01-10T00:42:32Z

Thanks for making this PR!

I plan on doing some profiling to get a sense of how costly each component (including pow) is overall. It'd be nice to get a comparison of the three options you mention (the current implementation, without @fastmath, and with fast_pow). I also have some more general thoughts.
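A comparison like this could be sketched roughly as follows, using only Base's `@elapsed`. The helper `sum_pow`, the input sizes, and the exponent are hypothetical choices, and only the plain and `@fastmath` variants are timed here; a fast_pow variant could be passed in the same way.

```julia
# Sum powfn(x, b) over xs; passing the power function in lets each variant be timed alike.
function sum_pow(powfn, xs, b)
    s = 0.0
    for x in xs
        s += powfn(x, b)
    end
    return s
end

plain_pow(a, b) = a^b                 # regular power
fastmath_pow(a, b) = @fastmath a^b    # power under @fastmath

xs = rand(10^6) .+ 0.5   # hypothetical positive inputs
b = 0.8                  # hypothetical exponent

for (name, f) in (("plain ^", plain_pow), ("@fastmath ^", fastmath_pow))
    sum_pow(f, xs, b)                 # warm-up call so compilation is not timed
    t = @elapsed sum_pow(f, xs, b)
    println(rpad(name, 12), " ", round(t * 1e3; digits=2), " ms")
end
```

Timings will vary by machine and Julia version, so a single run like this is only a first indication; a dedicated benchmarking package would give more stable numbers.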

I took a look at the source of fast_pow, and one thing that doesn't sit well with me is that it doesn't seem rigorously tested: its performance and error characteristics aren't really known, which makes them impossible to reason about. It's true that SGD tolerates noise, but it still seems risky. I'll have to think more about the potential impact.

Another thing I'm wondering about is whether the implementation of this function should reside in UMAP.jl itself. I haven't looked, but there might be a numerical-methods package that is more appropriate, which we could then import from.

dillondaudert · 2019-01-11T14:42:58Z

@fastmath has been removed (f61ad40), which shaved about 20 seconds off the MNIST example runtime when I tested it.

dillondaudert · 2019-01-17T17:22:42Z

I had a look at the R implementation of UMAP today. It looks like they use the same approximate power, enabled by a keyword argument. It seems acceptable to me to incorporate it in a similar way here.
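One way such an opt-in switch might look is sketched below. Everything here is hypothetical: the function `grad_coeff`, the keyword name `approx_pow`, the default values of `a` and `b`, and the Float32-based stand-in for a cheaper power are illustrative assumptions, not UMAP.jl's API. The expression is modeled on the attractive-gradient coefficient found in UMAP implementations.

```julia
# Stand-in for a cheaper, less precise power (hypothetical; not this PR's fast_pow).
low_precision_pow(a, b) = Float64(Float32(a)^Float32(b))

# Hypothetical helper: the keyword selects which power function is used.
function grad_coeff(dist_sq; a=1.9, b=0.8, approx_pow::Bool=false)
    pow = approx_pow ? low_precision_pow : (^)
    return (-2a * b * pow(dist_sq, b - 1)) / (1 + a * pow(dist_sq, b))
end

println(grad_coeff(2.0))                    # exact power (default)
println(grad_coeff(2.0; approx_pow=true))   # approximate power, opt-in
```

Keeping the exact power as the default means existing results do not change unless a caller explicitly asks for the approximation.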

pow is bottleneck, use faster pow!

7f2bfa7

Since this algorithm doesn't seem to need to be precise, it works out pretty well:

SimonDanisch mentioned this pull request May 4, 2020

UMAP performance #13

Open
