Game of GO benchmark #1169

Closed
ViralBShah opened this issue Aug 17, 2012 · 4 comments
Labels
performance Must go faster

Comments

@ViralBShah
Member

Issue to track the Game of GO benchmark

https://groups.google.com/d/topic/julia-dev/8uIjpx-YTKw/discussion
Code from https://gist.github.com/3373404

@ViralBShah
Member Author

JeffBezanson added a commit that referenced this issue Nov 16, 2012
this is an optimization and also makes it easier to get callback pointers.
closes #938. sparse on Range 3x faster
helps #1211 (ziggurat), about 25% faster
helps #1169 (game of go), about 25% faster
helps #939 (sortperm), about 25% faster
helps #1163 (graph centrality) a bit, about 10% faster
@quinnj
Member

quinnj commented Jun 19, 2013

So I played around with this last night, incorporating the pending @inbounds macro and using the profiler. My modified code runs in about 9.2s, which is about 1.7x the gcc -O0 time and 4.8x the gcc -O3 time. It's also more than 2x faster than the original Julia code.
One interesting note was the apparent slowness of mod, as shown in the profiling results below:

635 ...pbox/go.jl; ...dditional_liberty; line: 123
191 ...pbox/go.jl; ...dditional_liberty; line: 124

Where lines 123 and 124 correspond to:

ai = 1 + mod(pos - 1, board.size)
aj = 1 + fld(pos - 1, board.size)

I guess I wouldn't expect mod to be that much slower than fld (I ran the profiler a few times just to check it wasn't a sampling thing).
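
For reference, a minimal sketch of the index arithmetic on those two lines. The function names `coords_mod` and `coords_divrem` are hypothetical, and `board_size` stands in for `board.size`. It assumes `pos >= 1` and a positive board size, in which case truncated and floored division agree and `divrem` yields both values at once. This is only an illustration of the equivalence, not a claim about what the gist does or what would be fastest:

```julia
# Hypothetical helpers; `board_size` plays the role of `board.size`.

# Convert a 1-based linear position to 1-based (ai, aj), as on lines 123-124.
function coords_mod(pos::Int, board_size::Int)
    ai = 1 + mod(pos - 1, board_size)
    aj = 1 + fld(pos - 1, board_size)
    return ai, aj
end

# Assumes pos >= 1 and board_size > 0, so pos - 1 is nonnegative and
# rem/div give the same answers as mod/fld.
function coords_divrem(pos::Int, board_size::Int)
    q, r = divrem(pos - 1, board_size)   # q = div(...), r = rem(...)
    return r + 1, q + 1
end

# Check agreement over every position of a 9x9 board.
for pos in 1:81
    @assert coords_mod(pos, 9) == coords_divrem(pos, 9)
end
```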

Gist of my modified code: https://gist.github.com/karbarcca/5815251

@StefanKarpinski
Member

It's deeply unfortunate that processors implement rem in hardware but not mod since mod is generally the better choice. This forces mod to be implemented in terms of rem and it ends up being significantly slower. Given the absurd excess of transistors modern CPUs have and the crazy number of instructions that the x86_64 architecture has, you would think they could add a frigging mod instruction.
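
To make the extra work concrete, here is a rough sketch of how a floored `mod` can be built on top of `rem`. The name `mod_via_rem` is made up for illustration and this is not Base's actual implementation. It only shows the sign fix-up that has to happen after the hardware remainder:

```julia
# Illustrative only: a floored modulus written in terms of rem.
# `mod_via_rem` is a hypothetical name, not a Base function.
function mod_via_rem(x::Int, y::Int)
    r = rem(x, y)                     # result takes the sign of x
    # When r is nonzero and its sign differs from y's, shift by y
    # so that the result takes the sign of y instead.
    if r != 0 && ((r < 0) != (y < 0))
        r += y
    end
    return r
end

# Compare against Base.mod for mixed-sign operands.
for x in -7:7, y in (-3, -2, 2, 3)
    @assert mod_via_rem(x, y) == mod(x, y)
end
```

The comparison and conditional add after `rem` are the overhead relative to using `rem` directly. When both operands are known to be nonnegative, `rem` and `mod` return the same value.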

@JeffBezanson
Member

Much faster after #4042.

ViralBShah added a commit that referenced this issue Feb 3, 2025
)

Stdlib: LinearAlgebra
URL: https://github.com/JuliaLang/LinearAlgebra.jl.git
Stdlib branch: master
Julia branch: master
Old commit: 57e9a0d
New commit: c9ad828
Julia version: 1.12.0-DEV
LinearAlgebra version: 1.12.0
Bump invoked by: @ViralBShah
Powered by:
[BumpStdlibs.jl](https://github.com/JuliaLang/BumpStdlibs.jl)

Diff:
JuliaLang/LinearAlgebra.jl@57e9a0d...c9ad828

```
$ git log --oneline 57e9a0d..c9ad828
c9ad828 Fix #1164 - flaky posv test (#1166)
106da87 Merge branch 'master' into vs/1164
443aa0f update manifest from JuliaSyntaxHighlighting UUID change (#1188)
0ce073c update manifest from JuliaSyntaxHighlighting UUID change
e05561b Merge branch 'master' into vs/1164
55eddfc Fix structure test for strided matrices (#1184)
1a0135a Let `cond` of an empty square matrix return zero (#1169)
7542f75 Fix test
6f2c5df Fix structure test for strided matrices
697ee4f Fix #1164
```

Co-authored-by: ViralBShah <[email protected]>
Co-authored-by: Viral B. Shah <[email protected]>