Skip to content
This repository was archived by the owner on Nov 6, 2020. It is now read-only.

optimization of U256 #515

Merged
merged 24 commits into from
Feb 26, 2016
Merged

optimization of U256 #515

merged 24 commits into from
Feb 26, 2016

Conversation

NikVolf
Copy link
Contributor

@NikVolf NikVolf commented Feb 24, 2016

before

test u256_add ... bench:     139,774 ns/iter (+/- 8,128)
test u256_mul ... bench:   4,915,555 ns/iter (+/- 436,083)
test u256_sub ... bench:     757,326 ns/iter (+/- 47,169)

after

test u256_add ... bench:      36,160 ns/iter (+/- 2,217)
test u256_mul ... bench:     104,526 ns/iter (+/- 3,128)
test u256_sub ... bench:      35,494 ns/iter (+/- 1,163)

@NikVolf NikVolf added the A0-pleasereview 🤓 Pull request needs code review. label Feb 24, 2016
@debris
Copy link
Collaborator

debris commented Feb 25, 2016

nice, benchmarks are really impressive ^^ 👍

@NikVolf
Copy link
Contributor Author

NikVolf commented Feb 25, 2016

just good old asm :)

@arkpar
Copy link
Collaborator

arkpar commented Feb 25, 2016

What about U512? That's even more of a bottleneck currently.
I think we don't actually need multiplication of 512 -bit uints, We can get away with a custom function that multiplies two U256s into a U512.

fn u256_add(b: &mut Bencher) {
b.iter(|| {
let n = black_box(10000);
(0..n).fold(U256::from(1234599u64), |old, new| { old.overflowing_add(U256::from(new)).0 })
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

benchmark is not very accurate having a call to U256::from on each iteration.

@arkpar arkpar added A4-gotissues 💥 Pull request is reviewed and has significant issues which must be addressed. and removed A0-pleasereview 🤓 Pull request needs code review. labels Feb 25, 2016
@arkpar
Copy link
Collaborator

arkpar commented Feb 25, 2016

Sub::sub is still not using the asm version?

@NikVolf
Copy link
Contributor Author

NikVolf commented Feb 25, 2016

@arkpar sub uses overflowing_add
so it should become better as well

@arkpar
only u256 in this pr
u512 comes next

@arkpar
other issues resolved

@NikVolf NikVolf added A0-pleasereview 🤓 Pull request needs code review. and removed A4-gotissues 💥 Pull request is reviewed and has significant issues which must be addressed. labels Feb 25, 2016
@arkpar arkpar added A4-gotissues 💥 Pull request is reviewed and has significant issues which must be addressed. and removed A0-pleasereview 🤓 Pull request needs code review. labels Feb 26, 2016
@debris
Copy link
Collaborator

debris commented Feb 26, 2016

You can also move assembly code to separate .s files and compile it with gcc. It would make optimization available also on rust beta/stable.

@arkpar
Copy link
Collaborator

arkpar commented Feb 26, 2016

@marek We want the calls to be inlined. Also less less problems with windows later

@NikVolf NikVolf added A0-pleasereview 🤓 Pull request needs code review. and removed A4-gotissues 💥 Pull request is reviewed and has significant issues which must be addressed. labels Feb 26, 2016
@NikVolf
Copy link
Contributor Author

NikVolf commented Feb 26, 2016

removed warning (cause it's macros expansion, had to ignore it)

@arkpar arkpar added A8-looksgood 🦄 Pull request is reviewed well. and removed A0-pleasereview 🤓 Pull request needs code review. labels Feb 26, 2016
})
}

#[cfg(all(feature="x64asm", target_arch = "x86_64"))]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space around =.

@gavofyork
Copy link
Contributor

minor style stuff. can't speak for the logic and i don't see any additional tests... coverage seems to have gone down slightly - is this spurious or an indication that additional code paths need testing?

@gavofyork gavofyork added A5-grumble 🔥 Pull request has minor issues that must be addressed before merging. and removed A8-looksgood 🦄 Pull request is reviewed well. labels Feb 26, 2016
@NikVolf
Copy link
Contributor Author

NikVolf commented Feb 26, 2016

@gavofyork
yeah right
i have plenty of tests here: https://github.com/NikVolf/asm-fun/blob/master/src/lib.rs
will migrate it

@NikVolf NikVolf added A0-pleasereview 🤓 Pull request needs code review. and removed A5-grumble 🔥 Pull request has minor issues that must be addressed before merging. labels Feb 26, 2016
@gavofyork gavofyork added A8-looksgood 🦄 Pull request is reviewed well. and removed A0-pleasereview 🤓 Pull request needs code review. labels Feb 26, 2016
@NikVolf
Copy link
Contributor Author

NikVolf commented Feb 26, 2016

@debris indeed, Arkady right, inlining gives 2x boost to performance for add/sub

gavofyork pushed a commit that referenced this pull request Feb 26, 2016
@gavofyork gavofyork merged commit a51ba5c into master Feb 26, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A8-looksgood 🦄 Pull request is reviewed well.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants