optimization of U256 #515

NikVolf · 2016-02-24T23:12:56Z

before

test u256_add ... bench:     139,774 ns/iter (+/- 8,128)
test u256_mul ... bench:   4,915,555 ns/iter (+/- 436,083)
test u256_sub ... bench:     757,326 ns/iter (+/- 47,169)

after

test u256_add ... bench:      36,160 ns/iter (+/- 2,217)
test u256_mul ... bench:     104,526 ns/iter (+/- 3,128)
test u256_sub ... bench:      35,494 ns/iter (+/- 1,163)

debris · 2016-02-25T00:32:45Z

nice, benchmarks are really impressive ^^ 👍

NikVolf · 2016-02-25T00:37:50Z

just good old asm :)

arkpar · 2016-02-25T09:08:10Z

What about U512? That's even more of a bottleneck currently.
I think we don't actually need multiplication of 512-bit uints. We can get away with a custom function that multiplies two U256s into a U512.
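To make the suggestion above concrete, here is a minimal sketch of a widening multiply that takes two 256-bit operands and returns the full 512-bit product. It assumes little-endian `u64` limbs in plain arrays; the names `Limbs256`, `Limbs512` and `full_mul` are hypothetical and are not the project's `U256`/`U512` API. It is portable Rust using `u128` intermediates, not the inline assembly from this PR.

```rust
/// Hypothetical stand-ins: little-endian u64 limbs, not the real U256/U512 types.
type Limbs256 = [u64; 4];
type Limbs512 = [u64; 8];

/// Schoolbook multiplication of two 256-bit values into a full 512-bit product.
fn full_mul(a: &Limbs256, b: &Limbs256) -> Limbs512 {
    let mut out = [0u64; 8];
    for i in 0..4 {
        let mut carry: u128 = 0;
        for j in 0..4 {
            // a[i] * b[j] + out[i + j] + carry is at most 2^128 - 1, so it fits in u128.
            let t = (a[i] as u128) * (b[j] as u128) + (out[i + j] as u128) + carry;
            out[i + j] = t as u64; // low 64 bits stay in this column
            carry = t >> 64; // high 64 bits move to the next column
        }
        // Rows 0..i have only written up to index i + 3, so this slot is still zero.
        out[i + 4] = carry as u64;
    }
    out
}
```

Because the result is twice as wide as the inputs, the product can never overflow, which is why no general 512-bit multiplication is needed for this case.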

arkpar · 2016-02-25T09:10:10Z

util/benches/bigint.rs

+fn u256_add(b: &mut Bencher) {
+	b.iter(|| {
+		let n = black_box(10000);
+		(0..n).fold(U256::from(1234599u64), |old, new| { old.overflowing_add(U256::from(new)).0 })


The benchmark is not very accurate, since it calls U256::from on each iteration.
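One way to address this is to build the operands before the timed region, so that only the additions are measured. The sketch below assumes a stand-in type `Word256` and a helper `time_adds` (both hypothetical, not the project's `U256` or its `Bencher`-based benchmark) and uses `std::time::Instant` instead of the nightly `test` crate so that it stays self-contained.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for U256: four little-endian u64 limbs.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Word256([u64; 4]);

impl Word256 {
    fn from_u64(v: u64) -> Self {
        Word256([v, 0, 0, 0])
    }

    /// Limb-by-limb addition that also reports the final carry.
    fn overflowing_add(self, other: Self) -> (Self, bool) {
        let mut out = [0u64; 4];
        let mut carry = false;
        for i in 0..4 {
            let (s, c1) = self.0[i].overflowing_add(other.0[i]);
            let (s, c2) = s.overflowing_add(carry as u64);
            out[i] = s;
            carry = c1 || c2;
        }
        (Word256(out), carry)
    }
}

/// Converts all operands first, then times only the fold over the additions.
fn time_adds(n: u64) -> (Word256, Duration) {
    let operands: Vec<Word256> = (0..n).map(Word256::from_u64).collect();
    let start = Instant::now();
    let sum = operands
        .iter()
        .fold(Word256::from_u64(1_234_599), |acc, x| {
            acc.overflowing_add(black_box(*x)).0
        });
    (black_box(sum), start.elapsed())
}
```

The trade-off is that the operands are now read from memory, which adds its own cost; whether that is preferable to converting inside the loop depends on what the benchmark is meant to isolate.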

arkpar · 2016-02-25T09:39:04Z

Sub::sub is still not using the asm version?

NikVolf · 2016-02-25T13:58:14Z

@arkpar sub uses overflowing_add
so it should become better as well

@arkpar
only u256 in this pr
u512 comes next

@arkpar
other issues resolved
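For readers wondering how subtraction can reuse an addition routine: in two's complement, a - b equals a + !b + 1 modulo 2^256, and a borrow occurred exactly when that addition produces no carry out. The sketch below illustrates the identity on plain limb arrays. The names `add_with_carry` and `overflowing_sub` are chosen for illustration, and whether `Sub::sub` in this PR was wired up exactly this way is an assumption.

```rust
/// Hypothetical stand-in for U256: four little-endian u64 limbs.
type Limbs = [u64; 4];

/// 256-bit addition with an explicit carry-in; returns the sum and the carry-out.
fn add_with_carry(a: &Limbs, b: &Limbs, carry_in: bool) -> (Limbs, bool) {
    let mut out = [0u64; 4];
    let mut carry = carry_in;
    for i in 0..4 {
        let (s, c1) = a[i].overflowing_add(b[i]);
        let (s, c2) = s.overflowing_add(carry as u64);
        out[i] = s;
        carry = c1 || c2;
    }
    (out, carry)
}

/// a - b computed as a + !b + 1; the bool reports a borrow (that is, a < b).
fn overflowing_sub(a: &Limbs, b: &Limbs) -> (Limbs, bool) {
    let not_b = [!b[0], !b[1], !b[2], !b[3]];
    // The carry-in of 1 supplies the "+ 1" of the two's complement negation.
    let (diff, carry) = add_with_carry(a, &not_b, true);
    (diff, !carry)
}
```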

debris · 2016-02-26T09:05:05Z

You can also move assembly code to separate .s files and compile it with gcc. It would make optimization available also on rust beta/stable.

arkpar · 2016-02-26T10:20:15Z

@marek We want the calls to be inlined. Also fewer problems with Windows later.

NikVolf · 2016-02-26T11:52:07Z

removed the warning (it comes from macro expansion, so I had to ignore it)

gavofyork · 2016-02-26T12:36:56Z

util/src/uint.rs

+	})
+}
+
+#[cfg(all(feature="x64asm", target_arch = "x86_64"))]


space around =.
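As an illustration of the gating pattern under review, the sketch below pairs an assembly path behind `feature = "x64asm"` with a portable fallback of the same signature, both marked `#[inline(always)]`. It is not the code from this PR: the function name `add256` is hypothetical, it takes plain limb arrays rather than `U256`, and it uses today's stable `std::arch::asm!` syntax, which differs from the nightly `asm!` macro available in 2016.

```rust
/// Assembly path: compiled only when the (assumed) "x64asm" feature is on
/// and the target is x86_64.
#[cfg(all(feature = "x64asm", target_arch = "x86_64"))]
#[inline(always)]
pub fn add256(a: [u64; 4], b: [u64; 4]) -> ([u64; 4], bool) {
    use std::arch::asm;
    let [mut r0, mut r1, mut r2, mut r3] = a;
    let carry: u8;
    unsafe {
        asm!(
            "add {r0}, {b0}", // lowest limb, sets the carry flag
            "adc {r1}, {b1}", // each adc consumes and regenerates the carry
            "adc {r2}, {b2}",
            "adc {r3}, {b3}",
            "setc {c}",       // capture the final carry as 0 or 1
            r0 = inout(reg) r0,
            r1 = inout(reg) r1,
            r2 = inout(reg) r2,
            r3 = inout(reg) r3,
            b0 = in(reg) b[0],
            b1 = in(reg) b[1],
            b2 = in(reg) b[2],
            b3 = in(reg) b[3],
            c = out(reg_byte) carry,
            options(pure, nomem, nostack),
        );
    }
    ([r0, r1, r2, r3], carry != 0)
}

/// Portable fallback with the same signature for every other configuration.
#[cfg(not(all(feature = "x64asm", target_arch = "x86_64")))]
#[inline(always)]
pub fn add256(a: [u64; 4], b: [u64; 4]) -> ([u64; 4], bool) {
    let mut out = [0u64; 4];
    let mut carry = false;
    for i in 0..4 {
        let (s, c1) = a[i].overflowing_add(b[i]);
        let (s, c2) = s.overflowing_add(carry as u64);
        out[i] = s;
        carry = c1 || c2;
    }
    (out, carry)
}
```

Keeping both variants behind one name means callers do not need their own `cfg` attributes, and keeping the assembly inline rather than in a separate `.s` file lets the compiler inline the call, which is the concern raised elsewhere in this thread.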

gavofyork · 2016-02-26T12:40:53Z

minor style stuff. can't speak for the logic and i don't see any additional tests... coverage seems to have gone down slightly - is this spurious or an indication that additional code paths need testing?

NikVolf · 2016-02-26T12:43:26Z

@gavofyork
yeah right
i have plenty of tests here: https://github.com/NikVolf/asm-fun/blob/master/src/lib.rs
will migrate it

NikVolf · 2016-02-26T15:47:17Z

@debris indeed, Arkady is right: inlining gives a 2x performance boost for add/sub

optimization of U256

NikVolf added 6 commits February 24, 2016 21:17

u256 to inline assembly opt

dd8652d

r m/r + setc/xor

476bb85

sub x64 optimize

7821505

mul, bench showtime

ccaa194

fix naughty macros

0794049

Merge branch 'master' into bigint-opt

370d901

NikVolf added the A0-pleasereview 🤓 Pull request needs code review. label Feb 24, 2016

NikVolf added 2 commits February 25, 2016 03:09

inline

da69ea5

inline test

ae76a50

NikVolf mentioned this pull request Feb 25, 2016

Optimise bigint arithmetic #490

Closed

arkpar reviewed Feb 25, 2016

arkpar added A4-gotissues 💥 Pull request is reviewed and has significant issues which must be addressed. and removed A0-pleasereview 🤓 Pull request needs code review. labels Feb 25, 2016

NikVolf added 3 commits February 25, 2016 16:20

fixed mul, fixed register pref

f17d893

fix bench iter

5467b06

specific feature for asm opt

fb5779a

NikVolf added A0-pleasereview 🤓 Pull request needs code review. and removed A4-gotissues 💥 Pull request is reviewed and has significant issues which must be addressed. labels Feb 25, 2016

NikVolf added 4 commits February 25, 2016 17:59

removed artefact cls/pushf/popf

7525ff2

overflowing_sub in sub

864e754

counter jump better

5d22ad3

mistake of ne/jcxz

2ee4a0c

arkpar added A4-gotissues 💥 Pull request is reviewed and has significant issues which must be addressed. and removed A0-pleasereview 🤓 Pull request needs code review. labels Feb 26, 2016

allow dead code for macros expansion

f29417e

NikVolf added A0-pleasereview 🤓 Pull request needs code review. and removed A4-gotissues 💥 Pull request is reviewed and has significant issues which must be addressed. labels Feb 26, 2016

arkpar added A8-looksgood 🦄 Pull request is reviewed well. and removed A0-pleasereview 🤓 Pull request needs code review. labels Feb 26, 2016

gavofyork reviewed Feb 26, 2016

gavofyork added A5-grumble 🔥 Pull request has minor issues that must be addressed before merging. and removed A8-looksgood 🦄 Pull request is reviewed well. labels Feb 26, 2016

NikVolf added 5 commits February 26, 2016 15:56

[ci skip] style fixes, multipart add test

e95538f

[ci skip] multipart sub test

228e3fe

[ci skip] mul multipart tests

3858a20

mul overflow multipart test

023c623

naughty overflow bug fixed

5013c4d

NikVolf added A0-pleasereview 🤓 Pull request needs code review. and removed A5-grumble 🔥 Pull request has minor issues that must be addressed before merging. labels Feb 26, 2016

gavofyork added A8-looksgood 🦄 Pull request is reviewed well. and removed A0-pleasereview 🤓 Pull request needs code review. labels Feb 26, 2016

gavofyork pushed a commit that referenced this pull request Feb 26, 2016

Merge pull request #515 from ethcore/bigint-opt

a51ba5c

optimization of U256

gavofyork merged commit a51ba5c into master Feb 26, 2016
