
Remove for loop in parseSize to enable inlining #580

Merged · 1 commit · Oct 17, 2017

Conversation

@wallyqs (Member) commented Sep 8, 2017

Using a goto-based loop turns parseSize into a leaf function that the compiler can inline, which gives a slight performance increase since this is on the fast path:

# before
go test ./... -gcflags="-m -m" -run TestParseSize -v 
...
server/util.go:24:6: cannot inline parseSize: unhandled op range

# after
go test ./... -gcflags="-m -m" -run TestParseSize -v
server/util.go:24:6: can inline parseSize as: func([]byte) int {...
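
For reference, a minimal sketch of what the goto-based rewrite can look like; the constant names asciiZero/asciiNine and the exact layout are illustrative assumptions, not a quote of the merged diff:

// Sketch only: asciiZero and asciiNine are assumed helper constants.
const (
	asciiZero = 48 // '0'
	asciiNine = 57 // '9'
)

// parseSize parses an ASCII decimal number from d, returning -1 on error.
func parseSize(d []byte) (n int) {
	l := len(d)
	if l == 0 {
		return -1
	}
	var (
		i   int
		dec byte
	)
loop:
	// Validate and accumulate one digit per pass.
	dec = d[i]
	if dec < asciiZero || dec > asciiNine {
		return -1
	}
	n = n*10 + (int(dec) - asciiZero)
	i++
	if i < l {
		goto loop
	}
	return n
}

Replacing the range loop with a manual index and a goto is what removes the "unhandled op range" obstacle reported by the inliner above, keeping parseSize a leaf function the compiler can inline.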

Local benchmarks:

benchmark                           old ns/op     new ns/op     delta
BenchmarkParseSize-4                8.81          5.74          -34.85%
Benchmark_____Pub0b_Payload-4       92.9          85.6          -7.86%
Benchmark_____Pub8b_Payload-4       96.1          86.4          -10.09%
Benchmark____Pub32b_Payload-4       106           99.7          -5.94%
Benchmark___Pub128B_Payload-4       131           129           -1.53%
Benchmark___Pub256B_Payload-4       152           143           -5.92%
Benchmark_____Pub1K_Payload-4       348           329           -5.46%
Benchmark_____Pub4K_Payload-4       1474          1440          -2.31%
Benchmark_____Pub8K_Payload-4       3330          3305          -0.75%

benchmark                           old MB/s     new MB/s     speedup
BenchmarkParseSize-4                113.53       174.29       1.54x
Benchmark_____Pub0b_Payload-4       118.45       128.51       1.08x
Benchmark_____Pub8b_Payload-4       197.74       219.83       1.11x
Benchmark____Pub32b_Payload-4       413.47       441.53       1.07x
Benchmark___Pub128B_Payload-4       1072.63      1090.36      1.02x
Benchmark___Pub256B_Payload-4       1765.96      1878.76      1.06x
Benchmark_____Pub1K_Payload-4       2979.37      3150.57      1.06x
Benchmark_____Pub4K_Payload-4       2788.11      2852.61      1.02x
Benchmark_____Pub8K_Payload-4       2464.19      2482.35      1.01x
  • Link to issue, e.g. Resolves #NNN
  • Documentation added (if applicable)
  • Tests added
  • Branch rebased on top of current master (git pull --rebase origin master)
  • Changes squashed to a single commit (described here)
  • Build is green in Travis CI
  • You have certified that the contribution is your original work and that you license the work to the project under the MIT license

/cc @nats-io/core

@coveralls (Coverage Status)

Coverage decreased (-0.02%) to 91.921% when pulling d19586b on wallyqs:goto-parsesize into b58178d on nats-io:master.

@ghost commented Sep 8, 2017

👍

@tylertreat (Contributor)

lgtm

@kozlovic (Member) left a comment

Except for a nitpick, lgtm.

server/util.go Outdated
var i, l int
var dec byte
l = len(d)
if l == 0 {
@kozlovic (Member):

nitpick: Could you replace var declarations above with this:

l := len(d)
if l == 0 {
  return -1
}
var (
  i int
  dec byte
)

(assuming that this does not affect performance)

@wallyqs (Member, Author) replied:

Thanks, updated and confirmed we also get a small performance boost:

benchmark                           old ns/op     new ns/op     delta
BenchmarkParseSize-4                8.81          5.73          -34.96%
Benchmark_____Pub0b_Payload-4       92.9          85.0          -8.50%
Benchmark_____Pub8b_Payload-4       96.1          85.4          -11.13%
Benchmark____Pub32b_Payload-4       106           99.3          -6.32%
Benchmark___Pub128B_Payload-4       131           122           -6.87%
Benchmark___Pub256B_Payload-4       152           146           -3.95%
Benchmark_____Pub1K_Payload-4       348           332           -4.60%
Benchmark_____Pub4K_Payload-4       1474          1428          -3.12%
Benchmark_____Pub8K_Payload-4       3330          3257          -2.19%

benchmark                           old MB/s     new MB/s     speedup
BenchmarkParseSize-4                113.53       174.60       1.54x
Benchmark_____Pub0b_Payload-4       118.45       129.37       1.09x
Benchmark_____Pub8b_Payload-4       197.74       222.43       1.12x
Benchmark____Pub32b_Payload-4       413.47       442.96       1.07x
Benchmark___Pub128B_Payload-4       1072.63      1154.81      1.08x
Benchmark___Pub256B_Payload-4       1765.96      1835.92      1.04x
Benchmark_____Pub1K_Payload-4       2979.37      3117.37      1.05x
Benchmark_____Pub4K_Payload-4       2788.11      2877.11      1.03x
Benchmark_____Pub8K_Payload-4       2464.19      2519.17      1.02x

Using a goto based loop makes it become a leaf function which can be
inlined, making us get a slight performance increase in the fast path.
See: golang/go#14768
@coveralls (Coverage Status)

Coverage decreased (-0.01%) to 91.955% when pulling cd86c99 on wallyqs:goto-parsesize into 0c3d4ce on nats-io:master.

@petemiron self-requested a review September 20, 2017 19:03
@petemiron (Contributor) left a comment

LGTM.

@kozlovic merged commit a2d2327 into nats-io:master Oct 17, 2017
@wallyqs deleted the goto-parsesize branch October 17, 2017 22:44