
Remove for loop in parseSize to enable inlining #580

Merged · 1 commit · Oct 17, 2017

Conversation

@wallyqs (Member) commented Sep 8, 2017

Using a goto-based loop turns parseSize into a leaf function that the compiler can inline, which gives a slight performance increase since this is on the fast path:

# before
go test ./... -gcflags="-m -m" -run TestParseSize -v 
...
server/util.go:24:6: cannot inline parseSize: unhandled op range

# after
go test ./... -gcflags="-m -m" -run TestParseSize -v
server/util.go:24:6: can inline parseSize as: func([]byte) int {...
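
For reference, a minimal sketch of what the goto-based rewrite can look like; the constant names asciiZero/asciiNine and the exact layout are illustrative assumptions, not a quote of the merged diff:

// Sketch only: asciiZero and asciiNine are assumed helper constants.
const (
	asciiZero = 48 // '0'
	asciiNine = 57 // '9'
)

// parseSize parses an ASCII decimal number from d, returning -1 on error.
func parseSize(d []byte) (n int) {
	l := len(d)
	if l == 0 {
		return -1
	}
	var (
		i   int
		dec byte
	)
loop:
	// Validate and accumulate one digit per pass.
	dec = d[i]
	if dec < asciiZero || dec > asciiNine {
		return -1
	}
	n = n*10 + (int(dec) - asciiZero)
	i++
	if i < l {
		goto loop
	}
	return n
}

Replacing the range loop with a manual index and a goto is what removes the "unhandled op range" obstacle reported by the inliner above, keeping parseSize a leaf function the compiler can inline.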

Local benchmarks:

benchmark                           old ns/op     new ns/op     delta
BenchmarkParseSize-4                8.81          5.74          -34.85%
Benchmark_____Pub0b_Payload-4       92.9          85.6          -7.86%
Benchmark_____Pub8b_Payload-4       96.1          86.4          -10.09%
Benchmark____Pub32b_Payload-4       106           99.7          -5.94%
Benchmark___Pub128B_Payload-4       131           129           -1.53%
Benchmark___Pub256B_Payload-4       152           143           -5.92%
Benchmark_____Pub1K_Payload-4       348           329           -5.46%
Benchmark_____Pub4K_Payload-4       1474          1440          -2.31%
Benchmark_____Pub8K_Payload-4       3330          3305          -0.75%

benchmark                           old MB/s     new MB/s     speedup
BenchmarkParseSize-4                113.53       174.29       1.54x
Benchmark_____Pub0b_Payload-4       118.45       128.51       1.08x
Benchmark_____Pub8b_Payload-4       197.74       219.83       1.11x
Benchmark____Pub32b_Payload-4       413.47       441.53       1.07x
Benchmark___Pub128B_Payload-4       1072.63      1090.36      1.02x
Benchmark___Pub256B_Payload-4       1765.96      1878.76      1.06x
Benchmark_____Pub1K_Payload-4       2979.37      3150.57      1.06x
Benchmark_____Pub4K_Payload-4       2788.11      2852.61      1.02x
Benchmark_____Pub8K_Payload-4       2464.19      2482.35      1.01x
  • Link to issue, e.g. Resolves #NNN
  • Documentation added (if applicable)
  • Tests added
  • Branch rebased on top of current master (git pull --rebase origin master)
  • Changes squashed to a single commit (described here)
  • Build is green in Travis CI
  • You have certified that the contribution is your original work and that you license the work to the project under the MIT license

/cc @nats-io/core

@coveralls (Coverage Status)

Coverage decreased (-0.02%) to 91.921% when pulling d19586b on wallyqs:goto-parsesize into b58178d on nats-io:master.

@ghost commented Sep 8, 2017

👍

@tylertreat (Contributor)

lgtm

@kozlovic (Member) left a comment

Except for a nitpick, lgtm.

server/util.go Outdated
var i, l int
var dec byte
l = len(d)
if l == 0 {
@kozlovic (Member):

nitpick: Could you replace var declarations above with this:

l := len(d)
if l == 0 {
  return -1
}
var (
  i int
  dec byte
)

(assuming that this does not affect performance)

@wallyqs (Member, Author) replied:

Thanks, updated and confirmed we also get a small performance boost:

benchmark                           old ns/op     new ns/op     delta
BenchmarkParseSize-4                8.81          5.73          -34.96%
Benchmark_____Pub0b_Payload-4       92.9          85.0          -8.50%
Benchmark_____Pub8b_Payload-4       96.1          85.4          -11.13%
Benchmark____Pub32b_Payload-4       106           99.3          -6.32%
Benchmark___Pub128B_Payload-4       131           122           -6.87%
Benchmark___Pub256B_Payload-4       152           146           -3.95%
Benchmark_____Pub1K_Payload-4       348           332           -4.60%
Benchmark_____Pub4K_Payload-4       1474          1428          -3.12%
Benchmark_____Pub8K_Payload-4       3330          3257          -2.19%

benchmark                           old MB/s     new MB/s     speedup
BenchmarkParseSize-4                113.53       174.60       1.54x
Benchmark_____Pub0b_Payload-4       118.45       129.37       1.09x
Benchmark_____Pub8b_Payload-4       197.74       222.43       1.12x
Benchmark____Pub32b_Payload-4       413.47       442.96       1.07x
Benchmark___Pub128B_Payload-4       1072.63      1154.81      1.08x
Benchmark___Pub256B_Payload-4       1765.96      1835.92      1.04x
Benchmark_____Pub1K_Payload-4       2979.37      3117.37      1.05x
Benchmark_____Pub4K_Payload-4       2788.11      2877.11      1.03x
Benchmark_____Pub8K_Payload-4       2464.19      2519.17      1.02x

Using a goto based loop makes it become a leaf function which can be
inlined, making us get a slight performance increase in the fast path.
See: golang/go#14768
@coveralls (Coverage Status)

Coverage decreased (-0.01%) to 91.955% when pulling cd86c99 on wallyqs:goto-parsesize into 0c3d4ce on nats-io:master.

@petemiron self-requested a review September 20, 2017 19:03
@petemiron (Contributor) left a comment

LGTM.

@kozlovic merged commit a2d2327 into nats-io:master Oct 17, 2017
@wallyqs deleted the goto-parsesize branch October 17, 2017 22:44