bench: add -quiet and -iters=<n> benchmark config args #22999
Conversation
- use ALLOW_BOOL for -list arg instead of ALLOW_ANY
- touch up `-asymptote=<n1,n2,n3...>` help
- pack Args struct a bit more efficiently
- handle args in alphabetical order
@martinus please let me know if this is ok, as the

Strong Concept ACK on introducing an
Thanks! The -quiet arg was actually my first motivation: to make it less painful to compare or share results. Most of mine carry the verbose >5% err unstable warnings even when all the recommendations are followed, and it takes an annoying amount of time to edit out all the warnings by hand. People can still see the %err, so for sharing results the noisy warnings are quite polluting, and it's very handy to be able to silence them.
I don't think adding more code, and modifying a dependency, just to replicate functionality that already exists is a good idea. What's the problem with using NANOBENCH_SUPPRESS_WARNINGS?
Couldn't you write a bash one-liner to post-process your benchmark output, that throws away lines that don't start with a `|`?
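(For illustration, assuming nanobench's table rows are the lines beginning with `|`, as in the diff context below, such a filter could be as simple as `./src/bench/bench_bitcoin | grep '^|'` — the binary path here is illustrative, not from this PR.)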
```diff
@@ -2278,7 +2279,7 @@ struct IterationLogic::Impl {
             os << col.value();
         }
         os << "| ";
-        auto showUnstable = isWarningsEnabled() && rErrorMedian >= 0.05;
+        auto showUnstable = isWarningsEnabled(mBench.m_quiet) && rErrorMedian >= 0.05;
```
I don't think that warning should ever be hidden. It shows that the value you are seeing is not reliable because the error is too high. If you regularly see that warning in a benchmark, the benchmark should be improved or the machine should be somehow stabilized.
Yes, unfortunately many, if not most, of the benchmarks show the warning for me, despite taking the steps to stabilize the machine. I encountered this extensively while working on #22284, and every time I run benchmarks to test pulls. I now use NANOBENCH_SUPPRESS_WARNINGS to be able to share my results without running a bench dozens of times to get a few without the warnings.
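(For reference, nanobench reads this as an environment variable, so quiet output is already available today by setting it for a run, e.g. `NANOBENCH_SUPPRESS_WARNINGS=1 ./src/bench/bench_bitcoin` — the binary path is illustrative.)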
```diff
@@ -624,6 +624,9 @@ class Bench {
     Bench& operator=(Bench const& other);
     ~Bench() noexcept;

+    //! Whether to suppress warnings and recommendations. Equivalent to NANOBENCH_SUPPRESS_WARNINGS.
+    bool m_quiet{false};
```
I'm not sure about adding support for this though, because this can already be done with the environment variable NANOBENCH_SUPPRESS_WARNINGS, and it is not obvious which of these settings should override the other. Maybe it would be better to just add documentation for NANOBENCH_SUPPRESS_WARNINGS and also NANOBENCH_ENDLESS to the usage documentation?
If we really want that, in nanobench all the configuration should live inside the Config class, so that configuration can be reused in other benchmarks. Then Bench should have a well-documented getter/setter for this, as sketched below. I'd prefer to keep https://github.com/martinus/nanobench and this code here in sync though.
Makes sense.
```cpp
for (auto n : args.asymptote) {
    bench.complexityN(n);
    for (int i = 0; i < args.iters; ++i) {
        if (i == 0 && args.iters > 1) {
```
I do not think adding an iterations argument that works like that is a good idea. nanobench itself already has the ability to perform multiple measurements (called "epochs" in nanobench); in fact, each benchmark is already measured 11 times, which is how it is able to show a measurement error. The number of iterations in each of the measurements is determined automatically, based on the computer's clock accuracy.
If you get unstable benchmark results, the first thing to do should be to make sure the computer is really stable: no frequency scaling, no turbo, no other interfering programs. nanobench shows the warnings for good reason 🙂
If that doesn't help, make sure the actual benchmark itself is stable and always does the same work (little randomness, not many allocations, no threading, locks, etc.).
If that too doesn't help, you can e.g. increase the number of iterations with minEpochIterations. That's a bit problematic though, because some benchmarks need a huge setting here and others a very low one. So it is generally probably better to use minEpochTime and expose that setting in the arguments (probably as a double value in seconds, e.g. `-minEpochTime=0.5`); see the sketch below.
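Both knobs mentioned here exist in nanobench today; a minimal usage sketch follows, where only the wiring of a `-minEpochTime=<seconds>` command line argument is the hypothetical part and is omitted:

```cpp
#include <chrono>
#include <cstdlib>
#include <nanobench.h>

int main()
{
    ankerl::nanobench::Bench bench;

    // Give every measurement epoch at least 0.5s of runtime rather than
    // the automatically determined duration; slower, but more stable.
    bench.minEpochTime(std::chrono::milliseconds{500});

    // Alternative: force a minimum iteration count per epoch. Harder to
    // expose as one global setting, since a good value is benchmark-specific.
    // bench.minEpochIterations(10000);

    bench.run("rand", [] {
        ankerl::nanobench::doNotOptimizeAway(std::rand());
    });
}
```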
Yes, I've gone through these things. In practice, what I see people doing to share and compare results in PR reviews is running a benchmark repeatedly, for which the `-iters=<n>` proposal here is a handy convenience.
I'm going to close this and use these features for my own Bitcoin Core benchmarking. If anyone else would like to use them, they can cherry-pick this branch for their own use. Thanks for your replies, @martinus. Feel free to add bench documentation as you suggest in #22999 (comment) and I'll be happy to review it.
e148a52 bench: fixed ubsan implicit conversion (Martin Ankerl)
da4e2f1 bench: various args improvements (Jon Atack)
d312fd9 bench: clean up includes (Jon Atack)
1f10f16 bench: add usage description and documentation (Martin Ankerl)
d3c6f8b bench: introduce -min_time argument (Martin Ankerl)
9fef832 bench: make EvictionProtection.* work with any number of iterations (Martin Ankerl)
153e686 bench: change AddrManGood to AddrManAddThenGood (Martin Ankerl)
468b232 bench: remove unnecessary & incorrect multiplication in MuHashDiv (Martin Ankerl)
eed99cf bench: update nanobench from 4.3.4 to 4.3.6 (Martin Ankerl)

Pull request description:

This PR updates nanobench to the latest release from upstream, v4.3.6. It fixes the missing performance counters.

Due to discussions on #22999 I have done some work that should make the benchmark results more reliable. It introduces a new flag `-min_time` that allows running a benchmark for much longer than the default. When results are unreliable, choosing a large timeframe here should usually yield repeatable results even when frequency scaling cannot be disabled. The default is now 10ms.

For this to work I have changed the `AddrManGood` and `EvictionProtection` benchmarks so they work with any number of iterations. Also, this adds more usage documentation to `bench_bitcoin -h`, and I've cherry-picked two changes from #22999 authored by Jon Atack.

ACKs for top commit:
jonatack: re-ACK e148a52
laanwj: Code review ACK e148a52

Tree-SHA512: 2da6de19a5c85ac234b190025e195c727546166dbb75e3f9267e667a73677ba1e29b7765877418a42b1407b65df901e0130763936525e6f1450f18f08837c40c
Running a benchmark 10 times, twice (before and after), for #22974, and then editing the output by hand to remove the warnings and recommendations, brought home the point that it would be nice to be able to do that automatically. This PR adds a `-quiet` arg to silence warnings and recommendations, and an `-iters=<n>` arg to run each benchmark for the number of iterations passed.

Examples