Optimize IntRange::from_pat
#77075
Previously, this method called the more general `pat_constructor` function, which can return other pattern variants besides `IntRange`, and then threw away any non-`IntRange` variants. Specialize it so work is only done when it could result in an `IntRange`.
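For readers unfamiliar with the code, the shape of the change can be sketched roughly as follows. This is not the actual rustc implementation: `Pat`, `Constructor`, `IntRange`, and both `from_pat_*` functions below are simplified, hypothetical stand-ins that only illustrate going from "build a general constructor, then discard it" to "match only the patterns that can yield an `IntRange`".

```rust
// Hypothetical, heavily simplified stand-ins for the compiler's types.
#[derive(Debug, Clone, PartialEq)]
enum Pat {
    Wild,
    Constant(u128),
    Range { lo: u128, hi: u128 },
    Tuple(Vec<Pat>),
}

#[derive(Debug, Clone, PartialEq)]
struct IntRange {
    lo: u128,
    hi: u128,
}

#[derive(Debug, Clone, PartialEq)]
enum Constructor {
    Single,
    IntRange(IntRange),
}

// General path: produces a constructor for every kind of pattern.
fn pat_constructor(pat: &Pat) -> Option<Constructor> {
    match pat {
        Pat::Wild => None,
        Pat::Constant(v) => Some(Constructor::IntRange(IntRange { lo: *v, hi: *v })),
        Pat::Range { lo, hi } => Some(Constructor::IntRange(IntRange { lo: *lo, hi: *hi })),
        Pat::Tuple(_) => Some(Constructor::Single),
    }
}

// Before: go through the general path, then throw away non-IntRange results.
fn from_pat_general(pat: &Pat) -> Option<IntRange> {
    match pat_constructor(pat)? {
        Constructor::IntRange(range) => Some(range),
        _ => None,
    }
}

// After: only do work for patterns that can actually become an IntRange.
fn from_pat_specialized(pat: &Pat) -> Option<IntRange> {
    match pat {
        Pat::Constant(v) => Some(IntRange { lo: *v, hi: *v }),
        Pat::Range { lo, hi } => Some(IntRange { lo: *lo, hi: *hi }),
        _ => None,
    }
}
```

Both functions are meant to return the same result for every input; the specialized one simply never constructs a `Constructor` value that would be discarded immediately.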
@bors try
Awaiting bors try build completion
⌛ Trying commit 7d8ed78 with merge ec36cf3068e7df6990075cd48e570ae7eee4031e...
☀️ Try build successful - checks-actions, checks-azure
Queued ec36cf3068e7df6990075cd48e570ae7eee4031e with parent e0bc267, future comparison URL.
Finished benchmarking try commit (ec36cf3068e7df6990075cd48e570ae7eee4031e): comparison url. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up. @bors rollup=never
…m-pat, r=Mark-Simulacrum

Optimize `IntRange::from_pat`, then shrink `ParamEnv`

Resolves rust-lang#77058.

r? `@Mark-Simulacrum`
cc `@vandenheuvel`

Looking at the output of `perf report` for rust-lang#76244, the hot instructions seemed to be around the call to `pat_constructor` in `IntRange::from_pat`. I carried out an obvious optimization, but it actually made the instruction count higher (see rust-lang#77075). However, it seems to have mitigated whatever was causing the pipeline stalls, so when combined with rust-lang#76244, it's a net win.

As you can see below, the regression in rust-lang#76244 seems to have originated from something measured by `stalled-cycles-backend`. I'll try to collect some finer-grained stats to see if I can isolate it. I wish I had a better idea of what was going on here. I'd like to prevent the regression from reappearing in the future due to small changes in unrelated code.

<details>
<summary>Current `master`:</summary>

```
 Performance counter stats for 'cargo +baseline-stage1 check':

          2,275.67 msec task-clock:u              #    0.998 CPUs utilized
                 0      context-switches:u        #    0.000 K/sec
                 0      cpu-migrations:u          #    0.000 K/sec
            49,826      page-faults:u             #    0.022 M/sec
     5,117,221,678      cycles:u                  #    2.249 GHz
       299,655,943      stalled-cycles-frontend:u #    5.86% frontend cycles idle
     2,284,213,395      stalled-cycles-backend:u  #   44.64% backend cycles idle
     8,051,871,959      instructions:u            #    1.57  insn per cycle
                                                  #    0.28  stalled cycles per insn
     1,359,589,402      branches:u                #  597.447 M/sec
         7,359,347      branch-misses:u           #    0.54% of all branches

       2.281030026 seconds time elapsed

       2.108197000 seconds user
       0.164183000 seconds sys
```

</details>

<details>
<summary>Shrink `ParamEnv` without changing `IntRange::from_pat`:</summary>

```
 Performance counter stats for 'cargo +perf-stage1 check':

          2,751.79 msec task-clock:u              #    0.996 CPUs utilized
                 0      context-switches:u        #    0.000 K/sec
                 0      cpu-migrations:u          #    0.000 K/sec
            50,103      page-faults:u             #    0.018 M/sec
     6,260,590,019      cycles:u                  #    2.275 GHz
       317,355,920      stalled-cycles-frontend:u #    5.07% frontend cycles idle
     3,397,743,582      stalled-cycles-backend:u  #   54.27% backend cycles idle
     8,276,224,367      instructions:u            #    1.32  insn per cycle
                                                  #    0.41  stalled cycles per insn
     1,370,453,386      branches:u                #  498.023 M/sec
         7,281,031      branch-misses:u           #    0.53% of all branches

       2.763265838 seconds time elapsed

       2.544578000 seconds user
       0.204548000 seconds sys
```

</details>

<details>
<summary>Shrink `ParamEnv` and change `IntRange::from_pat`:</summary>

```
 Performance counter stats for 'cargo +perf-stage1 check':

          2,295.57 msec task-clock:u              #    0.996 CPUs utilized
                 0      context-switches:u        #    0.000 K/sec
                 0      cpu-migrations:u          #    0.000 K/sec
            49,959      page-faults:u             #    0.022 M/sec
     5,151,407,066      cycles:u                  #    2.244 GHz
       324,517,829      stalled-cycles-frontend:u #    6.30% frontend cycles idle
     2,301,671,001      stalled-cycles-backend:u  #   44.68% backend cycles idle
     8,130,868,329      instructions:u            #    1.58  insn per cycle
                                                  #    0.28  stalled cycles per insn
     1,356,618,512      branches:u                #  590.972 M/sec
         7,323,800      branch-misses:u           #    0.54% of all branches

       2.304509653 seconds time elapsed

       2.128090000 seconds user
       0.163909000 seconds sys
```

</details>
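To make the comparison between the three runs easier to follow, the derived ratios can be recomputed from the raw counters. The snippet below is only a sketch: the `PerfSample` type and the constant names are made up for illustration, and the numbers are copied from the `perf stat` output quoted in the commit message above.

```rust
// Hypothetical helper type; the fields mirror three counters from `perf stat`.
struct PerfSample {
    cycles: u64,
    stalled_backend: u64,
    instructions: u64,
}

impl PerfSample {
    // Fraction of cycles in which the backend was idle.
    fn backend_stall_ratio(&self) -> f64 {
        self.stalled_backend as f64 / self.cycles as f64
    }

    // Instructions retired per cycle.
    fn ipc(&self) -> f64 {
        self.instructions as f64 / self.cycles as f64
    }
}

// Counter values copied from the three runs quoted above.
const MASTER: PerfSample = PerfSample {
    cycles: 5_117_221_678,
    stalled_backend: 2_284_213_395,
    instructions: 8_051_871_959,
};

const SHRINK_ONLY: PerfSample = PerfSample {
    cycles: 6_260_590_019,
    stalled_backend: 3_397_743_582,
    instructions: 8_276_224_367,
};

const SHRINK_AND_FROM_PAT: PerfSample = PerfSample {
    cycles: 5_151_407_066,
    stalled_backend: 2_301_671_001,
    instructions: 8_130_868_329,
};

// Cycle count of `sample` relative to `baseline` (1.0 means no change).
fn relative_cycles(sample: &PerfSample, baseline: &PerfSample) -> f64 {
    sample.cycles as f64 / baseline.cycles as f64
}
```

With these numbers, shrinking `ParamEnv` alone costs roughly 22% more cycles than `master` even though the instruction count rises by under 3%, while the combined change stays within about 1% of `master` in cycles. That is consistent with the regression coming from backend stalls rather than from extra instructions.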
Previously, this method called the more general `pat_constructor` function, which can return a variety of constructors, including `IntRange`. Then it threw away any non-`IntRange` variants. Specialize it so work is only done when it could result in an `IntRange`.

This code is relatively hot, at least for the `unicode_normalization` crate.

r? @ghost