Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize IntRange::from_pat #77075

Conversation

ecstatic-morse
Copy link
Contributor

@ecstatic-morse ecstatic-morse commented Sep 22, 2020

Previously, this method called the more general pat_constructor function, which can return a variety of constructors, including IntRange. Then it threw away any non-IntRange variants. Specialize it so work is only done when it could result in an IntRange.

This code is relatively hot, at least for the unicode_normalization crate.

r? @ghost

Previously, this method called the more general `pat_constructor`
function, which can return other pattern variants besides `IntRange`.
Then it throws away any non-`IntRange` variants. Specialize it so work
is only done when it could result in an `IntRange`.
@ecstatic-morse
Copy link
Contributor Author

@bors try
@rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Contributor

bors commented Sep 22, 2020

⌛ Trying commit 7d8ed78 with merge ec36cf3068e7df6990075cd48e570ae7eee4031e...

@bors
Copy link
Contributor

bors commented Sep 22, 2020

☀️ Try build successful - checks-actions, checks-azure
Build commit: ec36cf3068e7df6990075cd48e570ae7eee4031e (ec36cf3068e7df6990075cd48e570ae7eee4031e)

@rust-timer
Copy link
Collaborator

Queued ec36cf3068e7df6990075cd48e570ae7eee4031e with parent e0bc267, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (ec36cf3068e7df6990075cd48e570ae7eee4031e): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never

bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 29, 2020
…m-pat, r=Mark-Simulacrum

Optimize `IntRange::from_pat`, then shrink `ParamEnv`

Resolves rust-lang#77058.

r? `@Mark-Simulacrum`
cc `@vandenheuvel`

Looking at the output of `perf report` for rust-lang#76244, the hot instructions seemed to be around the call to `pat_constructor` in `IntRange::from_pat`. I carried out an obvious optimization, but it actually made the instruction count higher (see rust-lang#77075). However, it seems to have mitigated whatever was causing the pipeline stalls, so when combined with rust-lang#76244, it's a net win.

As you can see below, the regression in rust-lang#76244 seems to have originated from something measured by `stalled-cycles-backend`. I'll try to collect some finer-grained stats to see if I can isolate it. I wish I had a better idea of what was going on here. I'd like to prevent the regression from reappearing in the future due to small changes in unrelated code.

<details>
<summary>Current `master`:</summary>

```
 Performance counter stats for 'cargo +baseline-stage1 check':

          2,275.67 msec task-clock:u              #    0.998 CPUs utilized
                 0      context-switches:u        #    0.000 K/sec
                 0      cpu-migrations:u          #    0.000 K/sec
            49,826      page-faults:u             #    0.022 M/sec
     5,117,221,678      cycles:u                  #    2.249 GHz
       299,655,943      stalled-cycles-frontend:u #    5.86% frontend cycles idle
     2,284,213,395      stalled-cycles-backend:u  #   44.64% backend cycles idle
     8,051,871,959      instructions:u            #    1.57  insn per cycle
                                                  #    0.28  stalled cycles per insn
     1,359,589,402      branches:u                #  597.447 M/sec
         7,359,347      branch-misses:u           #    0.54% of all branches

       2.281030026 seconds time elapsed

       2.108197000 seconds user
       0.164183000 seconds sys
```
</details>

<details>
<summary>Shrink `ParamEnv` without changing `IntRange::from_pat`:</summary>

```
 Performance counter stats for 'cargo +perf-stage1 check':

          2,751.79 msec task-clock:u              #    0.996 CPUs utilized
                 0      context-switches:u        #    0.000 K/sec
                 0      cpu-migrations:u          #    0.000 K/sec
            50,103      page-faults:u             #    0.018 M/sec
     6,260,590,019      cycles:u                  #    2.275 GHz
       317,355,920      stalled-cycles-frontend:u #    5.07% frontend cycles idle
     3,397,743,582      stalled-cycles-backend:u  #   54.27% backend cycles idle
     8,276,224,367      instructions:u            #    1.32  insn per cycle
                                                  #    0.41  stalled cycles per insn
     1,370,453,386      branches:u                #  498.023 M/sec
         7,281,031      branch-misses:u           #    0.53% of all branches

       2.763265838 seconds time elapsed

       2.544578000 seconds user
       0.204548000 seconds sys
```
</details>

<details>
<summary>Shrink `ParamEnv` and change `IntRange::from_pat`: </summary>

```
 Performance counter stats for 'cargo +perf-stage1 check':

          2,295.57 msec task-clock:u              #    0.996 CPUs utilized
                 0      context-switches:u        #    0.000 K/sec
                 0      cpu-migrations:u          #    0.000 K/sec
            49,959      page-faults:u             #    0.022 M/sec
     5,151,407,066      cycles:u                  #    2.244 GHz
       324,517,829      stalled-cycles-frontend:u #    6.30% frontend cycles idle
     2,301,671,001      stalled-cycles-backend:u  #   44.68% backend cycles idle
     8,130,868,329      instructions:u            #    1.58  insn per cycle
                                                  #    0.28  stalled cycles per insn
     1,356,618,512      branches:u                #  590.972 M/sec
         7,323,800      branch-misses:u           #    0.54% of all branches

       2.304509653 seconds time elapsed

       2.128090000 seconds user
       0.163909000 seconds sys
```
</details>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants