Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: performance regression on April 21st #64214

Closed
erikgrinaker opened this issue Apr 26, 2021 · 5 comments · Fixed by #64225
Closed

opt: performance regression on April 21st #64214

erikgrinaker opened this issue Apr 26, 2021 · 5 comments · Fixed by #64225
Assignees
Labels
A-sql-optimizer SQL logical planning and optimizations. C-performance Perf of queries or internals. Solution not expected to change functional behavior.

Comments

@erikgrinaker
Copy link
Contributor

roachperf graphs are showing a performance regression across the board on April 21st, e.g.:

Screenshot 2021-04-26 at 17 09 15

Verified this to be caused by #63807, benchmarks of 22e4d64 (old) vs df67aa9 (new):

name                           old ops/sec  new ops/sec  delta
kv95/enc=false/nodes=1/cpu=32   73.1k ± 3%   61.1k ± 3%  -16.30%  (p=0.000 n=9+8)

name                           old p50      new p50      delta
kv95/enc=false/nodes=1/cpu=32    0.70 ± 0%    0.70 ± 0%     ~     (all equal)

name                           old p95      new p95      delta
kv95/enc=false/nodes=1/cpu=32    1.80 ± 0%    2.91 ± 6%  +61.81%  (p=0.000 n=8+8)

name                           old p99      new p99      delta
kv95/enc=false/nodes=1/cpu=32    4.70 ± 0%    6.82 ± 4%  +45.21%  (p=0.000 n=8+8)

Notice how the new April 21st graph shows a clear improvement from the start, but then slows down after about a minute to a level significantly below the old April 19th baseline. Statistics collection would be a likely cause for the abrupt change.

Screenshot 2021-04-26 at 17 05 14

Screenshot 2021-04-26 at 17 05 24

/cc @cockroachdb/test-eng

@erikgrinaker erikgrinaker added C-performance Perf of queries or internals. Solution not expected to change functional behavior. A-sql-optimizer SQL logical planning and optimizations. labels Apr 26, 2021
@RaduBerinde
Copy link
Member

Wow.. thanks for tracking this down! I think we are missing special logic when the memo is determined to be "stale".

@RaduBerinde
Copy link
Member

@erikgrinaker for how long did you run the benchmarks? It must have been more than a minute if it slows down after that?

@tbg
Copy link
Member

tbg commented Apr 26, 2021

roachtest usually runs 10 minute intervals, which is also what you can see in the above graph

@erikgrinaker
Copy link
Contributor Author

erikgrinaker commented Apr 26, 2021

Just did roachtest run --count 10 '^kv95/enc=false/nodes=1/cpu=32$', they're 10-minute runs plus possibly some ramp-up time.

@RaduBerinde
Copy link
Member

Thanks. I understand the issue now. The problem is that we are using the statistics that are derived using placeholders, which are bogus. Once we have stats, our estimated row count is very large, and that affects the way the query is being executed.

planning time: 259µs
execution time: 4ms
distribution: local
vectorized: true
rows read from KV: 1 (0 B)
cumulative time spent in KV: 679µs
maximum memory usage: 90 KiB
network usage: 0 B (0 messages)

• scan
  columns: (k int, v bytes)
  cluster nodes: n1
  actual row count: 1
  vectorized batch count: 1
  KV rows read: 1
  KV bytes read: 0 B
  estimated row count: 20,653
  table: kv@primary
  spans: /-3812730450502136779/0

RaduBerinde added a commit to RaduBerinde/cockroach that referenced this issue Apr 26, 2021
This change is best explained by this comment:

```
// We are dealing with a memo that still contains placeholders. The statistics
// for such a memo can be wildly overestimated. Even if our plan is correct,
// the estimated row count for a scan is passed to the execution engine which
// uses it to make sizing decisions. Passing a very high count can affect
// performance significantly (see cockroachdb#64214). So we only use the fast path if the
// estimated row count is small; typically this will happen when we constrain
// columns that form a key and we know there will be at most one row.
```

Fixes cockroachdb#64214.

Release note (bug fix): fixed a performance regression for very simple
queries.
craig bot pushed a commit that referenced this issue Apr 27, 2021
64225: opt: make placeholder fast path conditional on the estimated row count r=RaduBerinde a=RaduBerinde

#### opt: show statistics in placeholder-fast-path tests

This highlights a problem with the fast path: we use statistics that
are derived using placeholders, which are usually terrible.

Release note: None

#### opt: mark columns as constant for equality with placeholder

This commit improves the FD generation for Select when we have filters
like `x = $1`. Because these filters have placeholders, they do not
generate constraints (which is the normal mechanism for detecting
constant columns).

This improves the cardinality property and row count estimate. The
estimate will be used for the placeholder fast path.

Release note: None

#### opt: make placeholder fast path conditional on the estimated row count

This change is best explained by this comment:

```
// We are dealing with a memo that still contains placeholders. The statistics
// for such a memo can be wildly overestimated. Even if our plan is correct,
// the estimated row count for a scan is passed to the execution engine which
// uses it to make sizing decisions. Passing a very high count can affect
// performance significantly (see #64214). So we only use the fast path if the
// estimated row count is small; typically this will happen when we constrain
// columns that form a key and we know there will be at most one row.
```

Fixes #64214.

Release note (bug fix): fixed a performance regression for very simple
queries.

64229: sql: add CREATE STATISTICS regression test r=mgartner a=mgartner

This is a "forward-port" of regression tests from #64226. This commit
contains no code changes because the bug was already been fixed on
master by #59687.

Release note: None

64235: roachtest: update version map and fixtures r=darinpp a=darinpp

This commit adds the recently released 20.1.15
to the version map in PredecessorVersion.

Release note: None (testing change)

Co-authored-by: Radu Berinde <[email protected]>
Co-authored-by: Marcus Gartner <[email protected]>
Co-authored-by: Darin Peshev <[email protected]>
@craig craig bot closed this as completed in 976118b Apr 27, 2021
@mgartner mgartner moved this to Done in SQL Queries Jul 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-optimizer SQL logical planning and optimizations. C-performance Perf of queries or internals. Solution not expected to change functional behavior.
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

3 participants