-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
opt: unbounded max cardinality penalty makes number of index columns dominate scan cost #68556
Labels
A-sql-optimizer
SQL logical planning and optimizations.
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
T-sql-queries
SQL Queries Team
Comments
I think the solution might be something like normalizing the large cardinality row cost penalty by the diff --git a/pkg/sql/opt/xform/coster.go b/pkg/sql/opt/xform/coster.go
index 084d02337e..3ae0340e89 100644
--- a/pkg/sql/opt/xform/coster.go
+++ b/pkg/sql/opt/xform/coster.go
@@ -656,7 +656,8 @@ func (c *coster) computeScanCost(scan *memo.ScanExpr, required *physical.Require
// Add a penalty if the cardinality exceeds the row count estimate. Adding a
// few rows worth of cost helps prevent surprising plans for very small tables
// or for when stats are stale.
- rowCount += c.largeCardinalityRowCountPenalty(scan.Relational().Cardinality, rowCount)
+ rowCount += c.largeCardinalityRowCountPenalty(scan.Relational().Cardinality, rowCount) /
+ float64(perRowCost)
if required.LimitHint != 0 {
rowCount = math.Min(rowCount, required.LimitHint) This prevents the number of columns in the index from dominating the cost. However when I tried this there were quite a few plans in the tests that changed, so there's likely more to this. |
rytaft
added a commit
to rytaft/cockroach
that referenced
this issue
Aug 10, 2021
This commit tweaks the application of the unbounded cardinality penalty in the coster to add it directly to the cost of scans and zigzag joins rather than to the row count. This helps prevent an issue in which the number of index columns could dominate the scan cost and result in suboptimal plans. Fixes cockroachdb#68556 Release note (bug fix): Fixed a regression in the optimizer's cost model that could cause it to choose suboptimal plans when choosing between two non-unique index scans with different numbers of columns per index.
rytaft
added a commit
to rytaft/cockroach
that referenced
this issue
Aug 10, 2021
This commit tweaks the application of the unbounded cardinality penalty in the coster to add it directly to the cost of scans and zigzag joins rather than to the row count. This helps prevent an issue in which the number of index columns could dominate the scan cost and result in suboptimal plans. Fixes cockroachdb#68556 Release note (bug fix): Fixed a regression in the optimizer's cost model that could cause it to choose suboptimal plans when choosing between two non-unique index scans with different numbers of columns per index.
rytaft
added a commit
to rytaft/cockroach
that referenced
this issue
Aug 11, 2021
This commit tweaks the application of the unbounded cardinality penalty in the coster to add it directly to the cost of scans and zigzag joins rather than to the row count. This helps prevent an issue in which the number of index columns could dominate the scan cost and result in suboptimal plans. Fixes cockroachdb#68556 Release note (bug fix): Fixed a regression in the optimizer's cost model that could cause it to choose suboptimal plans when choosing between two non-unique index scans with different numbers of columns per index.
craig bot
pushed a commit
that referenced
this issue
Aug 11, 2021
68676: opt: adjust cost of scan with unbounded cardinality to avoid bad plans r=rytaft a=rytaft This commit tweaks the application of the unbounded cardinality penalty in the coster to add it directly to the cost of scans and zigzag joins rather than to the row count. This helps prevent an issue in which the number of index columns could dominate the scan cost and result in suboptimal plans. Fixes #68556 Release note (bug fix): Fixed a regression in the optimizer's cost model that could cause it to choose suboptimal plans when choosing between two non-unique index scans with different numbers of columns per index. Co-authored-by: Rebecca Taft <[email protected]>
craig bot
pushed a commit
that referenced
this issue
Aug 11, 2021
68447: ui: fix awkward display of statements in transaction detail page r=Azhng a=Azhng Previously, we aggregate the statement texts in transaction detail page. However, this can be misleading in the case when a statement is executed multiple times in a transaction. Also, different transactions with same statements but different execution count and execution order are also incorrectly grouped into a single entry. This commit removes the aggregation when rendering the SQL Statement Box in the transaction detail page and correctly groups the different transactions into different entries in the Transaction Page. This commit also removes the statement statistics table in the transaction detail page since this can be misleading. Release note (ui change): Transaction Detail Page's SQL Box now shows all of transaction's statements in the order that they were executed. Transaction Detail Page also no longer display statement statistics. 68676: opt: adjust cost of scan with unbounded cardinality to avoid bad plans r=rytaft a=rytaft This commit tweaks the application of the unbounded cardinality penalty in the coster to add it directly to the cost of scans and zigzag joins rather than to the row count. This helps prevent an issue in which the number of index columns could dominate the scan cost and result in suboptimal plans. Fixes #68556 Release note (bug fix): Fixed a regression in the optimizer's cost model that could cause it to choose suboptimal plans when choosing between two non-unique index scans with different numbers of columns per index. Co-authored-by: Azhng <[email protected]> Co-authored-by: Rebecca Taft <[email protected]>
rytaft
added a commit
to rytaft/cockroach
that referenced
this issue
Aug 16, 2021
This commit tweaks the application of the unbounded cardinality penalty in the coster to add it directly to the cost of scans and zigzag joins rather than to the row count. This helps prevent an issue in which the number of index columns could dominate the scan cost and result in suboptimal plans. Fixes cockroachdb#68556 Release note (bug fix): Fixed a regression in the optimizer's cost model that could cause it to choose suboptimal plans when choosing between two non-unique index scans with different numbers of columns per index.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
A-sql-optimizer
SQL logical planning and optimizations.
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
T-sql-queries
SQL Queries Team
In #66979 we introduced an
unboundedMaxCardinalityScanRowCountPenalty
to constrained scans that have an unbounded cardinality. This increased the likelihood that, with stale table statistics, the optimizer would choose a plan with a scan on a primary index or unique secondary index with a bounded number of potential rows scanned over a scan on a non-unique secondary index with an unbounded number of potential rows scanned.This change had a negative side-effect. With stale table statistics that make a query look extremely selective (e.g., filtering for a value greater than the maximum histogram upper-bound), the number of columns in an index dominates the cost of a scan. This causes the optimizer to favor a scan over a few columns over a scan with more columns, even if the former requires an addition filter and/or index join after the scan.
The test below shows that the index with fewer columns is chosen even though an extra select is needed. See the
optstepsweb
output too.The text was updated successfully, but these errors were encountered: