Indexing Engine CP model: Update rules/goals, add JSON examples
lfittl committed Dec 14, 2023
1 parent 25df806 commit 064e813
Showing 3 changed files with 55 additions and 13 deletions.
30 changes: 29 additions & 1 deletion indexing-engine/cp-model/goals.mdx
@@ -4,6 +4,8 @@ backlink_href: /docs/indexing-engine/constraint-programming-model
backlink_title: 'Constraint Programming Model'
---

import CopyCodeBlock from '../../components/CopyCodeBlock'

Goals are the main components that guide the model toward the solution that best meets the requirements of the user. These requirements can be defined using any combination of goals.

It's important to note that the order of the goals always matters. Consider the case of minimizing the costs (Minimize Total Cost) and minimizing the number of indexes (Minimize Number of Indexes).
@@ -14,35 +16,61 @@ Further, some goals should not be selected as the first goals to optimize.

For example, suppose that the first goal is [Minimize Number of Indexes](#minimize-number-of-indexes). The model is essentially told that "use the fewest indexes" is the most important goal. Obviously, 0 indexes is the fewest indexes it can use. Subsequent goals will not be able to make use of any indexes because of this.

Goals that should be avoided as the first goal are [Minimize Number of Indexes](#minimize-number-of-indexes), [Minimize Index Write Overhead](#minimize-index-write-overhead), and [Minimize Update Overhead](#minimize-update-overhead), unless there is a [rule](rules) that is focused on the scan cost, like [Minimum Per-Scan Cost (Normal)](rules#minimum-per-scan-cost-normal).
Goals that should be avoided as the first goal are [Minimize Number of Indexes](#minimize-number-of-indexes), [Minimize Index Write Overhead](#minimize-index-write-overhead), and [Minimize Update Overhead](#minimize-update-overhead), unless there is a [rule](rules) that is focused on the scan cost, like [Maximum Per-Scan Cost Tolerance](rules#maximum-per-scan-cost-tolerance).
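
The effect of goal order can be illustrated with a toy sketch. It is not the model's actual solver: it assumes a strict priority order with no tolerance, and the candidate selections, their names, and their numbers are hypothetical.

```python
# Hypothetical candidate selections: (label, number of indexes, total scan cost).
candidates = [
    ("no indexes", 0, 900.0),
    ("one index", 1, 400.0),
    ("three indexes", 3, 250.0),
]

# Position of each goal's metric inside a candidate tuple.
METRIC = {"Minimize Number of Indexes": 1, "Minimize Total Cost": 2}


def pick(candidates, goal_order):
    """Return the best candidate under a strict goal priority order (no tolerance)."""
    return min(candidates, key=lambda c: tuple(c[METRIC[g]] for g in goal_order))


indexes_first = pick(candidates, ["Minimize Number of Indexes", "Minimize Total Cost"])
cost_first = pick(candidates, ["Minimize Total Cost", "Minimize Number of Indexes"])

print(indexes_first[0])  # no indexes
print(cost_first[0])     # three indexes
```

With the index-count goal first, the empty selection wins and the cost goal has nothing left to decide, which is the situation described above.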


### Minimize Index Write Overhead

The *Minimize Index Write Overhead* goal strives to minimize the index write overhead associated with the selected indexes.

<CopyCodeBlock content={`{ "Name": "Minimize Index Write Overhead", "Tolerance": 0.0 }`} language="json" />

### Minimize Number of Indexes

The *Minimize Number of Indexes* goal strives to minimize the number of existing and possible indexes selected.

<CopyCodeBlock content={`{ "Name": "Minimize Number of Indexes", "Tolerance": 0.0 }`} language="json" />

### Minimize Total Cost

The *Minimize Total Cost* goal strives to minimize the combined costs of the scans.

<CopyCodeBlock content={`{ "Name": "Minimize Total Cost", "Tolerance": 0.1 }`} language="json" />

### Minimize Maximum Cost

The *Minimize Maximum Cost* goal strives to minimize the largest cost found among the scans.

<CopyCodeBlock content={`{ "Name": "Minimize Maximum Cost", "Tolerance": 1.0 }`} language="json" />

### Minimize Maximum Relative Cost

The *Minimize Maximum Relative Cost* goal strives to minimize the largest relative cost found among the scans. The *relative cost* of a scan is equal to its actual value divided by the best possible value it could get in theory.

<CopyCodeBlock content={`{ "Name": "Minimize Maximum Relative Cost", "Tolerance": 10.0 }`} language="json" />
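
As a rough illustration of the relative cost definition, the sketch below computes the value this goal tries to minimize. The function name, scan names, and cost numbers are assumptions made for the example only.

```python
def relative_cost(actual_cost: float, best_cost: float) -> float:
    """Relative cost = actual cost divided by the best possible cost."""
    if best_cost <= 0:
        raise ValueError("best_cost must be positive")
    return actual_cost / best_cost


# Hypothetical scans: name -> (actual cost in the solution, best possible cost).
scans = {
    "scan_a": (30.0, 20.0),
    "scan_b": (12.0, 10.0),
}

max_relative_cost = max(relative_cost(a, b) for a, b in scans.values())
print(max_relative_cost)  # 1.5
```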

### Minimize Total Impact

The *Minimize Total Impact* goal strives to minimize the combined impacts of the scans.

The impact of a scan is a measure of its influence on performance, and is equal to its cost multiplied by the frequency with which it appears in queries. Common scans with high costs will tend to have a higher impact on performance than uncommon scans with lower costs.

<CopyCodeBlock content={`{ "Name": "Minimize Total Impact", "Tolerance": 0.1 }`} language="json" />
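
The impact formula above can be sketched as follows. The scan names, costs, and frequencies are hypothetical, and the helper is an illustration rather than part of the model.

```python
def impact(cost: float, frequency: int) -> float:
    """Impact = scan cost multiplied by how often the scan appears in queries."""
    return cost * frequency


# Hypothetical scans: name -> (cost, frequency).
scans = {
    "common_expensive": (50.0, 200),
    "rare_cheaper": (40.0, 3),
}

impacts = {name: impact(cost, freq) for name, (cost, freq) in scans.items()}
total_impact = sum(impacts.values())  # quantity targeted by Minimize Total Impact
max_impact = max(impacts.values())    # quantity targeted by Minimize Maximum Impact

print(total_impact)  # 10120.0
print(max_impact)    # 10000.0
```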

### Minimize Maximum Impact

The *Minimize Maximum Impact* goal strives to minimize the largest impact found among the scans.

<CopyCodeBlock content={`{ "Name": "Minimize Maximum Impact", "Tolerance": 1.0 }`} language="json" />

### Minimize Maximum Relative Impact

The *Minimize Maximum Relative Impact* goal strives to minimize the largest relative impact found among the scans. The *relative impact* of a scan is equal to its actual value divided by the best possible value it could get in theory.

<CopyCodeBlock content={`{ "Name": "Minimize Maximum Relative Impact", "Tolerance": 10.0 }`} language="json" />

### Minimize Update Overhead

The *Minimize Update Overhead* goal strives to minimize the update overhead of the selected indexes.

<CopyCodeBlock content={`{ "Name": "Minimize Update Overhead", "Tolerance": 0 }`} language="json" />
22 changes: 17 additions & 5 deletions indexing-engine/cp-model/rules.mdx
@@ -4,25 +4,31 @@ backlink_href: /docs/indexing-engine/constraint-programming-model
backlink_title: 'Constraint Programming Model'
---

import CopyCodeBlock from '../../components/CopyCodeBlock'

Rules define limits on certain characteristics of a selection of indexes, and are applied before optimizing for the specified [goals](goals). The order of rules does not matter.

Note that applying rules that are too restrictive may cause the model to return an error if a rule cannot be fulfilled.


### Minimum Per-Scan Cost (Normal)
### Maximum Per-Scan Cost Tolerance

**Default value: Unlimited**

The *Minimum Per-Scan Cost (Normal)* rule ensures that the cost of each *normal* (i.e., *non-priority*) scan is not worse than a given threshold w.r.t. their best possible cost. A threshold of 1 is the minimum and ensures that each normal scan assumes their best possible cost.
The *Maximum Per-Scan Cost Tolerance* rule ensures that the cost of each scan is not worse than its best possible cost (taking into account a tolerance parameter).

If the tolerance were set to 0.5 (i.e., 50%), the cost of each scan would be assured to be no worse than 150% of its best possible cost. Suppose that the best cost of a certain scan is 20. With the tolerance set to 0.5, the cost of that scan in the solution could not be worse than 150% of 20, which is 30.

If this threshold were set to 1.5 (i.e., 150%), the cost of each normal scan would be assured to be no worse than 150% of their best possible cost. Suppose that the best cost of a certain normal scan is 20. With the threshold set to 1.5, the cost of that scan in the solution could not be worse than 150% of 20, which is 30.
<CopyCodeBlock content={`"Maximum Per-Scan Cost Tolerance": 10.0`} language="json" />
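
The arithmetic of the tolerance example (best cost 20, tolerance 0.5) can be written out as a small sketch. The function names are hypothetical, and the bound `best_cost * (1 + tolerance)` is inferred from that worked example rather than taken from the model's source.

```python
def max_allowed_cost(best_cost: float, tolerance: float) -> float:
    """Upper bound on a scan's cost under a per-scan cost tolerance."""
    if tolerance < 0:
        raise ValueError("tolerance must be >= 0")
    return best_cost * (1.0 + tolerance)


def satisfies_rule(actual_cost: float, best_cost: float, tolerance: float) -> bool:
    """True if the scan's cost stays within the tolerated bound."""
    return actual_cost <= max_allowed_cost(best_cost, tolerance)


print(max_allowed_cost(20.0, 0.5))      # 30.0
print(satisfies_rule(30.0, 20.0, 0.5))  # True
print(satisfies_rule(31.0, 20.0, 0.5))  # False
```

Under this reading, a tolerance of 0 would require every scan to reach its best possible cost.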


### Minimum Per-Scan Cost (Priority)
### Maximum Per-Scan Impact Tolerance

**Default value: Unlimited**

The *Minimum Per-Scan Cost (Priority)* rule ensures that the cost of each priority scan is not worse than a given threshold w.r.t. their best possible cost. A threshold of 1 is the minimum and ensures that each priority scan assumes their best possible cost.
The *Maximum Per-Scan Impact Tolerance* rule ensures that the impact of each scan is not worse than its best possible impact (taking into account a tolerance parameter). See [Maximum Per-Scan Cost Tolerance](#maximum-per-scan-cost-tolerance) for an example.

<CopyCodeBlock content={`"Maximum Per-Scan Impact Tolerance": 10.0`} language="json" />


### Maximum Number of Indexes
@@ -31,13 +37,17 @@ The *Minimum Per-Scan Cost (Priority)* rule ensures that the cost of each priori

The *Maximum Number of Indexes* rule specifies a maximum number of indexes that can be selected by the model. This rule can be used in conjunction with the [Minimize Number of Indexes](goals#minimize-number-of-indexes) goal.

<CopyCodeBlock content={`"Maximum Number of Indexes": 16`} language="json" />


### Maximum Index Write Overhead

**Default value: Unlimited**

The *Maximum Index Write Overhead* rule specifies a maximum value for the total index write overhead of the indexes suggested by the model. This rule can be used in conjunction with the [Minimize Index Write Overhead](goals#minimize-index-write-overhead) goal.

<CopyCodeBlock content={`"Maximum Index Write Overhead": 1.0`} language="json" />


### Minimum Coverage

@@ -46,3 +56,5 @@ The *Maximum Index Write Overhead* rule specifies a maximum value for the total
The *Minimum Coverage* rule ensures that a portion of the coverable scans (at least as large as the value associated with this rule) are covered by the selected indexes. A *coverable scan* is a scan for which at least one index can provide coverage.

A scan is considered to be covered only if at least one of the selected indexes provides that scan with a cost improvement over its sequential read cost.

<CopyCodeBlock content={`"Minimum Coverage": 0.95`} language="json" />
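
One way to read the coverage definition is sketched below. The data layout (`seq_cost`, `index_costs`), the index names, and the choice to report full coverage when no scan is coverable are all assumptions made for illustration, not the model's actual behavior.

```python
def coverage(scans, selected_indexes):
    """Fraction of coverable scans that the selected indexes actually cover."""

    def improves(scan, index_names):
        # A scan is covered if some index in index_names beats its sequential read cost.
        return any(
            cost < scan["seq_cost"]
            for name, cost in scan["index_costs"].items()
            if name in index_names
        )

    all_indexes = {name for scan in scans for name in scan["index_costs"]}
    coverable = [s for s in scans if improves(s, all_indexes)]
    if not coverable:
        return 1.0  # assumption: with nothing coverable, treat coverage as complete
    covered = [s for s in coverable if improves(s, selected_indexes)]
    return len(covered) / len(coverable)


# Hypothetical scans with per-index costs; the last one has no usable index.
scans = [
    {"seq_cost": 100.0, "index_costs": {"idx_a": 10.0, "idx_b": 40.0}},
    {"seq_cost": 50.0, "index_costs": {"idx_b": 5.0}},
    {"seq_cost": 70.0, "index_costs": {}},
]

partial = coverage(scans, {"idx_a"})
full = coverage(scans, {"idx_a", "idx_b"})
print(partial)  # 0.5
print(full)     # 1.0
```

In this toy data, a `Minimum Coverage` of 0.95 would rule out selecting only `idx_a`, since just half of the coverable scans would be covered.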
16 changes: 9 additions & 7 deletions indexing-engine/cp-model/settings.mdx
@@ -55,19 +55,21 @@ Each goal is defined by its name and tolerance (0.0 to ∞). The available goals
* [Minimize Number of Indexes](goals#minimize-number-of-indexes)
* [Minimize Total Cost](goals#minimize-total-cost)
* [Minimize Maximum Cost](goals#minimize-maximum-cost)
* [Minimize Maximum Relative Cost](goals#minimize-maximum-relative-cost)
* [Minimize Total Impact](goals#minimize-total-impact)
* [Minimize Maximum Impact](goals#minimize-maximum-impact)
* [Minimize Maximum Relative Impact](goals#minimize-maximum-relative-impact)
* [Minimize Update Overhead](goals#minimize-update-overhead)

The rules are defined by their name and an associated value. If a rule is not defined in the settings, it will be automatically created and it will be assigned its default value:

| Rule Name | Type | Min | Max | Default | Description |
|--------------------------------------------------------------------------|-----------|-----|-----|---------|-----------------------------------------------|
| [Minimum Per-Scan Cost (Normal)](rules#minimum-per-scan-cost-normal) | `float` | 1.0 ||| Normal scan cost threshold w.r.t. best cost |
| [Minimum Per-Scan Cost (Priority)](rules#minimum-per-scan-cost-priority) | `float` | 1.0 ||| Priority scan cost threshold w.r.t. best cost |
| [Maximum Number of Indexes](rules#maximum-number-of-indexes) | `integer` | 0 ||| Maximum number of indexes suggested |
| [Maximum Index Write Overhead](rules#maximum-index-write-overhead) | `float` | 0.0 ||| Maximum index write overhead allowed |
| [Minimum Coverage](rules#minimum-coverage) | `float` | 0.0 | 1.0 | 0.0 | Portion of coverable scans covered |
| Rule Name | Type | Min | Max | Default | Description |
|------------------------------------------------------------------------------|-----------|-----|-----|---------|------------------------------------------|
| [Maximum Per-Scan Cost Tolerance](rules#maximum-per-scan-cost-tolerance) | `float` | 0.0 ||| Scan cost tolerance w.r.t. best cost |
| [Maximum Per-Scan Impact Tolerance](rules#maximum-per-scan-impact-tolerance) | `float` | 0.0 ||| Scan impact tolerance w.r.t. best impact |
| [Maximum Number of Indexes](rules#maximum-number-of-indexes) | `integer` | 0 ||| Maximum number of indexes suggested |
| [Maximum Index Write Overhead](rules#maximum-index-write-overhead) | `float` | 0.0 ||| Maximum index write overhead allowed |
| [Minimum Coverage](rules#minimum-coverage) | `float` | 0.0 | 1.0 | 0.0 | Portion of coverable scans covered |


### Tolerance