Skip to content

Commit

Permalink
book: document location insensitive rules
Browse files Browse the repository at this point in the history
  • Loading branch information
lqd committed Dec 27, 2020
1 parent 8265ae5 commit 6efc6be
Showing 1 changed file with 92 additions and 9 deletions.
101 changes: 92 additions & 9 deletions book/src/rules/loans.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,25 +2,25 @@

Loan analysis is the heart of the borrow checker, and will compute:
- illegal access errors: an action on a loan, that is illegal to perform
- illegal subset relationships errors: missing relationships between placeholder origins
- illegal subset relations errors: missing relationships between placeholder origins

This is done in multiple variants, whose goals are different: performance, readability, tests and validation.

Broadly speaking, the goals of the analysis is 1) to track loans:
- from the point and origin in which they are issued, to the points where they are invalidated
- flowing from origin to origin at a single point, via their `subset` relationships
- flowing from point to point in the CFG, according to the origins' liveness (stopping at points where a loan is killed)
- flowing from point to point in the CFG, according to the origins' liveness (stopping at points where a loan is killed)

And 2) to track undeclared relationships between placeholder origins.

Any live loan which is invalidated will be an illegal access error, any placeholder which flows into another placeholder unexpectedly will be an illegal subset relationship error.
Any live loan which is invalidated will be an illegal access error, any placeholder which flows into another placeholder unexpectedly will be an illegal subset relation error.

### Inputs

The input relations will be described below, but the [dedicated page](./relations.md) will have more information about them.

```prolog
// Indicates that the `loan` was "issued" at the given `point`, creating a
// Indicates that the `loan` was "issued" at the given `point`, creating a
// reference with the `origin`. Effectively, `origin` may refer to data from
// `loan` starting at `point` (this is usually the point *after* a borrow rvalue).
.decl loan_issued_at(Origin:origin, Loan:loan, Point:point)
Expand Down Expand Up @@ -92,7 +92,7 @@ subset(Origin1, Origin2, TargetPoint) :-

### The origins contain loans

The rules below compute what loans are contained in which origins, at given points of the CFG: starting from the "issuing point and origin", a loan is propagated via the subsets computed above, at a given point in the CFG. Liveness is then taken into account to propagate these loans along the CFG: if a loan is contained in an origin at a given point, and that the origin is live at the successor points, the loan is propagated to the successor points. A subtlety here is that there are points in the CFG where a loan can be killed, and that will stop propagation. Rule 6 uses both liveness and kill points to know whether the loan should be propagated further in the CFG.
The rules below compute what loans are contained in which origins, at given points of the CFG: starting from the "issuing point and origin", a loan is propagated via the subsets computed above, at a given point in the CFG. Liveness is then taken into account to propagate these loans along the CFG: if a loan is contained in an origin at a given point, and that the origin is live at the successor points, the loan is propagated to the successor points. A subtlety here is that there are points in the CFG where a loan can be killed, and that will stop propagation. Rule 6 uses both liveness and kill points to decide whether the loan should be propagated further in the CFG.

```prolog
.decl origin_contains_loan_on_entry(Origin:origin, Loan:loan, Point:point)
Expand Down Expand Up @@ -135,7 +135,7 @@ errors(Loan, Point) :-
loan_live_at(Loan, Point).
```

### Placeholder subsets, and illegal subset relationships errors
### Placeholder subsets, and illegal subset relations errors

These errors can be computed differently depending on the variant, but the goal is the same: if the analysis detects that a placeholder origin ultimately flows into another placeholder origin, that relationship needs to be declared or it is an error.

Expand All @@ -153,8 +153,91 @@ subset_errors(Origin1, Origin2, Point) :-
!known_placeholder_subset(Origin1, Origin2).
```

### Location Insensitive analysis

The rules above document the `Naive` variant of loan analysis, as it is conceptually simple and describes all the important parts computed by the Polonius model. This variant is "naive" in the sense that to stay clear and simple, the rules compute more things than strictly required. In particular, it computes the complete transitive subsets of all origins, as well as the loans contained by each origin at every point of the CFG.

In practice, different "grades" of borrow-checking can be useful: each with different levels of precision in what it accepts and with different computational complexity requirements. The lowest of such grades, the `LocationInsensitive` variant, trades off precision for speed by ignoring both the location where subsets happen, and the origins' contents at the CFG points. The idea is: if an analysis would find no error when ignoring path- and flow-sensitivity, then the full analysis would find no error either. If it does find potential errors, then the full analysis will find a subset of these location-insensitive errors.

This can be used as a quick pre-pass: if there a no errors, a full, expensive, analysis does not need to run, otherwise, only the loans where potential errors occur would need to be fully checked to remove false positives.

The inputs are the same as the `Naive` variant, but remove the CFG points from the `subset`s. Subsets are not tracked, and are used to approximate loan propagation inside origins (regardless of liveness and location-sensitivity) in `origin_contains_loan`:

```prolog
.decl subset(Origin1:origin, Origin2:origin)
// R1: the subsets are the non-transitive `subset_base` static input,
// with their location stripped.
subset(Origin1, Origin2) :-
subset_base(Origin1, Origin2, _).
.decl origin_contains_loan(Origin:origin, Loan:loan)
// R2: the issuing origins are the ones initially containing loans.
origin_contains_loan(Origin, Loan) :-
loan_issued_at(Origin, Loan, _).
// R3: the placeholder origins also contain their placeholder loan.
origin_contains_loan(Origin, Loan) :-
placeholder_loan(Origin, Loan).
// R4: propagate the loans from the origins to their subsets.
origin_contains_loan(Origin2, Loan) :-
origin_contains_loan(Origin1, Loan),
subset(Origin1, Origin2).
.decl loan_live_at(Loan:loan, Point:point)
// R5a: Approximate loan liveness. If an origin is live at a given
// point, and it contains a loan *anywhere* in the CFG, that loan is
// considered live at that point.
loan_live_at(Loan, Point) :-
origin_contains_loan(Origin, Loan),
(origin_live_on_entry(Origin, Point); placeholder_origin(Origin)).
.decl potential_errors(Loan:loan, Point:point)
// R5b: Compute potential illegal access errors, i.e. invalidations
// of live loans.
potential_errors(Loan, Point) :-
loan_invalidated_at(Loan, Point),
loan_live_at(Loan, Point).
```

Note: rules "5a" and "5b" above are named to match [the implementation](https://github.com/rust-lang/polonius/blob/master/polonius-engine/src/output/location_insensitive.rs) which computes `potential_errors` as a single "rule 5" without materializing the `loan_live_at` intermediate relation of "rule 5a".

Illegal subset relation errors (which are by definition about "subsets") can still be computed by propagating the placeholder loans, and detecting when they unexpectedly flow into another placeholder origin: one where this specific relationship between the two placeholders was not declared.

```prolog
.decl potential_subset_errors(Origin1:origin, Origin2:origin)
// R6: compute potential illegal subset relations errors, i.e. the
// placeholder loans which ultimately flowed into another placeholder
// origin unexpectedly.
potential_subset_errors(Origin1, Origin2) :-
placeholder(Origin1, Loan1),
placeholder(Origin2, _),
origin_contains_loan(Origin2, Loan1),
!placeholder_known_to_contain(Origin2, Loan1).
```

This requires a simple input equivalent to the transitive closure of `known_placeholder_subset`, tracking the placeholder loans a given placeholder origin is known to contain instead, and is computed like so:

```prolog
.decl placeholder_known_to_contain(Origin:origin, Loan:loan)
placeholder_known_to_contain(Origin, Loan) :-
placeholder(Origin, Loan).
placeholder_known_to_contain(Origin2, Loan1) :-
placeholder_known_to_contain(Origin1, Loan1),
known_placeholder_subset(Origin1, Origin2).
```

### To be continued

A more detailed description of the rules in the `LocationInsensitive` and `Opt` variants will be added later but they compute the same things described above:
- the `LocationInsensitive` variant tries to quickly find "potential errors" in a path- and flow-insensitive manner: if there are no errors, we don't need to do the full analysis. If there are potential errors, they could be false positives or real errors: the interesting thing is that only these loans and origins need to be checked by the full analysis.
- the `Opt` variant limits where the subset transitive closure is computed: some origins are short-lived, or part of a subsection of the subset graph into which no loan ever flows, etc. These cases don't contribute to errors or loan propagation and are not useful to track.
In the current implementation, this quick `LocationInsensitive` filter is used as a pre-pass to another optimized variant, as part of [the `Hybrid` algorithm](https://github.com/rust-lang/polonius/blob/2cf8336f7ff9932270160a392ca5be3c804b7f41/polonius-engine/src/output/mod.rs#L42).

A more detailed description of the rules in this `Opt` variant will be added later but it computes the same data as the `Naive` variant described above, more efficiently, by limiting where the subset transitive closure is computed: some origins are short-lived, or part of a subsection of the subset graph into which no loan ever flows, and therefore don't contribute to errors or loan propagation. There's no need to track these specific cases.

In the meantime, [the implementation](https://github.com/rust-lang/polonius/blob/master/polonius-engine/src/output/datafrog_opt.rs) documents the relations and rules it uses in its computation.

0 comments on commit 6efc6be

Please sign in to comment.