diff --git a/language/tools/move-package/README.md b/language/tools/move-package/README.md
index 58ac5334e3..fdffc87aa1 100644
--- a/language/tools/move-package/README.md
+++ b/language/tools/move-package/README.md
@@ -34,28 +34,102 @@ packages. It is also responsible for ensuring that all named addresses have
 a value assigned to them, and that there are no conflicting assignments. The
 package graph is rooted at the package being built and is a DAG.
 
-When building the package graph we do the following conceptual operations:
-verify that dependencies exist at the declared locations, that their
-package names and source digests match (if applicable), clone git
-dependencies if they don't already exist locally, build a dependency graph
-of Move packages and ensure this forms a DAG, compute an
-assignment for each named address in each Move package in the package
-graph, and ensure that the resulting named address assignment is valid.
-
-All of the above steps are fairly straightforward, with the possible
-exception of named addresses: each package will have a set of in-scope
-named addresses. The set of in-scope named addresses for a package `P` is
-defined as the transitive closure of all named addresses in the
-dependencies of `P`. Additionally, a package can rename named addresses
-that are in-scope as long as the final assignment of a value to the set of
-named addresses can be unified. To ensure that named addresses are
-unifiable across renamings, resolution performs unification across named
-addresses using a `Rc<RefCell<Option<AccountAddress>>>`: when a named address first
-enters scope in the package graph a `Rc<RefCell<None>>` is created for it.
-This refcell is then shared to all uses of the named address _even across
-renamings_ and when a value is assigned to it, the value must (1) match the
-current value contained within the `Option`, or (2) the `Option` is `None`,
-and the value is placed into the refcell.
+Discovering the full set of transitive dependencies (including
+dev-dependencies), regardless of the current build configuration
+results in a `DependencyGraph` which can optionally be serialized into
+(and deserialized out of) a `LockFile`. If a `DependencyGraph` can be
+created, it is guaranteed to:
+
+- Be acyclic.
+
+- Contain no conflicting dependencies, where the same package name
+  is required to come from two distinct sources.
+
+- Have well-nested relative local dependencies, where for all
+  dependency chains `R -> L0 -> L1 -> ... -> Ln` with `R` being
+  remote, `Li` being local with a relative path, and `X -> Y` meaning
+  `X` depends on `Y`, the paths of all `Li`s are sub-directories of
+  the repository containing `R` (but not necessarily sub-directories
+  of each other).
+
+The logic for exploring transitive dependencies is found in
+[`./src/resolution/dependency_graph.rs`](./src/resolution/dependency_graph.rs),
+and the logic for lock files (creation, commit, schema) is found in
+the [`./src/resolution/lock_file`](./src/resolution/lock_file)
+directory.
+
+A `DependencyGraph` is further processed into a `ResolvedGraph` which
+is specific to the current build configuration (e.g. only includes
+dev-dependencies and dev-address assignments if dev-mode is enabled),
+and further includes the following for each package:
+
+- Its source digest, which is a hash of its manifest and source files.
+
+- Its "renaming" table which includes in-scope addresses that
+  originate from dependencies but have been renamed.
+
+- Its "resolution table", which is a total mapping from its in-scope
+  named addresses to numerical addresses.
+
+If a `ResolvedGraph` can be created, it guarantees that:
+
+- All packages exist at their sources, and all of them are available
+  locally (fetched from remote sources such as git if necessary).
+
+- Package source digests match the source digests (if supplied) for
+  the dependencies they satisfy.
+
+- All packages have valid renaming tables, where all their bindings
+  refer to valid addresses in their dependencies and introduce
+  bindings that do not overlap with other renamings for the same
+  package.
+
+- A complete named address assignment exists, wherein every named
+  address in scope for every package is bound to some numerical
+  address.
+
+- A consistent named address assignment exists, wherein if address `A`
+  in package `P` is equivalent to address `B` in package `Q`, then `A`
+  and `B` are assigned the same numerical address. Informally, two
+  named addresses (across packages) are equivalent if they are related
+  by scope or renamings. Formally, for packages `P`, `Q` and named
+  addresses `x`, `y`, this equivalence is the transitive reflexive
+  closure of the relation that corresponds `x` in `P` to `y` in `Q`
+  when,
+  - `P` depends on `Q`,
+  - `y` is in scope in `Q`,
+  - `P`'s renaming binds `x` to `(Q, y)`, or
+  - `x` is not in `P`'s renaming, and `x = y`.
+
+Named address assignment is implemented by unification, so if:
+
+- `P` depends on `Q`, renaming its `QA` to `PA`, and assigns `0x42` to `PA`.
+- `Q` depends on `R`, renaming its `RA` to `QA`,
+- `R` introduces unbound address `RA`.
+
+This results in a complete and consistent named address assignment where,
+
+- `PA = 0x42` in `P`'s resolution table,
+- `QA = 0x42` in `Q`'s resolution table,
+- `RA = 0x42` in `R`'s resolution table,
+
+even though only one concrete name was assigned. Similarly, if:
+
+- `P` depends on `Q`, and assigns `0x42` to `QA`,
+- `P` depends on `R`, and assigns `0x43` to `RA`,
+- `Q` depends on `S` renaming its `SA` to `QA`,
+- `R` depends on `S` renaming its `SA` to `RA`,
+- `S` introduces unbound address `SA`.
+
+This results in an inconsistent named address assignment because it
+requires `SA` to be bound to both `0x42` and `0x43`.
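The two scenarios above can be reproduced with a small union-find sketch. This is not the code in `resolving_table.rs`: `Unifier`, `unify`, `assign`, and `resolve` are hypothetical names chosen for illustration, and numerical addresses are simplified to `u64`. Each `(package, name)` pair gets a slot, renamings and shared scope merge slots into one equivalence class, and an assignment fails if the class already holds a different value.

```rust
use std::collections::HashMap;

/// Hypothetical key: a named address as seen from one package, `(package, name)`.
type Key = (&'static str, &'static str);

/// Illustrative unifier; not the crate's actual resolving table.
#[derive(Default)]
pub struct Unifier {
    parent: Vec<usize>,        // union-find parent pointers
    value: Vec<Option<u64>>,   // value held by each class root, if any
    index: HashMap<Key, usize>,
}

impl Unifier {
    /// Return the slot for `key`, creating an unbound one on first use.
    fn slot(&mut self, key: Key) -> usize {
        if let Some(&i) = self.index.get(&key) {
            return i;
        }
        let i = self.parent.len();
        self.parent.push(i);
        self.value.push(None);
        self.index.insert(key, i);
        i
    }

    /// Find the class root, halving paths along the way.
    fn find(&mut self, mut i: usize) -> usize {
        while self.parent[i] != i {
            let grand = self.parent[self.parent[i]];
            self.parent[i] = grand;
            i = grand;
        }
        i
    }

    /// Record that two named addresses are equivalent (scope or renaming).
    pub fn unify(&mut self, a: Key, b: Key) -> Result<(), String> {
        let sa = self.slot(a);
        let sb = self.slot(b);
        let ra = self.find(sa);
        let rb = self.find(sb);
        if ra == rb {
            return Ok(());
        }
        let merged = match (self.value[ra], self.value[rb]) {
            (Some(x), Some(y)) if x != y => {
                return Err(format!("conflicting assignment: {:#x} vs {:#x}", x, y));
            }
            (x, y) => x.or(y),
        };
        self.parent[ra] = rb;
        self.value[rb] = merged;
        Ok(())
    }

    /// Bind a numerical address to the class containing `key`.
    pub fn assign(&mut self, key: Key, addr: u64) -> Result<(), String> {
        let s = self.slot(key);
        let r = self.find(s);
        match self.value[r] {
            Some(existing) if existing != addr => Err(format!(
                "conflicting assignment: {:#x} vs {:#x}",
                existing, addr
            )),
            _ => {
                self.value[r] = Some(addr);
                Ok(())
            }
        }
    }

    /// Look up the numerical address bound to `key`, if any.
    pub fn resolve(&mut self, key: Key) -> Option<u64> {
        let s = self.slot(key);
        let r = self.find(s);
        self.value[r]
    }
}
```

Under these assumptions, unifying `("P", "PA")` with `("Q", "QA")` and `("Q", "QA")` with `("R", "RA")`, then assigning `0x42` to `("P", "PA")`, makes `resolve(("R", "RA"))` return `Some(0x42)`. In the diamond example, once all four renamings are unified, assigning `0x42` to `("P", "QA")` succeeds and the later assignment of `0x43` to `("P", "RA")` returns an `Err`.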
+
+[`./src/resolution/resolution_graph.rs`](./src/resolution/resolution_graph.rs)
+defines `ResolvedGraph` and creating one from a `DependencyGraph`,
+with support for named address resolution (unification) in
+[`./src/resolution/resolving_table.rs`](./src/resolution/resolving_table.rs)
+and for calculating source digests in
+[`./src/resolution/digest.rs`](./src/resolution/digest.rs).
 
 ## Compilation