[compiler-v2] Pattern matching for struct variants (aka enum types) #13725
x: u64
}

enum Outer {
Could we add tests where enum types are attached with abilities?
Also some with generic types.
Both done in `matching_ok`, specifically the case where we match over instantiated generics themselves. Note that there are already negative ability-related tests in `matching_ability_err`, but I also added some positive ones.
/// Consider `let s = S{x: C{y}}; match (s) { S{C{y}}} if p(&y) => y, t => t }`. In order
/// to check whether the match is true, the value in `s` must not be moved until the match
/// is decided, while still being able to look at sub-fields and evaluate predicates.
/// Therefore, the match need to evaluated first using a reference to `s`. We call this
nit: "need to evaluated" => "needs to be evaluated"
Done
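The probing idea described in the doc comment above can be illustrated in plain Rust. This is only a sketch under assumptions: the types `S` and `C`, the predicate `p`, and the `Outcome` enum are hypothetical stand-ins, and the code models the idea (decide the arm through a reference first, move the value only afterwards), not the compiler's actual translation.

```rust
/// Hypothetical non-copyable payload, standing in for `y` in the example.
#[derive(Debug, PartialEq)]
struct C {
    y: Vec<u8>,
}

/// Hypothetical outer struct, standing in for `S{x: C{y}}`.
#[derive(Debug, PartialEq)]
struct S {
    x: C,
}

/// Result of the match: either the extracted `y` or the untouched value.
#[derive(Debug, PartialEq)]
enum Outcome {
    Inner(Vec<u8>),
    Whole(S),
}

/// Hypothetical guard predicate; it only needs a reference.
fn p(y: &[u8]) -> bool {
    !y.is_empty()
}

fn match_s(s: S) -> Outcome {
    // Probing phase: inspect sub-fields and evaluate the guard through a
    // reference, so `s` is not moved before the arm is decided.
    let guard_holds = p(&s.x.y);
    if guard_holds {
        // The arm is decided, so it is now safe to consume `s`.
        let S { x: C { y } } = s;
        Outcome::Inner(y)
    } else {
        // Fallback arm `t => t`: `s` is still intact and can be moved whole.
        Outcome::Whole(s)
    }
}
```

If the guard fails, the value reaches the fallback arm unchanged, which is the property the probing phase is meant to preserve.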
// the expression is not a tuple, a singleton vector will be returned. This
// reflects the current special semantics of tuples in Move which cannot be
// nested.
fn gen_tuple(&mut self, exp: &Exp, with_force_temp: bool) -> Vec<TempIndex> {
- fn gen_tuple(&mut self, exp: &Exp, with_force_temp: bool) -> Vec<TempIndex> {
+ fn gen_tuple(&mut self, exp: &Exp, with_forced_temp: bool) -> Vec<TempIndex> {
Just to keep it consistent with other similar uses in this file.
Done
Looking good. Given that my comments are minor, I am approving the PR assuming the comments will be considered.
exit_path: Label,
/// Set if this is a probing match, and if so, the set of vars which need to
/// be bound for the match condition. In probing mode, we only need to
/// bind certain variables. For example, in `S{x, y} if p(y)` we do not need
Would `y` be a probing var in both these cases: `p(y)` and `p(&y)`? I ask because the example at the top is about `p(&y)` and the one here is `p(y)`.
It depends: if `y` is not copyable, it is better to use `&y`, but the logic described here is independent of this.
enum ValueShape {
Any,
Tuple(Vec<ValueShape>),
Struct(QualifiedId<StructId>, Option<Symbol>, Vec<ValueShape>),
Would be helpful to add some notes about this variant's fields.
Done
/// which show the counterexamples of missed patterns, but only up to the shape as the user
/// has inspected values via patterns. For example, if the user has written `R{S{_}, _}`
/// the counter example should not contain irrelevant other parts of the value (as e.g.
/// in `R{T{x:_}, S}`). Moreover, missing patterns which can never be matched should
"missing patterns which can never be matched" is a bit confusing. If there is a missing pattern (there exists at least one value being matched that does not fit any match arm), it is an error, but there is no dead code. If, however, one pattern represents a superset of another and appears first, then the other arm can never be matched (and therefore we have dead code). It would be helpful to clarify this in the comments.
Done
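The distinction raised in this thread (a missing pattern is an error, while an arm shadowed by an earlier one is dead code) can be sketched for the simplest case of flat patterns. This is not the algorithm from the PR, which works on nested value shapes; `Pat`, `Coverage`, and `check_coverage` are hypothetical names, and guards are ignored.

```rust
use std::collections::BTreeSet;

/// Hypothetical flat pattern: `_` or a single variant name.
#[derive(Debug, Clone, PartialEq)]
enum Pat {
    Wildcard,
    Variant(&'static str),
}

/// Result of the check: variants no arm covers, and arms that can never match.
#[derive(Debug, PartialEq)]
struct Coverage {
    missing: Vec<&'static str>,
    unreachable_arms: Vec<usize>,
}

/// Walks the arms in order, mirroring the sequential semantics of matching.
fn check_coverage(variants: &[&'static str], arms: &[Pat]) -> Coverage {
    // Variants not yet covered by any earlier arm.
    let mut remaining: BTreeSet<&'static str> = variants.iter().copied().collect();
    let mut unreachable_arms = Vec::new();
    for (i, arm) in arms.iter().enumerate() {
        let covers_something_new = match arm {
            Pat::Wildcard => {
                let any = !remaining.is_empty();
                remaining.clear();
                any
            }
            Pat::Variant(v) => remaining.remove(v),
        };
        if !covers_something_new {
            // Every value this arm could match is taken by an earlier arm.
            unreachable_arms.push(i);
        }
    }
    Coverage {
        missing: remaining.into_iter().collect(),
        unreachable_arms,
    }
}
```

A non-empty `missing` list corresponds to the incomplete-match error, while `unreachable_arms` corresponds to dead code; the two can occur independently.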
/// by lowest 6 bytes of a sha256 of the string "Move 2 Abort Code",
/// appended with two bytes for the error type.
pub const WELL_KNOWN_ABORT_CODE_BASE: u64 = 0xD8CA26CBD9BE << 16;
pub const INCOMPLETE_MATCH_ABORT_CODE: u64 = WELL_KNOWN_ABORT_CODE_BASE | 0x1;
minor: Do we also allow users to abort directly with this code (using `abort` or `assert`)? Can this cause any issues (by using the abort codes in user space)? As far as I understand, arithmetic errors, for example, are always distinguishable from user aborts as an outcome; should we have the same here?
@well, I would consider the probability of a clash extremely unlikely. In any case, we need either this or an extension of the VM and storage to add more info to the abort code (a storage-breaking change for on-chain state). We can, though, detect at runtime whether a user issues reserved abort codes. I added a TODO.
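For reference, the layout of the reserved codes can be written out as follows. The two constants are copied from the diff above; `is_reserved_abort_code` is a hypothetical helper sketching the runtime detection mentioned in the reply, not code from the PR.

```rust
/// Copied from the diff: a 6-byte prefix shifted left to leave two bytes
/// for the error type.
pub const WELL_KNOWN_ABORT_CODE_BASE: u64 = 0xD8CA26CBD9BE << 16;
pub const INCOMPLETE_MATCH_ABORT_CODE: u64 = WELL_KNOWN_ABORT_CODE_BASE | 0x1;

/// Hypothetical check: a code is treated as reserved when its upper
/// 6 bytes equal the well-known prefix, whatever the low two bytes are.
pub fn is_reserved_abort_code(code: u64) -> bool {
    (code >> 16) == (WELL_KNOWN_ABORT_CODE_BASE >> 16)
}
```

With this layout, the prefix occupies the upper 48 bits, so a user abort code clashes only if it happens to carry exactly that prefix.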
@@ -159,6 +160,12 @@ pub enum Operation {
MoveFrom(ModuleId, StructId, Vec<Type>),
Exists(ModuleId, StructId, Vec<Type>),

// Variants
TestVariant(ModuleId, StructId, Symbol, Vec<Type>),
Would be helpful to have documentation for the fields.
Done
…evel. The PR consists of the following parts:

1. *Extending the Bytecode*

There are a few new bytecode operations:

```
dest := TestVariant(struct_id, variant, src)
dest := PackVariant(struct_id, variant, srcs)
dests := UnpackVariant(struct_id, variant, src)
dest := BorrowFieldVariant(struct_id, variant, field, src)
```

The `TestVariant` and `BorrowFieldVariant` operations expect references to the variant struct. The `UnpackVariant` and `BorrowFieldVariant` operations abort if the `src` value is not a value of the given variant.

Notably, there are no new control flow operations, and the remaining part of the stackless bytecode framework should operate without changes. This may change in the future if we introduce a new branch instruction for switching over variants.

2. *Translating Matches*

Matches are translated into cascades of test & branch instructions. The translation is complicated by the need for 'probing' whether a value can be matched before it is consumed. See the discussion in the code for how this problem is solved.

The current translation is manually reviewed via the baseline output, but it will not be fully tested before we are able to run e2e tests with VM execution. (Future PRs.)

3. *Checking Match Coverage*

Even though we generate robust match code, which is resilient against the addition of new match variants through an explicit `abort` if there is no match, complete match coverage is required at compile time. Checking for this is non-trivial because of the sequential, imperative semantics of matching. See the documentation in the code.

4. *Other*

There are some only indirectly related changes in this PR. To implement non-destructive matches with conditions, it appeared handy to have `*&x` be equivalent to `x`. This seems to be generally useful and, AFAICT, also holds in Rust. Also, the new pattern translation is a bit more efficient in the `let` case as well, leading to some changes in baseline files.
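The "cascade of test & branch" translation described above can be modeled in a few lines of Rust. This is a toy sketch under assumptions, not the generated bytecode: `VariantValue`, the variant names `Square` and `Rect`, and the function `area` are hypothetical, `None` stands in for an abort inside `UnpackVariant`, and the abort code constant is the one defined in this PR.

```rust
/// Abort code used when no arm matches (value taken from the PR's constants).
const INCOMPLETE_MATCH_ABORT_CODE: u64 = (0xD8CA26CBD9BE << 16) | 0x1;

/// Toy runtime value of a variant struct: a variant name plus its fields.
#[derive(Debug, Clone, PartialEq)]
struct VariantValue {
    variant: &'static str,
    fields: Vec<u64>,
}

/// Models `dest := TestVariant(struct_id, variant, src)`; `src` is a reference.
fn test_variant(src: &VariantValue, variant: &str) -> bool {
    src.variant == variant
}

/// Models `dests := UnpackVariant(struct_id, variant, src)`; consumes `src`.
/// `None` stands in for the abort on a variant mismatch.
fn unpack_variant(src: VariantValue, variant: &str) -> Option<Vec<u64>> {
    if src.variant == variant {
        Some(src.fields)
    } else {
        None
    }
}

/// Hand-written cascade for a hypothetical source-level match:
/// `match (v) { Square{side} => side * side, Rect{w, h} => w * h }`.
/// `Err(code)` models an abort with the given code.
fn area(v: VariantValue) -> Result<u64, u64> {
    if test_variant(&v, "Square") {
        let fields = unpack_variant(v, "Square").expect("variant was just tested");
        return Ok(fields[0] * fields[0]);
    }
    if test_variant(&v, "Rect") {
        let fields = unpack_variant(v, "Rect").expect("variant was just tested");
        return Ok(fields[0] * fields[1]);
    }
    // No arm matched, e.g. because a variant was added after compilation.
    Err(INCOMPLETE_MATCH_ABORT_CODE)
}
```

The test happens through a reference and the unpack happens only after the branch is taken, which is why no new control flow operations are needed in this model.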
Thanks for the reviews, @rahxephon89 PTAL
LGTM, thanks!
In #13725 support for enums was added to the stackless bytecode IR. This PR extends the Move VM's representation of bytecode ('file format') by struct variants. It also implements:

- representation in file format
- serialization/deserialization of file format
- bytecode verifier
- code generation in compiler v2

The runtime semantics (interpreter, runtime types, compatibility checking), as well as certain other features (as marked by #13806 in the code), are not yet implemented by this PR.

On Move bytecode level, there are `5 * 2` new instructions (each instruction has a dual generic version):

```
TestVariant(StructVariantHandleIndex) and TestVariantGeneric(StructVariantInstantiationIndex)
PackVariant(StructVariantHandleIndex) and PackVariantGeneric(StructVariantInstantiationIndex)
UnpackVariant(StructVariantHandleIndex) and UnpackVariantGeneric(StructVariantInstantiationIndex)
ImmBorrowVariantField(VariantFieldHandleIndex) and ImmBorrowVariantFieldGeneric(VariantFieldInstantiationIndex)
MutBorrowVariantField(VariantFieldHandleIndex) and MutBorrowVariantFieldGeneric(VariantFieldInstantiationIndex)
```

For the indices used in those instructions, 4 new tables have been added to the file format holding the associated data.

There is a lot of boilerplate code to support the new instructions and tables. Some refactoring of existing code has been done to avoid too much copy and paste, specifically in the serializers and in the bytecode verifier.

Apart from passing existing tests, there is a new test in `move-compiler-v2/tests/file-format-generator/struct_variants.move` which shows the disassembled output of various match expressions.

To add negative tests for the bytecode verifier and serializers, we first need a better way to build code from scratch (like a `CompiledModuleBuilder` or similar), which is planned for a follow-up PR. Unfortunately, `.mvir` files cannot be used for this purpose, as there is no plan to support struct variants in this language right now.
In #13725 support for enums was added to the stackless bytecode IR. This PR extends the Move VM's representation of bytecode ('file format') by struct variants. It also implements:

- representation in file format
- serialization/deserialization of file format
- bytecode verifier
- code generation in compiler v2
- interpreter and paranoid mode

The runtime semantics (interpreter, runtime types, compatibility checking), as well as certain other features (as marked by #13806 in the code), are not yet implemented by this PR.

On Move bytecode level, there are `5 * 2` new instructions (each instruction has a dual generic version):

```
TestVariant(StructVariantHandleIndex) and TestVariantGeneric(StructVariantInstantiationIndex)
PackVariant(StructVariantHandleIndex) and PackVariantGeneric(StructVariantInstantiationIndex)
UnpackVariant(StructVariantHandleIndex) and UnpackVariantGeneric(StructVariantInstantiationIndex)
ImmBorrowVariantField(VariantFieldHandleIndex) and ImmBorrowVariantFieldGeneric(VariantFieldInstantiationIndex)
MutBorrowVariantField(VariantFieldHandleIndex) and MutBorrowVariantFieldGeneric(VariantFieldInstantiationIndex)
```

For the indices used in those instructions, 4 new tables have been added to the file format holding the associated data.

There is a lot of boilerplate code to support the new instructions and tables. Some refactoring of existing code has been done to avoid too much copy and paste, specifically in the serializers and in the bytecode verifier.

Apart from passing existing tests, there is a new test in move-compiler-v2/tests/file-format-generator/struct_variants.move which shows the disassembled output of various match expressions. There are also new e2e transactional tests in move-compiler-v2/transactional-tests/tests/enun.

To add negative tests for the bytecode verifier and serializers, we first need a better way to build code with injected faults. See also #14074 and #13812.
In #13725 support for enums was added to the stackless bytecode IR. This PR extends the Move VM's representation of bytecode ('file format') by struct variants. It also implements - representation in file format - serialization/deserialization of file format - bytecode verifier - code generation in compiler v2 - intepreter and paranoid mode The runtime semantics (intepreter, runtime types, compatibility checking), as well as certain other features (as marked by #13806 in the code), are not yet implemented by this PR. On Move bytecode level, there are `5 * 2` new instructions (each instruction has a dual generic version): ``` TestVariant(StructVariantHandleIndex) and TestVariantGeneric(StructVariantInstantiationIndex) PackVariant(StructVariantHandleIndex) and PackVariantGeneric(StructVariantInstantiationIndex) UnpackVariant(StructVariantHandleIndex) and UnpackVariantGeneric(StructVariantInstantiationIndex) ImmBorrowVariantField(VariantFieldHandleIndex) and ImmBorrowVariantFieldGeneric(VariantFieldInstantiationIndex) MutBorrowVariantField(VariantFieldHandleIndex) and MutBorrowVariantFieldGeneric(VariantFieldInstantiationIndex) ``` For the indices used in those instructions, 4 new tables have been added to the file format holding the associated data. There is a lot of boilerplate code to support the new instructions and tables. Some refactoring of existing code has been done to avoid too much copy and paste, specifically in the serializers and in the bytecode verifier. Apart of passing existing tests, there is a new test in move-compiler-v2/tests/file-format-generator/struct_variants.move which shows the disassembeled output of various match expressions. There are also new e2e transactional tests in move-compiler-v2/transactional-tests/tests/enun. To add negative tests for the bytecode verifier and serializers, we first need a better way to build code with injected faults. See also #14074 and #13812.
In #13725 support for enums was added to the stackless bytecode IR. This PR extends the Move VM's representation of bytecode ('file format') by struct variants. It also implements - representation in file format - serialization/deserialization of file format - bytecode verifier - code generation in compiler v2 - intepreter and paranoid mode The runtime semantics (intepreter, runtime types, compatibility checking), as well as certain other features (as marked by #13806 in the code), are not yet implemented by this PR. On Move bytecode level, there are `5 * 2` new instructions (each instruction has a dual generic version): ``` TestVariant(StructVariantHandleIndex) and TestVariantGeneric(StructVariantInstantiationIndex) PackVariant(StructVariantHandleIndex) and PackVariantGeneric(StructVariantInstantiationIndex) UnpackVariant(StructVariantHandleIndex) and UnpackVariantGeneric(StructVariantInstantiationIndex) ImmBorrowVariantField(VariantFieldHandleIndex) and ImmBorrowVariantFieldGeneric(VariantFieldInstantiationIndex) MutBorrowVariantField(VariantFieldHandleIndex) and MutBorrowVariantFieldGeneric(VariantFieldInstantiationIndex) ``` For the indices used in those instructions, 4 new tables have been added to the file format holding the associated data. There is a lot of boilerplate code to support the new instructions and tables. Some refactoring of existing code has been done to avoid too much copy and paste, specifically in the serializers and in the bytecode verifier. Apart of passing existing tests, there is a new test in move-compiler-v2/tests/file-format-generator/struct_variants.move which shows the disassembeled output of various match expressions. There are also new e2e transactional tests in move-compiler-v2/transactional-tests/tests/enun. To add negative tests for the bytecode verifier and serializers, we first need a better way to build code with injected faults. See also #14074 and #13812.
In #13725 support for enums was added to the stackless bytecode IR. This PR extends the Move VM's representation of bytecode ('file format') by struct variants. It also implements - representation in file format - serialization/deserialization of file format - bytecode verifier - code generation in compiler v2 - intepreter and paranoid mode The runtime semantics (intepreter, runtime types, compatibility checking), as well as certain other features (as marked by #13806 in the code), are not yet implemented by this PR. On Move bytecode level, there are `5 * 2` new instructions (each instruction has a dual generic version): ``` TestVariant(StructVariantHandleIndex) and TestVariantGeneric(StructVariantInstantiationIndex) PackVariant(StructVariantHandleIndex) and PackVariantGeneric(StructVariantInstantiationIndex) UnpackVariant(StructVariantHandleIndex) and UnpackVariantGeneric(StructVariantInstantiationIndex) ImmBorrowVariantField(VariantFieldHandleIndex) and ImmBorrowVariantFieldGeneric(VariantFieldInstantiationIndex) MutBorrowVariantField(VariantFieldHandleIndex) and MutBorrowVariantFieldGeneric(VariantFieldInstantiationIndex) ``` For the indices used in those instructions, 4 new tables have been added to the file format holding the associated data. There is a lot of boilerplate code to support the new instructions and tables. Some refactoring of existing code has been done to avoid too much copy and paste, specifically in the serializers and in the bytecode verifier. Apart of passing existing tests, there is a new test in move-compiler-v2/tests/file-format-generator/struct_variants.move which shows the disassembeled output of various match expressions. There are also new e2e transactional tests in move-compiler-v2/transactional-tests/tests/enun. To add negative tests for the bytecode verifier and serializers, we first need a better way to build code with injected faults. See also #14074 and #13812.
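As a rough illustration of the instruction pairs listed in the commit message above, the following Rust sketch models them as a single enum. The instruction and index names come from the message; the `u16` payloads, the `VariantInstr` enum name, and the `is_generic` helper are assumptions made for illustration and are not taken from the actual file format code.

```rust
// Hypothetical model: the index newtypes are assumed to wrap u16.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct StructVariantHandleIndex(pub u16);
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct StructVariantInstantiationIndex(pub u16);
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct VariantFieldHandleIndex(pub u16);
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct VariantFieldInstantiationIndex(pub u16);

/// The `5 * 2` instructions: each operation plus its generic dual.
/// `VariantInstr` is an illustrative name, not the real bytecode enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VariantInstr {
    TestVariant(StructVariantHandleIndex),
    TestVariantGeneric(StructVariantInstantiationIndex),
    PackVariant(StructVariantHandleIndex),
    PackVariantGeneric(StructVariantInstantiationIndex),
    UnpackVariant(StructVariantHandleIndex),
    UnpackVariantGeneric(StructVariantInstantiationIndex),
    ImmBorrowVariantField(VariantFieldHandleIndex),
    ImmBorrowVariantFieldGeneric(VariantFieldInstantiationIndex),
    MutBorrowVariantField(VariantFieldHandleIndex),
    MutBorrowVariantFieldGeneric(VariantFieldInstantiationIndex),
}

impl VariantInstr {
    /// Hypothetical helper: true for the generic dual of an instruction,
    /// i.e. one that refers to an instantiation index.
    pub fn is_generic(&self) -> bool {
        use VariantInstr::*;
        matches!(
            self,
            TestVariantGeneric(_)
                | PackVariantGeneric(_)
                | UnpackVariantGeneric(_)
                | ImmBorrowVariantFieldGeneric(_)
                | MutBorrowVariantFieldGeneric(_)
        )
    }
}
```

Keeping the handle and instantiation indices as distinct types means a generic instruction cannot be built from a non-generic index by accident, which is one plausible reason for having four separate tables.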
Description
This implements the Move 2 `match` expression on stackless bytecode level. The PR consists of the following parts:

There are a few new bytecode operations. The `TestVariant` and `BorrowFieldVariant` operations expect references to the variant struct. The `UnpackVariant` and `BorrowFieldVariant` operations abort if the `src` value is not a value of the given variant.

Notably, there are no new control flow operations, and the remaining part of the stackless bytecode framework should operate without changes. This may change in the future if we introduce a new branch instruction for switching over variants.
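The behavior of these operations can be sketched with a small Rust model. This is not the compiler's code: `VariantValue`, `Abort`, and the function signatures are hypothetical, fields are simplified to `u64`, and the sketch assumes that `UnpackVariant` consumes its operand while the other two only borrow it.

```rust
/// Toy runtime value of a variant struct: the variant name plus its fields.
/// (Hypothetical representation, for illustration only.)
#[derive(Debug, Clone, PartialEq)]
pub struct VariantValue {
    pub variant: &'static str,
    pub fields: Vec<u64>,
}

/// Stand-in for an abort raised by an operation.
#[derive(Debug, PartialEq)]
pub struct Abort;

/// TestVariant: takes a reference and reports whether `src` is of the
/// given variant. It never aborts and does not consume `src`.
pub fn test_variant(src: &VariantValue, variant: &str) -> bool {
    src.variant == variant
}

/// UnpackVariant: consumes `src` (assumption) and yields its fields.
/// Aborts if `src` is not a value of the given variant.
pub fn unpack_variant(src: VariantValue, variant: &str) -> Result<Vec<u64>, Abort> {
    if src.variant == variant {
        Ok(src.fields)
    } else {
        Err(Abort)
    }
}

/// BorrowFieldVariant: takes a reference and borrows one field.
/// Aborts if `src` is not a value of the given variant. An out-of-range
/// field index is also modeled as an abort, which is a simplification.
pub fn borrow_field_variant<'a>(
    src: &'a VariantValue,
    variant: &str,
    field: usize,
) -> Result<&'a u64, Abort> {
    if src.variant != variant {
        return Err(Abort);
    }
    src.fields.get(field).ok_or(Abort)
}
```

In this model, a caller that wants to avoid the abort checks `test_variant` first and only then borrows or unpacks.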
Matches are translated into cascades of test & branch instructions. The translation is complicated by the need for 'probing' whether a value can be matched before it is consumed. See the discussion in the code for how this problem is solved.
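To make the probing problem concrete, here is a hand-written Rust sketch of what such a cascade could look like for a hypothetical Move match `match (s) { Circle{r} if r > 10 => r, Circle{r} => 0, Rect{w, h} => w * h }`. The enum, the `VariantValue` and `Abort` types, and the function are invented for illustration; the compiler's actual output is stackless bytecode and will differ.

```rust
/// Toy runtime value of a variant struct (hypothetical representation).
/// The sketch assumes field lists are well formed for each variant.
#[derive(Debug, Clone, PartialEq)]
pub struct VariantValue {
    pub variant: &'static str,
    pub fields: Vec<u64>,
}

/// Stand-in for an abort.
#[derive(Debug, PartialEq)]
pub struct Abort;

/// Hand-lowered cascade of test & branch steps for the match above.
pub fn lowered_match(s: VariantValue) -> Result<u64, Abort> {
    // Arm 1: `Circle{r} if r > 10`. Probe through references first, so
    // that `s` is still intact for the next arm if the guard fails.
    if s.variant == "Circle" {
        let r_ref = &s.fields[0];
        if *r_ref > 10 {
            // The probe succeeded, so the value can now be consumed.
            let fields = s.fields;
            return Ok(fields[0]);
        }
    }
    // Arm 2: `Circle{r}` without a guard; consume directly.
    if s.variant == "Circle" {
        let _fields = s.fields;
        return Ok(0);
    }
    // Arm 3: `Rect{w, h}`.
    if s.variant == "Rect" {
        let fields = s.fields;
        return Ok(fields[0] * fields[1]);
    }
    // No arm matched, e.g. a variant added after this code was compiled.
    Err(Abort)
}
```

If the first arm unpacked `s` before evaluating the guard, a failing guard would leave nothing for the second arm to match, which is why the test and the guard run against references.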
The current translation is manually reviewed via the baseline output, but it will not be fully tested until we are able to run e2e tests with VM execution (future PRs).
Even though we generate robust match code, which is resilient against the addition of new match variants through an explicit `abort` if there is no match, complete match coverage is required at compile time. Checking for this is non-trivial because of the sequential, imperative semantics of matching. See the documentation in the code.

There are some only indirectly related changes in this PR. To implement non-destructive matches with conditions, it proved handy to have `*&x` be equivalent to `x`. This seems to be generally useful and is AFAICT also the case in Rust.

Also, the new pattern translation is a bit more efficient in the `let` case as well, leading to some changes in baseline files.

Type of Change
Which Components or Systems Does This Change Impact?