-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[compiler-v2] Enum types in the VM #13812
Conversation
Bytecode::PackVariant(a) => write!(f, "PackVariant({})", a), | ||
Bytecode::TestVariant(a) => write!(f, "TestVariant({})", a), | ||
Bytecode::PackVariantGeneric(a) => write!(f, "PackVariantGeneric({})", a), | ||
Bytecode::TestVariantGeneric(a) => write!(f, "TestVariantGeneric({})", a), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I highly doubt if it is still a good idea to have generic and non-generic versions of the same instructions.
Back in the old days, we split some instructions into generic & non-generic versions, hoping that this would make the bytecode more optimized for the VM to execute. However now it's clear this point is totally moot.
- There is a strong need for the bytecode format to be a stable, backward compatible storage format, which conflicts with being optimized for execution
- A more suitable solution to this problem is to introduce some VM-internal microcodes that are designed to be unstable, adapting to VM changes to the VM's execution engine. Notice that this is exactly how modern CPUs architectures work.
- Because of the two versions, we ended up having to duplicate a ton of code. This has been a maintenance nightmare and greatly increased the chance we screw something up (e.g. making changes in only one version)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is no redesigning the bytecode architecture, but following along what we have. One principle of the current design is that almost every instruction has exactly one argument, therefore fits into three bytes in most cases (one for the bytecode, one for the argument, one terminator). This is not for efficiency of execution but for the size of the bytecode. (You are right that for performance of execution it makes little sense.) The entire way how the file format is designed is for this.
If we want to change this, we can't do it just for two out of dozens instructions. The resulting design would look very, very odd. So we have BorrowField and BorrowFieldGeneric, and symmetrically, we have BorrowVariantField and BorrowVariantFieldGeneric. I don't think that any other approach leads to a satisfactory design.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your point. Agreed that we could try to come up with a more appropriate solution to this problem later.
179576d
to
da7e31d
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #13812 +/- ##
===========================================
- Coverage 70.8% 59.1% -11.8%
===========================================
Files 2332 824 -1508
Lines 462140 200490 -261650
===========================================
- Hits 327540 118635 -208905
+ Misses 134600 81855 -52745 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few real problems: Redundant rep in file_format could expand space as well as being a potential attack vector. Somewhere I found an inconsistency in one of your boilerplate patterns. Others are minor.
@@ -449,6 +594,13 @@ impl<'a> BoundsChecker<'a> { | |||
)?; | |||
} | |||
}, | |||
PackVariantGeneric(idx) | UnpackVariantGeneric(idx) | TestVariantGeneric(idx) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous clause (PackGeneric(idx)|...
) does a check of type parameters. Don't you need something similar here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, done.
pub const STRUCT_DEF_INST_INDEX_MAX: u64 = TABLE_INDEX_MAX; | ||
pub const CONSTANT_INDEX_MAX: u64 = TABLE_INDEX_MAX; | ||
pub const STRUCT_VARIANT_HANDLE_INDEX_MAX: u64 = TABLE_INDEX_MAX; | ||
pub const STRUCT_VARIANT_INST_INDEX_MAX: u64 = TABLE_INDEX_MAX; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice if we had a canonical ordering for these declarations, so it would be easier to see if one was missing. But I guess that's not easy, unless we just go alphabetical.
@@ -305,6 +305,11 @@ impl Struct { | |||
StructFieldInformation::Declared(fields) => { | |||
fields.iter().map(|f| Field::new(m, f)).collect() | |||
}, | |||
StructFieldInformation::DeclaredVariants(..) => { | |||
// TODO(#13806): consider implementing (probably not as long we do not |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the editorial comment ("probably not as long as we do not know whether normalized.rs will be retired") should go in the Issue for more discussion, not the comment. This comment may live forever. And maybe you want a more specific issue here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is a cluster tracking bug, the idea is we search the code for TODO(#nnn)
. This method is common at Google and Meta which have good web code search, works also well in my IDE.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've also seen panic!("TODO(#13806): blah blah")
which i did like :)
no action needed here - just a comment
@@ -201,6 +201,11 @@ fn write_struct_def(ctx: &mut Context, sdef: &StructDefinition) -> String { | |||
return out; | |||
}, | |||
StructFieldInformation::Declared(fields) => fields, | |||
StructFieldInformation::DeclaredVariants(..) => { | |||
// TODO(#13806): consider implement if interface generator will be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we need an interface generator to build against on-chain dependences which are only available as bytecode? Better if these issues can be written somewhere (like dedicated issues) for more discussion. But I guess they'll all be removed by the time you're done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As said the tracking bug is a better way to deal with this at this stage, IMO. The code is always better than a secondary artifact (learned this at Google).
At this point we do not even know whether this code is used anywhere in a serious manner. As discussed in the design review, API folks will look at how they adapt once the feature is ready here. I think their API generator directly works on file format representation, and not compiler v1 stuff.
@@ -101,6 +101,10 @@ fn instantiation_err() { | |||
], | |||
}), | |||
}], | |||
struct_variant_handles: vec![], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what the deal is with move-vm/integration-tests/, but it seems worthwhile to have some demo variant here eventually, so add a comment to add one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -215,6 +215,10 @@ fn make_module_with_function( | |||
code: vec![Bytecode::LdU64(0), Bytecode::Abort], | |||
}), | |||
}], | |||
struct_variant_handles: vec![], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what the deal is with move-vm/integration-tests/, but it seems worthwhile to have some demo variant here eventually, so add a comment to add one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
variant: Option<Symbol>, | ||
offset: usize, | ||
) -> FieldEnv<'env> { | ||
// TODO: speed this up via a cache RefCell<BTreeMap<(variant, offset), FieldId>> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Related issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not worth an issue, so I removed the TODO. and just left as a comment that this can be sped up. It's a linear search over less then a dozen elements in average, so its OK for the model.
@@ -485,6 +485,10 @@ pub fn compile_module<'a>( | |||
metadata: vec![], | |||
struct_defs, | |||
function_defs, | |||
struct_variant_handles: vec![], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a TODO and point to an issue "If we are keeping the move-ir-compiler we need to do something here".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I see you have a lot of changes to move-bytecode-source-map. Why there and not here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The source map is used all over our products (its also stored on chain), but Move IR is nowhere used anymore besides some legay tests. I'm not sure whether we want to keep it. Added a TODO linking to the tracking bug.
@@ -402,6 +406,19 @@ impl AbstractState { | |||
Ok(AbstractValue::NonReference) | |||
} | |||
|
|||
pub fn test_variant( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this function testing? Add a comment line, please.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, this function is not needed. It behaves like read_ref (function before, also not documented), but we better call just directly read_ref, because that is the abstract semantics of testing a variant (the reference is used to read access the tag in the variant data)
pub owner: StructDefinitionIndex, | ||
/// The sequence of variants which share the field at the given | ||
/// field offset. | ||
pub variants: Vec<VariantCount>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This variants
field is redundant with the VariantDefinition
field fields
, as it should be the inverse. If you keep both, you need a check for that on all input paths. (Maybe you do have, this PR is too big to tell). OTOH, if you get rid of this one, you can reconstruct it at deserialization and reduce the file size a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, the operation is more targeted. It can express selecting a field from just one given variant or from a set of variants. For example, if used in a match expression context, the list would contain only the variant we have tested for.
It is possible we relax this and just allow it where ever possible based on the StructDef metadata. Lets consider this an option for a later refinement when we understand all the relevant use cases on source level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wrwg So this means that we can have multiple VariantFieldHandle with the same owner? Just confirming
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we can if the variant count/index sets are different. I'm open to change this, as said, after we land the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review @brmataptos . Please check my answer in the comments about the redundancy problem. I agree this can be a problem.
@@ -449,6 +594,13 @@ impl<'a> BoundsChecker<'a> { | |||
)?; | |||
} | |||
}, | |||
PackVariantGeneric(idx) | UnpackVariantGeneric(idx) | TestVariantGeneric(idx) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, done.
pub owner: StructDefinitionIndex, | ||
/// The sequence of variants which share the field at the given | ||
/// field offset. | ||
pub variants: Vec<VariantCount>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, the operation is more targeted. It can express selecting a field from just one given variant or from a set of variants. For example, if used in a match expression context, the list would contain only the variant we have tested for.
It is possible we relax this and just allow it where ever possible based on the StructDef metadata. Lets consider this an option for a later refinement when we understand all the relevant use cases on source level.
@@ -305,6 +305,11 @@ impl Struct { | |||
StructFieldInformation::Declared(fields) => { | |||
fields.iter().map(|f| Field::new(m, f)).collect() | |||
}, | |||
StructFieldInformation::DeclaredVariants(..) => { | |||
// TODO(#13806): consider implementing (probably not as long we do not |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is a cluster tracking bug, the idea is we search the code for TODO(#nnn)
. This method is common at Google and Meta which have good web code search, works also well in my IDE.
@@ -402,6 +406,19 @@ impl AbstractState { | |||
Ok(AbstractValue::NonReference) | |||
} | |||
|
|||
pub fn test_variant( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, this function is not needed. It behaves like read_ref (function before, also not documented), but we better call just directly read_ref, because that is the abstract semantics of testing a variant (the reference is used to read access the tag in the variant data)
@@ -201,6 +201,11 @@ fn write_struct_def(ctx: &mut Context, sdef: &StructDefinition) -> String { | |||
return out; | |||
}, | |||
StructFieldInformation::Declared(fields) => fields, | |||
StructFieldInformation::DeclaredVariants(..) => { | |||
// TODO(#13806): consider implement if interface generator will be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As said the tracking bug is a better way to deal with this at this stage, IMO. The code is always better than a secondary artifact (learned this at Google).
At this point we do not even know whether this code is used anywhere in a serious manner. As discussed in the design review, API folks will look at how they adapt once the feature is ready here. I think their API generator directly works on file format representation, and not compiler v1 stuff.
@@ -485,6 +485,10 @@ pub fn compile_module<'a>( | |||
metadata: vec![], | |||
struct_defs, | |||
function_defs, | |||
struct_variant_handles: vec![], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The source map is used all over our products (its also stored on chain), but Move IR is nowhere used anymore besides some legay tests. I'm not sure whether we want to keep it. Added a TODO linking to the tracking bug.
variant: Option<Symbol>, | ||
offset: usize, | ||
) -> FieldEnv<'env> { | ||
// TODO: speed this up via a cache RefCell<BTreeMap<(variant, offset), FieldId>> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not worth an issue, so I removed the TODO. and just left as a comment that this can be sped up. It's a linear search over less then a dozen elements in average, so its OK for the model.
@@ -101,6 +101,10 @@ fn instantiation_err() { | |||
], | |||
}), | |||
}], | |||
struct_variant_handles: vec![], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -215,6 +215,10 @@ fn make_module_with_function( | |||
code: vec![Bytecode::LdU64(0), Bytecode::Abort], | |||
}), | |||
}], | |||
struct_variant_handles: vec![], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only commented on files I have viewed.
Ok(()) | ||
} | ||
|
||
fn check_table<T>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for breaking the convention of repeating code.
Would be even better if rust supported generic closures so that we can have this as a lambda inside the function above.
@@ -104,6 +112,10 @@ pub enum TableType { | |||
FIELD_INST = 0xE, | |||
FRIEND_DECLS = 0xF, | |||
METADATA = 0x10, | |||
VARIANT_FIELD_HANDLE = 0x11, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most tables are named XYZ_HANDLES (only one FIELD_HANDLE - signular)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I renamed them all (both mine and the old ones)
@@ -305,6 +305,11 @@ impl Struct { | |||
StructFieldInformation::Declared(fields) => { | |||
fields.iter().map(|f| Field::new(m, f)).collect() | |||
}, | |||
StructFieldInformation::DeclaredVariants(..) => { | |||
// TODO(#13806): consider implementing (probably not as long we do not |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've also seen panic!("TODO(#13806): blah blah")
which i did like :)
no action needed here - just a comment
/// - `field_count` as a ULEB128 (number of fields defined in the type) | ||
/// - `fields` as a ULEB128 (index into the `FieldDefinition` table) | ||
/// - if DeclaredVariants (since v7): | ||
/// - `field_count` as a ULEB128 (number of fields defined in the type) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something wrong with this comment :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not only my one, but also the existing one... There is no FieldDefinitionTable,, rather field definitions are embeed in the StructDef.
Removed the entire comment block. If someone wants to know the serialization format, they have to look at the code anyway. Other functions do not have this comment either.
}); | ||
} | ||
Ok(()) | ||
fn load_variant(cursor: &mut VersionedCursor) -> BinaryLoaderResult<VariantDefinition> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I'd have this inline in load_variants.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd agree, just following convention in this module, where every type has its own loader. But I added #[inline(always)]
;-)
pub owner: StructDefinitionIndex, | ||
/// The sequence of variants which share the field at the given | ||
/// field offset. | ||
pub variants: Vec<VariantCount>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wrwg So this means that we can have multiple VariantFieldHandle with the same owner? Just confirming
pub owner: StructDefinitionIndex, | ||
/// The sequence of variants which share the field at the given | ||
/// field offset. | ||
pub variants: Vec<VariantCount>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Maybe define VariantIndex,Vec<VariantCount>
reads weird (misleading)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. The existing code uses FieldCount
for the index, not every bad thing need to be adapted. Renamed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How are we testing the ser/de logic? IIRC we had a proptest logic that randomly generates bytecode and we test ser/de round trips with those generated bytecode. Should we update the generation logic? ( I didn't see the changes in the PR)
#[cfg_attr(any(test, feature = "fuzzing"), proptest(no_params))] | ||
#[cfg_attr(feature = "fuzzing", derive(arbitrary::Arbitrary))] | ||
pub struct StructVariantHandle { | ||
pub struct_index: StructDefinitionIndex, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm I'm not sure if I get this. The match instruction takes a StructVariantHandleIndex
, which refers to StructDefinitionIndex
. Does that mean that the match instruction could only be used at the defining module? As the StructDefinition
should only be available for the defining module.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, right now, match can only happen in the defining module.
We should make public/public structs together in one strike. In that case, StructDefinition would be imported from another module and embedded in the current module. Then everything we do today locally for structs (and enums) should become available publicly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. So public struct/enum will NOT be part of this PR and do you suggest we should bundle this feature in the same release or in a future release?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. They are not part of this PR.
assert ty == &mut struct_ty | ||
ty_stack << &mut field_ty | ||
"#] | ||
ImmBorrowVariantFieldGeneric(VariantFieldInstantiationIndex), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we are using test instruction to pair with destruct so that we don't need to implement a jump table instead right? I think victor suggested that this could have some gas implication as later variant might cost more than the earlier ones as you have more instructions to test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually have Jump tables in my original design doc, and they could be added at a later stage. However, it is not that obvious how they would help, and we need the basic instructions anyway. Consider
match (e) {
C1(C2(_)) => e1,
C1(C3(x)) if p(x) => e2,
C2(C3(_)) => e3.
etc
}
Not that obvious how to create a jump table for stuff like this.
The basic underlying implementation right now are decision trees based on predicates. We can add jump tables on top of that for the cases we can identify an advantage, but the basic operations are needed.
api/types/src/bytecode.rs
Outdated
@@ -109,6 +109,10 @@ pub trait Bytecode { | |||
.map(|f| self.new_move_struct_field(f)) | |||
.collect(), | |||
), | |||
StructFieldInformation::DeclaredVariants(..) => { | |||
// TODO(#13806): implement for struct variants | |||
panic!("enums/struct variants not yet supported by the API") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will panic crash the node?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I added a default behavior instead.
@@ -45,6 +45,15 @@ crate::gas_schedule::macros::define_gas_parameters!( | |||
[mut_borrow_field: InternalGas, "mut_borrow_field", 735], | |||
[imm_borrow_field_generic: InternalGas, "imm_borrow_field_generic", 735], | |||
[mut_borrow_field_generic: InternalGas, "mut_borrow_field_generic", 735], | |||
[imm_borrow_variant_field: InternalGas, "imm_borrow_variant_field", 835], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we ok with those constants or we need to run another round of gas calibration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably do need to calibrate this, though I derived them from the existing field operations, adding some 12% extra cost for the need of variant tag check.
@@ -308,6 +308,31 @@ fn native_format_impl( | |||
)?; | |||
out.push('}'); | |||
}, | |||
MoveTypeLayout::Struct(MoveStructLayout::RuntimeVariants(variants)) => { | |||
let strct = val.value_as::<Struct>()?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to feature gate this change? I think we don't because the variant is only going to be relavent once bytecode v7 is enabled. By the time v7 flag is enabled we should expect all nodes to have this change already. Just wanting to make sure my reasoning is correct here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume right now we do not, since as you said, this is guarded by the v7 flag, which behaves like a feature gate here.
@@ -729,3 +735,10 @@ impl_view_internals!(FieldDefinitionView, FieldDefinition, field_def); | |||
impl_view_internals!(TypeSignatureView, TypeSignature, type_signature); | |||
impl_view_internals!(SignatureView, Signature, signature); | |||
impl_view_internals!(SignatureTokenView, SignatureToken, token); | |||
|
|||
/// A type represent eoither a FieldHandleIndex or a VariantFieldHandleIndex. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made almost full pass, mainly looked at VM-related code
@@ -729,3 +735,10 @@ impl_view_internals!(FieldDefinitionView, FieldDefinition, field_def); | |||
impl_view_internals!(TypeSignatureView, TypeSignature, type_signature); | |||
impl_view_internals!(SignatureView, Signature, signature); | |||
impl_view_internals!(SignatureTokenView, SignatureToken, token); | |||
|
|||
/// A type represent eoither a FieldHandleIndex or a VariantFieldHandleIndex. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tbh, I am not a fan of such comments because the name is sufficient, and if variants are changed/renamed, it is no longer correct. Maybe remove?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a fan of it, and I do it in my own code with every public type/function even if redundant.
My experience is if you don't, you establish ambiguous code style. The next guy come and change the code will not add any comments even though they should have. A simple rule 'add /// comment to every public thing' is easy to understand, a code style 'document when it is not obvious' is open for all kinds of interpretation.
@@ -300,6 +336,11 @@ struct ModuleSerializer { | |||
field_handles: (u32, u32), | |||
field_instantiations: (u32, u32), | |||
friend_decls: (u32, u32), | |||
// Since v7 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add more info, like "bytecode version 7". New person looking at this has no context what v7 is. I think applies in a few other places as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Did a global replace and now its // Since bytecode version 7
everywhere
@@ -606,7 +669,7 @@ fn serialize_struct_def_instantiation( | |||
Ok(()) | |||
} | |||
|
|||
/// Serializes `FieldDefinition` within a struct. | |||
/// Serializes `FieldDefinition` list within. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment seems off, why remove "a struct"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -616,11 +679,6 @@ fn serialize_field_definitions(binary: &mut BinaryData, fields: &[FieldDefinitio | |||
} | |||
|
|||
/// Serializes a `FieldDefinition`. | |||
/// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: remove this the first line as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -84,6 +90,8 @@ pub const SIGNATURE_TOKEN_DEPTH_MAX: usize = 256; | |||
/// | |||
/// The binary contains a subset of those tables. A table specification is a tuple (table type, | |||
/// start offset, byte count) for a given table. | |||
/// | |||
/// Notice there must not be more than `max(u8) - 1` tables. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a test which asserts this? if not, we should probably have one so that such an invariant is always enforced
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No not such tests. At least I added this comment which wasn't there before ;-) (While the issue was there before as well).
BTW, its not really an invariant more like a restricting of the current design
}, | ||
BinaryType::Script(_) => unreachable!("Scripts cannot have field instructions"), | ||
} | ||
} | ||
|
||
pub(crate) fn create_struct_type(&self, struct_ty: &Arc<StructType>) -> Type { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use ty
for types? So create_struct_ty
instead, I think in a couple of other places as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
.create_struct_ty(struct_ty.idx, AbilityInfo::struct_(struct_ty.abilities)) | ||
} | ||
|
||
pub(crate) fn create_struct_type_instantiation( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
let formulas = match &struct_type.layout { | ||
StructLayout::Single(fields) => fields | ||
.iter() | ||
.map(|(_, field_type)| self.calculate_depth_of_type(field_type, module_store)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: field --> field_ty?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
layout | ||
}, | ||
StructLayout::Variants(variants) => { | ||
// We do not support variants to have direct identifier mappings, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this enforced? Can I have
enum Foo {
Bar(Aggregator),
}
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well you tell me, this code is not well documented, and the aggregator integration is a mystery (mostly).
For structs someone left this comment:
// For aggregators / snapshots, the first field should be lifted.
if let Some(kind) = &maybe_mapping {
I'm lost on whether this logic applies to variants as well. And what is the logic? Every first field of a struct which is of type Aggregator?
This non-standard semantics should have been documented here.
pub(crate) definition_struct_type: Arc<StructType>, | ||
} | ||
|
||
// A field instantiation. The offset is the only used information when operating on a field | ||
#[derive(Clone, Debug)] | ||
pub(crate) struct FieldInstantiation { | ||
pub(crate) offset: usize, | ||
pub(crate) uninstantiated_ty: Type, | ||
pub(crate) definition_struct_type: Arc<StructType>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice to have consistency between type
and ty
. The latter seems better to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the reviews! Addressed most comments. Open question for @georgemitenkov is how to correctly deal with aggregators and struct variants and this type lifting (see code).
api/types/src/bytecode.rs
Outdated
@@ -109,6 +109,10 @@ pub trait Bytecode { | |||
.map(|f| self.new_move_struct_field(f)) | |||
.collect(), | |||
), | |||
StructFieldInformation::DeclaredVariants(..) => { | |||
// TODO(#13806): implement for struct variants | |||
panic!("enums/struct variants not yet supported by the API") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I added a default behavior instead.
@@ -45,6 +45,15 @@ crate::gas_schedule::macros::define_gas_parameters!( | |||
[mut_borrow_field: InternalGas, "mut_borrow_field", 735], | |||
[imm_borrow_field_generic: InternalGas, "imm_borrow_field_generic", 735], | |||
[mut_borrow_field_generic: InternalGas, "mut_borrow_field_generic", 735], | |||
[imm_borrow_variant_field: InternalGas, "imm_borrow_variant_field", 835], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably do need to calibrate this, though I derived them from the existing field operations, adding some 12% extra cost for the need of variant tag check.
@@ -308,6 +308,31 @@ fn native_format_impl( | |||
)?; | |||
out.push('}'); | |||
}, | |||
MoveTypeLayout::Struct(MoveStructLayout::RuntimeVariants(variants)) => { | |||
let strct = val.value_as::<Struct>()?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume right now we do not, since as you said, this is guarded by the v7 flag, which behaves like a feature gate here.
}); | ||
} | ||
Ok(()) | ||
fn load_variant(cursor: &mut VersionedCursor) -> BinaryLoaderResult<VariantDefinition> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd agree, just following convention in this module, where every type has its own loader. But I added #[inline(always)]
;-)
pub owner: StructDefinitionIndex, | ||
/// The sequence of variants which share the field at the given | ||
/// field offset. | ||
pub variants: Vec<VariantCount>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we can if the variant count/index sets are different. I'm open to change this, as said, after we land the PR.
pub(crate) struct StructVariantInfo { | ||
// struct field count | ||
pub(crate) field_count: u16, | ||
pub(crate) variant: VariantCount, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I already merged instantiation and non-instantiation for variant fields, integration now the existing field stuff into this is too heavy weight and not the purpose of this PR.
The problem is that a lot of this stuff here is index based, and the indices are fully typed, so you cannot easily share APIs. This should be refactored heavily but in a different PR.
// A field handle. The offset is the only used information when operating on a field | ||
#[derive(Clone, Debug)] | ||
pub(crate) struct FieldHandle { | ||
pub(crate) offset: usize, | ||
pub(crate) ty: Type, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pub(crate) definition_struct_type: Arc<StructType>, | ||
} | ||
|
||
// A field instantiation. The offset is the only used information when operating on a field | ||
#[derive(Clone, Debug)] | ||
pub(crate) struct FieldInstantiation { | ||
pub(crate) offset: usize, | ||
pub(crate) uninstantiated_ty: Type, | ||
pub(crate) definition_struct_type: Arc<StructType>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
impl StructType { | ||
/// Get the fields from this struct type. If this is a proper struct, the `variant` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its a good question what is better. If you continue to allow struct_def.fields()
(or field_count) you can easily make an error as the struct can be a variant. With this additional parameter, you have to think about whether you are dealing with a struct or with variants. The convention worked well in the compiler, and is continued here.
{ | ||
variants[variant as usize].1.as_slice() | ||
}, | ||
_ => &[], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't return an error, and we should not crash. I think this is OK.
I actually added things to the proptest_types modules, but I'm seeing now that might have not complete. Let me followup on this. Regardless, the problem remains to test bytecode verifier. Proptest does not provide a test oracle, so that is not the helpful (only in cases liike deserialize(serialize(x)) ~~ x it works well) |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
In #13725 support for enums was added to the stackless bytecode IR. This PR extends the Move VM's representation of bytecode ('file format') by struct variants. It also implements - representation in file format - serialization/deserialization of file format - bytecode verifier - code generation in compiler v2 - intepreter and paranoid mode The runtime semantics (intepreter, runtime types, compatibility checking), as well as certain other features (as marked by #13806 in the code), are not yet implemented by this PR. On Move bytecode level, there are `5 * 2` new instructions (each instruction has a dual generic version): ``` TestVariant(StructVariantHandleIndex) and TestVariantGeneric(StructVariantInstantiationIndex) PackVariant(StructVariantHandleIndex) and PackVariantGeneric(StructVariantInstantiationIndex) UnpackVariant(StructVariantHandleIndex) and UnpackVariantGeneric(StructVariantInstantiationIndex) ImmBorrowVariantField(VariantFieldHandleIndex) and ImmBorrowVariantFieldGeneric(VariantFieldInstantiationIndex) MutBorrowVariantField(VariantFieldHandleIndex) and MutBorrowVariantFieldGeneric(VariantFieldInstantiationIndex) ``` For the indices used in those instructions, 4 new tables have been added to the file format holding the associated data. There is a lot of boilerplate code to support the new instructions and tables. Some refactoring of existing code has been done to avoid too much copy and paste, specifically in the serializers and in the bytecode verifier. Apart of passing existing tests, there is a new test in move-compiler-v2/tests/file-format-generator/struct_variants.move which shows the disassembeled output of various match expressions. There are also new e2e transactional tests in move-compiler-v2/transactional-tests/tests/enun. To add negative tests for the bytecode verifier and serializers, we first need a better way to build code with injected faults. See also #14074 and #13812.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
In #13725 support for enums was added to the stackless bytecode IR. This PR extends the Move VM's representation of bytecode ('file format') by struct variants. It also implements - representation in file format - serialization/deserialization of file format - bytecode verifier - code generation in compiler v2 - intepreter and paranoid mode The runtime semantics (intepreter, runtime types, compatibility checking), as well as certain other features (as marked by #13806 in the code), are not yet implemented by this PR. On Move bytecode level, there are `5 * 2` new instructions (each instruction has a dual generic version): ``` TestVariant(StructVariantHandleIndex) and TestVariantGeneric(StructVariantInstantiationIndex) PackVariant(StructVariantHandleIndex) and PackVariantGeneric(StructVariantInstantiationIndex) UnpackVariant(StructVariantHandleIndex) and UnpackVariantGeneric(StructVariantInstantiationIndex) ImmBorrowVariantField(VariantFieldHandleIndex) and ImmBorrowVariantFieldGeneric(VariantFieldInstantiationIndex) MutBorrowVariantField(VariantFieldHandleIndex) and MutBorrowVariantFieldGeneric(VariantFieldInstantiationIndex) ``` For the indices used in those instructions, 4 new tables have been added to the file format holding the associated data. There is a lot of boilerplate code to support the new instructions and tables. Some refactoring of existing code has been done to avoid too much copy and paste, specifically in the serializers and in the bytecode verifier. Apart of passing existing tests, there is a new test in move-compiler-v2/tests/file-format-generator/struct_variants.move which shows the disassembeled output of various match expressions. There are also new e2e transactional tests in move-compiler-v2/transactional-tests/tests/enun. To add negative tests for the bytecode verifier and serializers, we first need a better way to build code with injected faults. See also #14074 and #13812.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
In #13725 support for enums was added to the stackless bytecode IR. This PR extends the Move VM's representation of bytecode ('file format') by struct variants. It also implements - representation in file format - serialization/deserialization of file format - bytecode verifier - code generation in compiler v2 - intepreter and paranoid mode The runtime semantics (intepreter, runtime types, compatibility checking), as well as certain other features (as marked by #13806 in the code), are not yet implemented by this PR. On Move bytecode level, there are `5 * 2` new instructions (each instruction has a dual generic version): ``` TestVariant(StructVariantHandleIndex) and TestVariantGeneric(StructVariantInstantiationIndex) PackVariant(StructVariantHandleIndex) and PackVariantGeneric(StructVariantInstantiationIndex) UnpackVariant(StructVariantHandleIndex) and UnpackVariantGeneric(StructVariantInstantiationIndex) ImmBorrowVariantField(VariantFieldHandleIndex) and ImmBorrowVariantFieldGeneric(VariantFieldInstantiationIndex) MutBorrowVariantField(VariantFieldHandleIndex) and MutBorrowVariantFieldGeneric(VariantFieldInstantiationIndex) ``` For the indices used in those instructions, 4 new tables have been added to the file format holding the associated data. There is a lot of boilerplate code to support the new instructions and tables. Some refactoring of existing code has been done to avoid too much copy and paste, specifically in the serializers and in the bytecode verifier. Apart of passing existing tests, there is a new test in move-compiler-v2/tests/file-format-generator/struct_variants.move which shows the disassembeled output of various match expressions. There are also new e2e transactional tests in move-compiler-v2/transactional-tests/tests/enun. To add negative tests for the bytecode verifier and serializers, we first need a better way to build code with injected faults. See also #14074 and #13812.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
✅ Forge suite
|
✅ Forge suite
|
✅ Forge suite
|
Description
In #13725 support for enums was added to the stackless bytecode IR. This PR extends the Move VM's representation of bytecode ('file format') by struct variants. It implements
Compatibility checking, as well as certain other minor features (as marked by #13806 in the code), are not yet implemented by this PR.
On Move bytecode level, there are
5 * 2
new instructions (each instruction has a dual generic version):For the indices used in those instructions, 4 new tables have been added to the file format holding the associated data.
There is a lot of boilerplate code to support the new instructions and tables. Some refactoring of existing code has been done to avoid too much copy and paste, specifically in the serializers and in the bytecode verifier.
Type of Change
Which Components or Systems Does This Change Impact?
How Has This Been Tested?
Apart of passing existing tests, there is a new test in
move-compiler-v2/tests/file-format-generator/struct_variants.move
which shows the disassembeled output of various match expressions.There are also new e2e transactional tests in
move-compiler-v2/transactional-tests/tests/enun
. To add negative tests for the bytecode verifier and serializers, we first need a better way to build code with injected tests, see also #13871