Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GlobalTypeOptimization: Reorder fields in order to remove them #6820

Merged
merged 29 commits into from
Aug 12, 2024

Conversation

kripken
Copy link
Member

@kripken kripken commented Aug 7, 2024

Before, we only removed fields from the end of a struct. If we had, say

struct Foo {
  int x;
  int y;
  int z;
};

// Add no fields but inherit the parent's.
struct Bar : Foo {};

If y is only used in Bar, but never Foo, then we still kept it around, because
if we removed it from Foo we'd end up with Foo = {x, z}, Bar = {x, y, z} which
is invalid - Bar no longer extends Foo. But we can do this if we first reorder
the two:

struct Foo {
  int x;
  int z;
  int y; // now y is at the end
};

struct Bar : Foo {};

And the optimized form is

struct Foo {
  int x;
  int z;
};

struct Bar : Foo {
  int y; // now y is added in Bar
};

This lets us remove all fields possible in all cases AFAIK.

This situation is not super-common, as most fields are actually used both
up and down the hierarchy (if they are used at all), but testing on some
large real-world codebases, I see 10 fields removed in Java, 45 in Kotlin,
and 31 in Dart. This might lead to speedups if any of those objects are
constructed many times, but I'm not sure of that.

The NFC change to src/wasm-type-ordering.h was needed for this to
compile, and I can't understand why it was needed, but maybe I'm missing
something there?

@kripken kripken requested a review from tlively August 7, 2024 18:54
@kripken
Copy link
Member Author

kripken commented Aug 7, 2024

@jakobkummerow Is there any possible downside to Binaryen reordering WasmGC struct fields like this? In C/C++ that could be bad because alignment/padding issues in the C ABI could force larger struct sizes after reordering, but I hope wasm VMs don't have such issues?

@gkdn
Copy link
Contributor

gkdn commented Aug 7, 2024

@jakobkummerow Is there any possible downside to Binaryen reordering WasmGC struct fields like this? In C/C++ that could be bad because alignment/padding issues in the C ABI could force larger struct sizes after reordering, but I hope wasm VMs don't have such issues?

Jakob implemented some heuristic for packing fields in V8 earlier for optimizing memory (users of high level languages doesn't think about this when they declare members, it is handled at VM level).

@kripken
Copy link
Member Author

kripken commented Aug 7, 2024

@gkdn Ah, good to know. So J2CL doesn't optimize the order itself, that is, this PR would not be undoing some optimal order that you computed?

// Check for reads or writes in ourselves and our supers. If there are
// none, then operations only happen in our strict subtypes, and those
// subtypes can define the field there, and we don't need it here.
auto hasNoReadsOrWritesInSupers = !sub[i].hasRead && !sub[i].hasWrite;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little confusing that to look at what the supertypes do, we look at sub. Are there better names we could give things here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, maybe that isn't clearest. The idea is that sub has data propagated to subtypes, so if there is nothing there, nothing from our supertypes was propagated to us. I renamed to dataFromSupers.

for (Index i = 0; i < superIndexes.size(); ++i) {
auto superIndex = superIndexes[i];
if (superIndex == RemovedField) {
if (!removableIndexes.count(i)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be slightly clearer to invert the if-else to avoid the negation in the condition.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, done.

Comment on lines +297 to +299
// Update |next| to refer to the next available index. Due to
// possible reordering in the parent, we may not see indexes in
// order here, so just take the max at each point in time.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this isn't just a matter of looking at the size of the supertype's field vector because we've left tombstone values in it? I wonder if the code would get any simpler if we recorded the new fields as a list of indices into the old fields rather than as a list of indices into the new fields that has to include tombstones.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall the data structure we need is "given this old index, find me the new index". That guides all the work we do in the updating stage for StructNew/StructGet/StructSet operations, all of which have old indexes and need new ones. So I think it would get more complex to flip this data structure.

@gkdn
Copy link
Contributor

gkdn commented Aug 7, 2024

@kripken Nope J2CL doesn't optimize it as it would be moot after optimizations.

Comment on lines +180 to +181
auto& dataFromSubsAndSupers = dataFromSubsAndSupersMap[type];
auto& dataFromSupers = dataFromSupersMap[type];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably don't need the "dataFrom" prefix if you want to tighten this up. No strong feeling either way, though.

@jakobkummerow
Copy link
Contributor

Just to confirm: as @gkdn said, V8 chooses the physical offsets of struct fields to optimize packing density, so I have no size concerns here.

(If you're curious about the details, see here. Since objects on our heap are only 4-byte aligned, remembering one gap is enough.)

@kripken
Copy link
Member Author

kripken commented Aug 12, 2024

Thanks for the info @jakobkummerow , great, I'll land this.

@kripken kripken merged commit a4f9128 into WebAssembly:main Aug 12, 2024
13 checks passed
@kripken kripken deleted the gto.reorder branch August 12, 2024 19:30
Comment on lines +1036 to +1042
(type $A (sub (struct (field i32) (field i64) (field f32) (field f64) (field anyref) (field v128) (field nullref))))

;; CHECK: (type $C (sub $A (struct (field i64) (field v128) (field nullref) (field f64) (field anyref))))
(type $C (sub $A (struct (field i32) (field i64) (field f32) (field f64) (field anyref) (field v128) (field nullref))))

;; CHECK: (type $B (sub $A (struct (field i64) (field v128) (field nullref) (field f32) (field anyref))))
(type $B (sub $A (struct (field i32) (field i64) (field f32) (field f64) (field anyref) (field v128) (field nullref))))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This new test seems to have made the fuzzer uncover an existing bug: we don't validate the features used in heap types during validation. When the fuzzer uses this test as initial contents with SIMD disabled, it thinks the module validates, then it ends up generating SIMD instructions to access the v128 fields and failing validation later.

I can fix this as a follow-on to my planned updates to the type gather API, but in the short term, if the fuzzer failures are causing issues, a quick fix would be to change these fields from v128 to something else.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just noticed such an error as well. I believe the issue is that we don't validate the unused parts of rec groups, like here:

(module
 (rec
  (type $func (func))
  (type $unused (sub (struct (field v128))))
 )
 (func $func (type $func))
)

That v128 is not used in any used type, but the rec group pulls in that type, and then the fuzzer ends up doing something with it. Does that sound right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't seem to validate the types at all in any central location, used or unused, although I supposed the used types should be validated at the location of their use. That kind of decentralized type validation seems error-prone, though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, agreed, we just validate in specific places. Another option could be to run collectHeapTypes and validate them all, but that wouldn't handle v128. Not sure I have a great idea here, but maybe just adding rec type traversal by itself, as I was suggesting, is not the best improvement. I'll open a PR to modify this test for now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running collectHeapTypes is what I had in mind. Why wouldn't that handle v128?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean because v128 isn't a heap type. But I guess you meant that it would be found by traversing heap types + their rec groups? Yes, that can work.

I am hoping we can avoid another pass on the module in validation, though. That would make everything slower, unfortunately.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, I was thinking we would call getFeatures on each heap type (including unused members of used rec groups) and compare the results to the enabled features. This would catch the v128 field requiring SIMD.

kripken added a commit that referenced this pull request Aug 15, 2024
We don't properly validate that yet. E.g.:

(module
 (rec
  (type $func (func))
  (type $unused (sub (struct (field v128))))
 )
 (func $func (type $func))
)

That v128 is not used, but it ends up in the output because it is in a rec group that is used.
Atm we do not require that SIMD be enabled in such a case, which can trip up the fuzzer.

Context: #6820. For now, modify the test that uncovered this.
@gkdn gkdn mentioned this pull request Aug 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants