Assembler: enable vendoring of compiled libraries (fixes #1435) #1643

Merged: bobbinth merged 6 commits into next from marco-vendoring on Feb 10, 2025

Conversation

Contributor

@paracetamolo paracetamolo commented Jan 27, 2025

Before this MR, there were only two ways to link a library during assembly:

  • Add the module sources. The library is effectively recompiled as part of the new library/program.
  • Add a compiled library. In this case, all references to this library are compiled as external nodes.

This MR adds the possibility to link a compiled library during assembly and have its MAST forest merged into the resulting library/program. This has two desirable properties:

  • No external nodes are added for procedures that are present in vendored libraries, which improves performance.
  • No unused procedures are present in the resulting program/library. If the vendored library contains extra unused procedures they are removed during assembly.
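
The second property can be illustrated with a toy model. This is a minimal sketch, not the miden-core implementation: the `Node` enum, `reachable`, and `sample_library` are hypothetical names made up for this example. It assumes that a vendored library is a flat list of nodes and that only nodes reachable from the procedures actually used need to be kept.

```rust
use std::collections::BTreeSet;

/// Toy stand-in for a MAST node: a leaf block or a join of two children.
/// (Hypothetical type for illustration; not the miden-core definition.)
#[derive(Debug, Clone, PartialEq)]
enum Node {
    Block(&'static str),
    Join(usize, usize),
}

/// Returns the ids of all nodes reachable from `roots` in `forest`.
/// In this toy model, anything outside the returned set is never copied.
fn reachable(forest: &[Node], roots: &[usize]) -> BTreeSet<usize> {
    let mut seen = BTreeSet::new();
    let mut stack: Vec<usize> = roots.to_vec();
    while let Some(id) = stack.pop() {
        if !seen.insert(id) {
            continue; // already visited
        }
        if let Node::Join(left, right) = &forest[id] {
            stack.push(*left);
            stack.push(*right);
        }
    }
    seen
}

/// A small "library": procedure `used` is rooted at node 2,
/// procedure `unused` is rooted at node 4.
fn sample_library() -> Vec<Node> {
    vec![
        Node::Block("add"),  // 0
        Node::Block("mul"),  // 1
        Node::Join(0, 1),    // 2: root of `used`
        Node::Block("drop"), // 3
        Node::Join(3, 3),    // 4: root of `unused`
    ]
}
```

Calling `reachable(&sample_library(), &[2])` visits only nodes 0, 1 and 2, so the nodes belonging to `unused` would never be copied into the output.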

@paracetamolo paracetamolo self-assigned this Jan 27, 2025
@paracetamolo paracetamolo changed the title Marco vendoring Assembler: enable vendoring of compiled libraries (fixes #1435) Jan 27, 2025
@paracetamolo paracetamolo added the assembly Related to Miden assembly label Jan 27, 2025
@bobbinth bobbinth requested review from plafer and bitwalker January 28, 2025 06:13
Contributor

@bobbinth bobbinth left a comment

Looks good! Thank you. Not a full review yet, but I left a few comments inline - one of them describing a potential alternative approach.

@paracetamolo paracetamolo force-pushed the marco-vendoring branch 3 times, most recently from f73998a to 806a91e Compare January 30, 2025 14:03
@paracetamolo
Contributor Author

I pushed a new version that uses the approach mentioned by @bobbinth in the comment above. At assembly time, we merge all the collected vendored libraries into a single MAST forest that is passed to the builder. On a call to ensure_external, the builder first checks whether the procedure is present in the vendored MAST forest, and in that case it copies its subtree.
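
The lookup described above can be sketched as follows. This is only an illustration under stated assumptions: `Digest`, `Resolution`, `Builder`, and `resolve` are hypothetical names, and the real builder works on MAST node ids and procedure digests from miden-core rather than on these toy types.

```rust
use std::collections::BTreeMap;

/// Hypothetical 32-byte digest identifying a procedure root.
type Digest = [u8; 32];

/// How a reference to a procedure ends up in the output forest.
#[derive(Debug, PartialEq)]
enum Resolution {
    /// Found in a vendored library: its subtree will be copied in.
    Vendored { vendored_root: usize },
    /// Not vendored: only the digest is recorded as an external node.
    External { digest: Digest },
}

/// Toy builder holding the roots of the merged vendored forest,
/// indexed by procedure digest.
struct Builder {
    vendored_roots: BTreeMap<Digest, usize>,
}

impl Builder {
    /// Checks the vendored forest first and falls back to an external node.
    fn resolve(&self, digest: Digest) -> Resolution {
        match self.vendored_roots.get(&digest) {
            Some(&vendored_root) => Resolution::Vendored { vendored_root },
            None => Resolution::External { digest },
        }
    }
}
```

The point of the sketch is the order of the checks: the vendored forest is consulted before an external node is created.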

Contributor

@bobbinth bobbinth left a comment

Looks good! Thank you! Not a full review, but I left some comments inline.

Overall, it feels like this approach should work better than the previous one.

@paracetamolo
Contributor Author

Are there more interesting tests that we could run? Right now there is a single test checking that a used procedure is inlined and an unused one is deleted.

Contributor

@bobbinth bobbinth left a comment

Looks good! Thank you. I reviewed pretty much all non-test and non-CLI code and left some comments inline.

Comment on lines 480 to 494
for old_id in RootIterator::new(&root_id, &self.vendored_mast.clone()) {
    let mut node = self.vendored_mast[old_id].clone();
    node.remap(&self.vendored_remapping);
    let new_id = self.ensure_node(node)?;
    self.vendored_remapping.insert(old_id, new_id);
}
Contributor

I think this will copy the node + its decorators. But where do we copy the advice map data from the vendored libraries?

Contributor Author

I added a merge_advice_map step that is triggered on every call to vendor_or_ensure_external.

Contributor

Would it not be pretty expensive to do this on every call to vendor_or_ensure_external()? Is there a reason not to do this at the very end (e.g., in MastForestBuilder::build())?

Contributor Author

@paracetamolo paracetamolo Feb 5, 2025

Yes, the other option is to do it in the constructor new or at the end in build. In that case we risk copying the advice map even if we didn't vendor anything. I think both approaches are ok. If you used new with vendored libraries you probably are going to vendor something, so we could copy the advice map in there.

Contributor

I'm fine with the current approach, but let's create an issue to improve on it. A simple short-term improvement could be to keep track of whether at least one procedure was vendored. A more long-term approach would be to somehow keep track of which key-value pairs from the advice map are required by the set of vendored procedures.
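
The short-term improvement suggested here might look roughly like the sketch below. Everything in it is an assumption made for illustration: `AdviceMap` is modelled as a plain `BTreeMap<u64, Vec<u64>>`, and `Builder`, `mark_vendored`, and the `vendored_any` flag are hypothetical names, not the actual MastForestBuilder API.

```rust
use std::collections::BTreeMap;

/// Stand-in for the advice map (the real type lives in miden-core).
type AdviceMap = BTreeMap<u64, Vec<u64>>;

/// Toy builder that remembers whether anything was vendored.
struct Builder {
    vendored_advice: AdviceMap,
    output_advice: AdviceMap,
    vendored_any: bool,
}

impl Builder {
    fn new(vendored_advice: AdviceMap) -> Self {
        Builder {
            vendored_advice,
            output_advice: AdviceMap::new(),
            vendored_any: false,
        }
    }

    /// Called whenever a procedure is copied from the vendored forest.
    fn mark_vendored(&mut self) {
        self.vendored_any = true;
    }

    /// Merges the vendored advice map once, at the end, and only if
    /// at least one procedure was actually vendored.
    fn build(self) -> AdviceMap {
        let Builder { vendored_advice, mut output_advice, vendored_any } = self;
        if vendored_any {
            for (key, values) in vendored_advice {
                // Keep an existing entry if the key is already present.
                output_advice.entry(key).or_insert(values);
            }
        }
        output_advice
    }
}
```

With this shape the merge cost is paid once in `build` instead of on every lookup, and it is skipped entirely when no procedure was vendored.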

@bobbinth
Contributor

bobbinth commented Feb 1, 2025

@plafer, @bitwalker - could you also take a look at this PR?

Comment on lines 481 to 493
let mut node = self.vendored_mast[old_id].clone();
node.remap(&self.vendored_remapping);
Contributor

Mutably remapping a MastNode sounds dangerous to me. Instead of cloning + mutating the copy, would it not be better to make remap methods be fn remap(&self, &Remapping) -> MastNode?

Contributor Author

I'm not sure it's dangerous, Rust tracks mutability very explicitly. With a mutable remapping you have the choice to use additional memory by cloning+mutating like above or you can directly mutate if you are not using the old node. If the function implicitly returns a new Node then you are always forced to use additional memory.

Contributor

I guess one question I had here is whether we are doing a lot of extra work with the current approach. Specifically, I think remap() recursively updates all the children, but then we iterate through all the children here as well. Or am I misunderstanding how it works?

Contributor Author

No, it's not recursive, remap will simply change the ids of the children of the current node.
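
To make the division of labour concrete, here is a toy version of the loop under discussion. It is a sketch under stated assumptions: `Node`, `post_order`, `ensure_node`, and `vendor` are hypothetical stand-ins, deduplication is done by plain equality rather than by digest, and the traversal is assumed to yield children before their parents so that every child id is already in the remapping when its parent is processed.

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Debug, Clone, PartialEq)]
enum Node {
    Block(&'static str),
    Join(usize, usize),
}

impl Node {
    /// Rewrites only this node's direct child ids; it does not recurse.
    /// Panics if a child id is missing from `remapping`.
    fn remap(&mut self, remapping: &BTreeMap<usize, usize>) {
        if let Node::Join(left, right) = self {
            let new_left = remapping[&*left];
            let new_right = remapping[&*right];
            *left = new_left;
            *right = new_right;
        }
    }
}

/// Returns the ids under `root`, children before parents.
fn post_order(forest: &[Node], root: usize) -> Vec<usize> {
    fn visit(forest: &[Node], id: usize, seen: &mut BTreeSet<usize>, out: &mut Vec<usize>) {
        if !seen.insert(id) {
            return; // shared subtree, already emitted
        }
        if let Node::Join(left, right) = &forest[id] {
            visit(forest, *left, seen, out);
            visit(forest, *right, seen, out);
        }
        out.push(id);
    }
    let mut seen = BTreeSet::new();
    let mut out = Vec::new();
    visit(forest, root, &mut seen, &mut out);
    out
}

/// Adds `node` to `target` unless an equal node exists; returns its id.
fn ensure_node(target: &mut Vec<Node>, node: Node) -> usize {
    if let Some(pos) = target.iter().position(|n| *n == node) {
        return pos;
    }
    target.push(node);
    target.len() - 1
}

/// Copies the subtree rooted at `root` from `vendored` into `target`.
fn vendor(
    vendored: &[Node],
    root: usize,
    target: &mut Vec<Node>,
    remapping: &mut BTreeMap<usize, usize>,
) -> usize {
    for old_id in post_order(vendored, root) {
        if remapping.contains_key(&old_id) {
            continue; // copied by an earlier call
        }
        let mut node = vendored[old_id].clone();
        node.remap(remapping);
        let new_id = ensure_node(target, node);
        remapping.insert(old_id, new_id);
    }
    remapping[&root]
}
```

In this model the iteration does the walking and `remap` only patches one node's child ids, so each node is touched once.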

Contributor

> I'm not sure it's dangerous, Rust tracks mutability very explicitly.

I meant "dangerous" more in the sense of "bug-prone" - e.g. if you remove the clone() on line 481, everything still compiles, but you're mutating the wrong MastNode (and it's relatively easy to miss).

> If the function implicitly returns a new Node then you are always forced to use additional memory.

Not necessarily, the value returned could be used to mutate a node in an existing MastForest, e.g.

let node = self.vendored_mast[old_id].remap(...);
self.vendored_mast[old_id] = node;

and I would expect compiler optimizations to compile this down to an in-place mutation of self.vendored_mast[old_id].

So then it comes down to a question of which version is less bug-prone, which I believe my suggestion is? Plus, I don't expect we'll ever want to mutate a node in-place, since this implies we're modifying a MastForest in-place (by e.g. deleting a node and needing to remap virtually all nodes), which is very bug-prone and we'll likely never want to do that.
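
The two styles being compared can be put side by side in a toy model. This is a sketch only: `Node`, `Remapping`, `remap_in_place`, and `remapped` are hypothetical names for illustration, not the signatures in miden-core.

```rust
use std::collections::BTreeMap;

type Remapping = BTreeMap<usize, usize>;

#[derive(Debug, Clone, PartialEq)]
enum Node {
    Block(&'static str),
    Join(usize, usize),
}

impl Node {
    /// Mutable style: the caller decides whether to clone first.
    /// Panics if a child id is missing from `remapping`.
    fn remap_in_place(&mut self, remapping: &Remapping) {
        if let Node::Join(left, right) = self {
            let new_left = remapping[&*left];
            let new_right = remapping[&*right];
            *left = new_left;
            *right = new_right;
        }
    }

    /// Functional style: the original node is never modified.
    /// Panics if a child id is missing from `remapping`.
    fn remapped(&self, remapping: &Remapping) -> Node {
        match self {
            Node::Block(name) => Node::Block(*name),
            Node::Join(left, right) => Node::Join(remapping[left], remapping[right]),
        }
    }
}
```

With `remapped`, forgetting a `clone()` cannot silently change the source node, which is the bug-proneness argument made above. The cost is that a new node is always constructed.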

Contributor Author

> e.g. if you remove the clone() on line 481, everything still compiles, but you're mutating the wrong MastNode (and it's relatively easy to miss).

I agree that's very easy to miss, but you can't remove the clone w/o having a compiler error. Unless I'm missing something.

> I don't expect we'll ever want to mutate a node in-place,

I agree.

I don't mind changing it to return a new node, I'm still trying to understand where to use different styles (functional vs mutable).

Contributor

> you can't remove the clone w/o having a compiler error. Unless I'm missing something.

Right, I was being imprecise. I meant that writing self.vendored_mast[old_id].remap(...) is basically allowed, but in this specific case the next call to self.ensure_node() then fails to compile because the MastForest is behind an Arc.
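
As general background on why an Arc gets in the way of in-place mutation, the following sketch uses only the standard library. It assumes the comment refers to the fact that an `Arc<T>` hands out shared references only. The function name `try_mutate_shared` and the use of `Vec<u32>` as a stand-in for a forest are made up for this example.

```rust
use std::sync::Arc;

/// Tries to bump the first element of a forest-like vector behind an Arc.
/// Returns false when the Arc is shared and mutation is therefore refused.
fn try_mutate_shared(forest: &mut Arc<Vec<u32>>) -> bool {
    match Arc::get_mut(forest) {
        Some(nodes) => {
            nodes[0] += 1; // we hold the only reference, so this is allowed
            true
        }
        None => false, // another Arc points at the same data
    }
}
```

`Arc::get_mut` succeeds only while there is exactly one reference. Writing `forest[0] += 1` directly on an `Arc<Vec<u32>>` does not compile at all, because `Arc` does not implement `DerefMut`.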

Ultimately this was more of a minor hunch than anything else, and I'm okay with leaving it as is.

@paracetamolo
Contributor Author

I'd like to improve the tests before merging. Working on it now.

@paracetamolo paracetamolo force-pushed the marco-vendoring branch 2 times, most recently from 63358ce to 52ea6a6 Compare February 6, 2025 10:31
@paracetamolo paracetamolo marked this pull request as ready for review February 6, 2025 10:31
Contributor

@plafer plafer left a comment

LGTM!

Also, we could create an issue for adding a test that checks that the advice provider is correctly vendored as well.

Comment on lines 87 to 88
let advice_map = mast_forest.advice_map_mut();
*advice_map = vendored_mast.advice_map().clone();
Contributor

nit: I would write this on one line for better readability

*mast_forest.advice_map_mut() = vendored_mast.advice_map().clone();

Contributor Author

done

@paracetamolo
Contributor Author

Issue for the advice map test: #1655

Contributor

@bobbinth bobbinth left a comment

Looks good! Thank you! I left some comments inline (mostly minor nits). Once these are addressed, we can merge.

Contributor

@bobbinth bobbinth left a comment

All looks good! Thank you!

@bobbinth bobbinth merged commit 4ebf8b5 into next Feb 10, 2025
9 checks passed
@bobbinth bobbinth deleted the marco-vendoring branch February 10, 2025 23:58