-
Notifications
You must be signed in to change notification settings - Fork 30.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] reduce circular dependencies in ESM loader and move more essential modules into the snapshot #45828
base: main
Are you sure you want to change the base?
Conversation
Review requested:
|
9646e62
to
7b72820
Compare
This comment was marked as outdated.
This comment was marked as outdated.
7b72820
to
4959b49
Compare
One of them has been stalled for 3 months and another for 5 months. I am not entirely sure how close they are to landing, but I don't see why this PR has to wait for others? We can just rebase on top of each other, it's mostly moving code around, so some copy-pasting and fixups would be enough. |
If anything I think #44710 should be rebased on top this and be modified to stop introducing more circular dependencies, adding top-level access to |
It’s a draft and will be cleaned up before it’s marked as ready for review. There’s a fair bit of debugging code in there that isn’t intended to land. It’s also undergoing very active development in the last few weeks; I wouldn’t describe it as stalled. Separately, I personally find it much easier to work with many smaller files than a few big ones; is there some way to preserve the isolation of small files without harming the ability to snapshot? |
If the small files don't do any circular dependency, then maybe, but that's difficult for |
Perhaps we have different definitions of what "stalled" means, I learned about the inspector bug two months ago, and it seems the PR is still blocked by that bug (albeit in a different form), so my understanding is that it has been stalled by that bug. This refactoring is just reduces the technical debt (circular dependencies, TDZs, synchronous side-effect/runtime state queries at module loading time etc.) in the code case. The sooner it lands, the less likely that future code have to pile more technical debt onto the codebase (unless they repeat the refactoring themselves). #43772 is fine in that regard, as it is not adding any more of those, so it would not be difficult for that PR to rebase onto this or the other way around. But the current code in #44710 is still doing those things, I'd argue that code that does those stuff really should not land in the code base in the first place. We should not count on future refactoring to pay off the technical debt, or we still end up with the minefield we have now in Anyway if you insist that this has to land after other ESM PRs, I can just open a separate PR that only speeds up the rest of the bootstrap process and let the ESM loader remain unsnapshotable, keeping its web of circular dependencies and TDZs. Someone might need to do the refactoring again if they want to make the ESM loader snapshotable, or just do some general quality control there, but I am not sure if I have the energy to change the wheels of a flying plane that refuses to stop, so to speak. |
We’ve resolved that bug. We’re currently trying to figure out how to get data from the worker to the main thread, and the error involves
On our to-do list is to provide more of the pieces of ESM loading as utility functions, like getting the package metadata and determining what to choose from a package As for #44710, my goal is just to avoid slowing down @JakobJingleheimer as that PR has been a huge effort for a long time and I don’t want him impeded. If @JakobJingleheimer isn’t bothered by the merge conflicts that would follow from merging this PR first, that’s fine with me. |
So that means there's not a fix ready for review yet, right?
It could be, because fs also depends on vm (Windows-only, yes it's that bad), so you need to be careful about vm too.
I think the same thing can always said about some part of the code base. We can't just let the code continue to be a mess simply because there is a PR that has been worked for a long time. I had many PRs that I worked on for almost a year until I got the issues solved and got them landed. I wouldn't stop other people from refactoring the code base just because there would be a merge conflict in my PR that was still running into issues that I was trying to solve. If those refactoring improves the determinism in the code, then rebasing onto that refactoring would also make my PR more deterministic, which is a good thing. Blocking such refactoring because I wanted to land more non-deterministic code first would not really make sense to me. |
You wrote in the top post that you intend to start over and redo this work as a series of smaller commits. All I meant was that you might want to coordinate with @JakobJingleheimer to try to minimize effort on both of your parts. If you need to start over anyway, you might as well wait until his work lands first, if it can land in a reasonable amount of time. Maybe set a deadline when you plan to start over, and if he can land his PR before then great, otherwise you’ll do yours and he’ll have to resolve conflicts. I’m not trying to debate stalled versus in progress/order of landing in general, just that in this particular case he’s been working on that other branch most days for the last few weeks and you should try to coordinate. |
Sorry just getting caught up here. TLDR: Go ahead with this PR. I reviewed it previously, and I think it won't impact 44710 much. If you can land this PR very soon, that would be ideal (see below). In order to avoid the serialisation issue in 44710, I need to make a somewhat significant pivot in approach (to avoid transferring One thing though: I needed to access |
@joyeecheung any chance at least the part of this PR that touches ESM could be merged by tomorrow evening CET? I'll be starting a fresh branch then, so if this is already in, then no merge conflicts. |
No, because this repository does not land any non-trivial PR in less than 48 hours. But anyway, I will just open another pull request that only includes minimal changes to the ESM part and makes the rest of the bootstrap snapshotable. That means the most part of the ESM loader would remain unsnapshotable, but that is unlikely to interfere with the feature work. I'll let this one stall until I find the energy to make the ESM loader snapshotable again. |
I'd be happy to help with the ESM part after 44710 (the caveat being i'll need some direction as I'm not familiar with actually facilitating snapshots). |
I’m interested in getting this landed too. It looks like the current PR can’t land as is because of test failures? Any ideas regarding those? Is there a guide to “how to make code snapshottable” anywhere? Your detailed list in the top post seems pretty exhaustive, if I were to attempt to redo what you did here, but if there were other resources elsewhere those would be useful too. For example I’m wondering what the issue is with circular references, for example. |
This is not yet ready for review, unless you enjoy reading a huge diff that's mostly just moving code around. I am opening this to get some numbers from the CI and see if it needs fixups in niche build configs. My plan is to try to split it into more logical commits to make it easier to review/backport - this is meant to be a refactoring that does not change any behavior, if for some reason something subtle gets broken (e.g. the race caused by a V8 bug that needs to be worked around in test/sequential/test-inspector-break-when-eval.js, which probably can't be reproduced in real-world debug sessions). At least it's easier to trace that down with smaller commits. But if for some reason the patch just can't be broken into smaller commits, I'd send this one for review instead.
Locally the test passes and I get a pretty good boost in the startup benchmarks:
A brief summary of what's been done here:
is_main_thread.js
andbrowser.js
(worker thread currently only has a very basic built-iin snapshot without modules, we can merge the list when the worker thread also has modules in its snapshot)lib/internal/modules/esm/
as much as possible by merging most ESM-only utilities (esm/assert.js
,esm/create_dynamic_module.js
,esm/formats.js
,esm/get_format.js
,esm/handle_process_exit.js
,esm/initialize_import_meta.js
,esm/module_job.js
,esm/module_map.js
,esm/package_config.js
) into onelib/internal/modules/esm/utils.js
and make mostlib/internal/modules/esm/
inter-dependencies lazy. I don't think splitting the ESM loader into small modules really yields much benefit when that results in landmines of circular dependencies and TDZs that we need to watch out for. Fewer modules also means lower startup costs.esm/translators.js
is merged intoesm/loader.js
since the translators aren't used anywhere elsemodules/cjs/helpers
is nowmodules/helpers
because it's shared by two loaders, and some utils used by the both loaders are moved there as well.ESMLoader
constructor to construct itself (duh), while the default implementation ofESMLoader
depends on the cascaded ESM loader too (why it doesn't usethis
, I don't know, but it's not the time to break that). Meanwhile almost all module parts depend on the cascaded ESM loader (actually in a pretty dangerous way, because many of them have been getting the loader synchronously even though the loader hooks are loaded asynchronously so it's non-deterministic whether they will get a loader with hooks or not, but that's a pretty old problem that this refactoring PR does not plan to solve), so the cascaded loader still has it's own file that lazily loads the constructor in order to reduce the severity of the circular dependency.getOptionValue
invocations and top-level access toprocess.env
inlib/internal/modules/esm/
by making those accesses lazy. This also makeslib/internal/modules/esm/utils.js
snapshottable.After applying this patch, loading a user-land CJS module alone does not incur compilation of any additional modules. If the builtins loaded by that module is already in the snapshot (e.g.
fs
,url
,timers
,util
,path
..modules that are intest-bootstrap-modules.js
), there's no additional cost of compilation either. That explains the benchmark numbers because compilation is still the bottleneck of startup. Some ESM loader parts still need to be compiled for a user-land ESM module, depending on what that module actually does, though a big chunk of ESM loader is now in theutils.js
which is also snapshotted (before this patch none of the ESM loader is in the snapshot because without the refactoring it was not snapshottable).