Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New ts_project rule should have a worker and honor ts --incremental #1726

Closed
alexeagle opened this issue Mar 24, 2020 · 7 comments
Closed

New ts_project rule should have a worker and honor ts --incremental #1726

alexeagle opened this issue Mar 24, 2020 · 7 comments

Comments

@alexeagle
Copy link
Collaborator

TypeScript has a mechanism to avoid unneeded work in re-compiles, the --incremental flag and .tsbuildinfo file.
We should figure out a way to use these under Bazel to speed up compiles of ts_project rules.

@pauldraper
Copy link

pauldraper commented May 19, 2020

Stateful compilers have been discussed with Bazel many times: https://groups.google.com/forum/#!topic/bazel-discuss/3iUy5jxS3S0

As with workers, they run contrary to the reproducibility safeguards of Bazel, but are necessary for high performance for some compilers.

For higherkindness/rules_scala, we implemented stateful compilation by having the worker write the state to disk, cached by the target name. (Though fearing non-reproducibility, we made it be opt-in: --worker_extra_flag=ScalaCompile=--persistence_dir=.bazel-zinc).

AFAIK Swift also has incremental compilation.

@alexeagle
Copy link
Collaborator Author

I did some research on this yesterday after chatting with @mrmeku about some options.

  1. Linker vs. worker
    We had some previous discussion about a fundamental problem while adding a worker mode to rollup_bundle: the linker runs once per node process, but each WorkRequest might depend on a new library that needs to be linked. We could try to add logic inside the node program somewhere to detect this situation and repair it by linking new modules on the fly. However it has lots of bad design edges, you easily end up with a worker that sees more dependencies than it should (eg. you delete a first_party module yet type-checking of usages of it still works). A different approach for this is more conservative, so right now I prefer it: each worker is tied to a certain set of deps, and if you need more deps to be linked, then you need a new worker pool.

In practice that seems easy enough to achieve, and it closely models what you would do with plain tsc outside of Bazel. (It's also similar to the Asana https://github.com/Asana/bazeltsc approach as I understand it)

  • We assume you have one or few tsconfig.json files, maybe using TS composite project references
  • Each tsconfig.json has a separate tsc --watch process

Our basic model would be that a ts_project macro generates a nodejs_binary specifically for this project, something like

nodejs_binary(
    name = "tsc_watch",
    data = [
        "//packages/worker",
        "@npm//typescript",
        "@npm//protobufjs",
        "worker_adapter.js",
        "tsconfig.json",
    ],
    entry_point = "@npm//:node_modules/typescript/bin/tsc",
    templated_args = [
        "--node_options=--require=$$(rlocation $(rootpath worker_adapter.js))",
        "--nobazel_patch_module_resolver",
        "--project",
        "$(execpath tsconfig.json)",
         "--outDir",
        "bazel-out/darwin-fastbuild/bin/%s" % package_name(),
        "--watch",
    ],
)

and it then calls ts_project(tsc = "tsc_watch", supports_workers = True)

  1. how does the tsc program know what to do?
    There are two options here. In the snippet above I assume there's a --require script which runs before the first line of tsc executes, and it can patch up some stuff. For example it would need to fix up the argv with sth like
const workerArg = process.argv.indexOf('--persistent_worker')
if (workerArg > 0) {
    process.argv.splice(workerArg, 1)
}

it also needs to prevent writes to stdout, and read stdin to find protos and respond to them in some way, probably using the @bazel/worker helper. But we could let tsc --watch do its usual filesystem watching, so it would do compiles whenever it sees an input file change, regardless of WorkRequest protos coming in. This is nice because we can just run tsc the same way you do outside of Bazel, and can generalize to other programs so we don't revisit this topic for every tool that has watch mode. But it needs more research to see if it's really feasible.

The second option is like tsc_wrapped or https://github.com/Asana/bazeltsc/blob/master/src/bazeltsc.ts - we can write a custom TS compiler for use in watch mode. We have lots of experience doing this and it's not very hard, but it does mean we have a constrained peerDep on typescript, and will expose us to some extra surface of bugs that you might see under Bazel and can't reproduce with vanilla tsc - and that goes against a design principle of the ts_project rule.

  1. configuring Bazel

I think we'd need users to be aware that every ts_project rule comes with its own persistent process and resource (RAM, file descriptors) consumption. You probably don't want hundreds of these rules in a build. We need to make sure each ts_project really does have its own worker - IIRC this means we need its mnemonic to include the ts_project target somehow encoded into it. Also users need to configure the worker pool to be size one - we never want more than one worker for a given ts_project.(With concurrent worker support coming, this is probably the right setting for most workers anyway). However if the worker pool size is configured per-mnemonic, then our use of per-project mnemonics means this is really hard to configure and keep up-to-date.

  1. Making tsc happy

Right now my understanding of .tsbuildinfo isn't sufficient to feel really confident, but my sense is like @pauldraper says above - we let tsc write the .tsbuildinfo somewhere that it can find on the next execution of that same project, so that the built-in incrementality model still works. I don't remember if tsc also wants to read the .tsbuildinfo from dependent targets, but that's easy enough to pass along on a Provider.

I have to set this down again, don't have time to dedicate to it this week. Here are some bits I was playing with to understand it better: master...alexeagle:ts_project_worker

@alexeagle
Copy link
Collaborator Author

Some more research and experimentation tonight.

My original hope of just reusing tsc --watch seems like it can't work, because Bazel needs us to report back on stdout every time we do a compile, and if we only hook as a --require script, we can't really get the compile finished event to write back. So the compilation action just hangs.

Looking more at the minimal TS program we'd have to write as a wrapper, I see that TS introduced a public API for writing such a watcher, so it's better than when we first implemented tsc_wrapped at Google. See https://github.com/microsoft/TypeScript/blob/v3.9.7/src/compiler/watchPublic.ts
I think we can implement interface WatchFactory to make it depend only on reading stdin.
It won't be super-trivial though. For example I think we need to use the value of emitDeclarationOnly to choose between
ts.createEmitAndSemanticDiagnosticsBuilderProgram and ts.createSemanticDiagnosticsBuilderProgram - but do we also need to consult --noEmit to get this right? Will be really interesting to find just the minimal glue to put these parts together.

@github-actions
Copy link

This issue has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs in two weeks. Collaborators can add a "cleanup" or "need: discussion" label to keep it open indefinitely. Thanks for your contributions to rules_nodejs!

@github-actions github-actions bot added the Can Close? We will close this in 30 days if there is no further activity label Oct 17, 2020
@alexeagle alexeagle removed the Can Close? We will close this in 30 days if there is no further activity label Oct 17, 2020
@tomasdev
Copy link

Any update?

@tony-scio
Copy link
Contributor

The features on the 3.0.0 release implies this is fixed. But I'm not certain I'm understanding everything it entails.

typescript: worker mode for ts_project (#2136) (5d70997)

@alexeagle
Copy link
Collaborator Author

Yeah it's documented here
https://bazelbuild.github.io/rules_nodejs/TypeScript.html#ts_project-supports_workers
as experimental. Sorry I didn't update this issue sooner!

We still need feedback to help promote it to a stable API but should be in separate issue(s)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants