diff --git a/doc/design/overview.md b/doc/design/overview.md
index 2704205..f48d053 100644
--- a/doc/design/overview.md
+++ b/doc/design/overview.md
@@ -1,6 +1,8 @@
 # Loaders Design
 
-There are currently [three loader hooks](https://github.com/nodejs/node/tree/master/doc/api/esm.md#hooks):
+There are currently the following [loader hooks](https://github.com/nodejs/node/blob/HEAD/doc/api/esm.md#hooks):
+
+## Basic hooks
 
 1. `resolve`: Takes a specifier (the string after `from` in an `import` statement) and converts it into an URL to be loaded.
 
@@ -11,7 +13,21 @@ There are currently [three loader hooks](https://github.com/nodejs/node/tree/mas
    * `json` (with `--experimental-json-modules`)
    * `wasm` (with `--experimental-wasm-modules`)
 
-* `globalPreload`: Defines a string of JavaScript to be injected into the application global scope.
+## Filesystem hooks
+
+The Node resolution algorithms may rely on various filesystem operations in order to return definite answers. For example, in order to know whether the package `foo` resolves to `/path/to/foo/index.js`, one must first check the [`exports` field](https://nodejs.org/api/packages.html#exports) located in `/path/to/foo/package.json`. Similarly, a loader that adds support for import maps needs to know how to retrieve those import maps in the first place.
+
+While this is fairly easy when operating with the traditional filesystem (one could just use the `fs` module), things get trickier when you consider that loaders may also have to deal with other data sources. For instance, a loader that imports files directly from the network (similar to how Deno operates) would be unable to leverage `fs` to access the `package.json` content for the remote packages. The same applies when package data is kept within archives that require special support for access (as both Electron and Yarn do).
+
+To facilitate such interactions, loaders are given the ability to override the basic filesystem operations used by the Node resolution helpers. This way, they can remain blissfully unaware of the underlying data source (filesystem or network or otherwise) and focus on the part of the resolution they care about.
+
+1. `statFile`: Takes the resolved URL and returns its [`fs.Stats` record](https://nodejs.org/api/fs.html#class-fsstats) (or `null` if it doesn't exist).
+
+1. `readFile`: Takes the resolved URL and returns its binary content (or `null` if it doesn't exist).
+
+## Advanced hooks
+
+1. `globalPreload`: Defines a string of JavaScript to be injected into the application global scope.
 
 ## Chaining
 
diff --git a/doc/design/proposal-chaining-iterative.md b/doc/design/proposal-chaining-iterative.md
index b3d75f3..87985e2 100644
--- a/doc/design/proposal-chaining-iterative.md
+++ b/doc/design/proposal-chaining-iterative.md
@@ -282,3 +282,53 @@ const babelOutputToFormat = new Map([
 ]);
 ```
 
+
+## Chaining `readFile` hooks
+
+Say you had a chain of three loaders:
+
+* `zip` adds a virtual filesystem layer for in-zip access
+* `tgz` does the same but for tgz archives
+* `https` allows querying packages through the network.
+
+Following the pattern of `--require`:
+
+```console
+node \
+  --loader zip \
+  --loader tgz \
+  --loader https
+```
+
+These would be called in the following sequence:
+
+(`zip` OR `defaultReadFile`) → `tgz` → `https`
+
+1. `defaultReadFile` / `zip` needs to be first to know whether the file exists on the actual filesystem, which is fed to the subsequent loader
+1. `tgz` receives the raw source from the previous loader and, if necessary, checks for the file's existence via its own rules
+1. `https` does the same thing
+
+ReadFile hooks would have the following signature:
+
+```ts
+export async function readFile(
+  url: string,        // A URL pointing to a location; whether the file
+                      // exists or not isn't guaranteed
+  interimResult: {    // result from the previous hook
+    data: string | ArrayBuffer | TypedArray | null, // The content of the
+                      // file, or `null` if it doesn't exist.
+  },
+  context: {
+    conditions: string[], // Export conditions of the relevant package.json
+  },
+  defaultReadFile: function, // Node's default `readFile` hook
+): {
+  signals?: {         // Signals from this hook to the ESMLoader
+    contextOverride?: object, // A new `context` argument for the next hook
+    interimIgnored?: true,    // interimResult was intentionally ignored
+    shortCircuit?: true,      // `readFile` chain should be terminated
+  },
+  data: string | ArrayBuffer | TypedArray | null, // The content of the
+                      // file, or `null` if it doesn't exist.
+} {
+```
diff --git a/doc/design/proposal-chaining-middleware.md b/doc/design/proposal-chaining-middleware.md
index 41a481c..8839b58 100644
--- a/doc/design/proposal-chaining-middleware.md
+++ b/doc/design/proposal-chaining-middleware.md
@@ -263,3 +263,42 @@ export async function load(
 }
 ```
 
+
+## Chaining `readFile` hooks
+
+Say you had a chain of three loaders:
+
+* `zip` adds a virtual filesystem layer for in-zip access
+* `tgz` does the same but for tgz archives
+* `https` allows querying packages through the network
+
+Following the pattern of `--require`:
+
+```console
+node \
+  --loader zip \
+  --loader tgz \
+  --loader https
+```
+
+These would be called in the following sequence: `zip` calls `tgz`, which calls `https`. Or in JavaScript terms, `zip(tgz(https(input)))`:
+
+ReadFile hooks would have the following signature:
+
+```ts
+export async function readFile(
+  url: string,        // A URL pointing to a location; whether the file
+                      // exists or not isn't guaranteed
+  context: {
+    conditions: string[], // Export conditions of the relevant `package.json`
+  },
+  next: function,     // The subsequent `readFile` hook in the chain,
+                      // or Node’s default `readFile` hook after the
+                      // last user-supplied `readFile` hook
+): {
+  data: string | ArrayBuffer | TypedArray | null, // The content of the
+                      // file, or `null` if it doesn't exist.
+  shortCircuit?: true, // A signal that this hook intends to terminate
+                       // the chain of `readFile` hooks
+} {
+```
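
As a rough illustration of the middleware-style `readFile` chain proposed in this diff, the sketch below composes hooks so that the first loader listed is the outermost call. It is not Node's implementation: `composeReadFile`, `inMemoryHook`, the type names, the `zip:///` URL and the stand-in `defaultReadFile` are all hypothetical names invented for this example, and `Uint8Array` stands in for `TypedArray`.

```ts
// Hypothetical types for this sketch; none of these names are part of Node's API.
type FileData = string | ArrayBuffer | Uint8Array | null;

interface ReadFileContext {
  conditions: string[];
}

interface ReadFileResult {
  data: FileData;
  shortCircuit?: true;
}

type NextReadFile = (
  url: string,
  context: ReadFileContext,
) => Promise<ReadFileResult>;

type ReadFileHook = (
  url: string,
  context: ReadFileContext,
  next: NextReadFile,
) => Promise<ReadFileResult>;

// Wrap the hooks from last to first so the first hook ends up outermost,
// i.e. [zip, tgz, https] behaves like zip(tgz(https(input))).
function composeReadFile(
  hooks: ReadFileHook[],
  defaultReadFile: NextReadFile,
): NextReadFile {
  return hooks.reduceRight<NextReadFile>(
    (next, hook) => (url, context) => hook(url, context, next),
    defaultReadFile,
  );
}

// Assumed stand-in for Node's default hook: pretend nothing exists on disk.
const defaultReadFile: NextReadFile = async () => ({ data: null });

// Toy hook factory: serves URLs from an in-memory map, otherwise defers to `next`.
function inMemoryHook(files: Map<string, string>): ReadFileHook {
  return async (url, context, next) => {
    const data = files.get(url);
    if (data !== undefined) {
      // This hook answers by itself, so it signals that the chain stops here.
      const result: ReadFileResult = { data, shortCircuit: true };
      return result;
    }
    return next(url, context);
  };
}

const zip = inMemoryHook(
  new Map([["zip:///pkg/package.json", '{"name":"from-zip"}']]),
);
const https = inMemoryHook(
  new Map([["https://example.com/pkg/package.json", '{"name":"from-https"}']]),
);

const readFile = composeReadFile([zip, https], defaultReadFile);
```

Calling `readFile("https://example.com/pkg/package.json", { conditions: ["import"] })` would pass through `zip` first, which defers to `https`. A URL that no hook recognizes falls through to the default and yields `data: null`.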