Cache auto-save 3/3: Expose watcher.unstable_autoSaveCache, default enabled #1434

Closed
robhogan wants to merge 4 commits

robhogan commented Feb 3, 2025

Pull Request resolved: #1434

Configures a new mechanism for the file map cache to re-save after changes made to the file system while Metro is running.

Currently, if you make a change while Metro is running, Metro re-processes (hashes) that file and updates an in-memory record, but doesn't save it. The next time Metro starts up, it processes that file again, and writes the cache at the end of startup.

In most cases, auto-saving provides a modest reduction in startup work and time, proportional to the number of changes in the previous Metro session.

However, it unlocks the potential of doing more processing lazily, i.e. we don't need to hash everything up front if we can hash only what we need for a bundle, and then save those hashes in a cache.

We'll use this to implement lazy hashing.

## Implementation

The auto-save mechanism is contained within `DiskCacheManager` via generic listening APIs. When the file map emits a `change` event to Metro, we start (or restart) a configurable debounce timer, and save the cache asynchronously when it fires.

The 5 second default is chosen so that we don't try to save while Metro may be busy, e.g. a change is likely to be followed by an HMR exchange over the next second or two.
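
The start/restart behaviour of the debounce timer can be sketched roughly as follows. This is an illustration only, not the actual `DiskCacheManager` code: `DebouncedSaver`, `notifyChange`, and the injectable `Scheduler` are hypothetical names, and the save is simplified to a synchronous callback. Only the 5 second default comes from the description.

```typescript
// Hypothetical sketch of a debounced auto-save; not Metro's real implementation.
type Cancel = () => void;
// Injectable scheduler so timing can be faked in tests (an assumption of this sketch).
type Scheduler = (fn: () => void, delayMs: number) => Cancel;

const timeoutScheduler: Scheduler = (fn, delayMs) => {
  const handle = setTimeout(fn, delayMs);
  return () => clearTimeout(handle);
};

class DebouncedSaver {
  private cancelPending: Cancel | null = null;
  private readonly save: () => void;
  private readonly debounceMs: number;
  private readonly schedule: Scheduler;

  constructor(
    save: () => void,
    debounceMs: number = 5000, // the 5 second default mentioned above
    schedule: Scheduler = timeoutScheduler,
  ) {
    this.save = save;
    this.debounceMs = debounceMs;
    this.schedule = schedule;
  }

  // Call on every change event: cancels any pending timer and starts a new one,
  // so a burst of changes produces one save after debounceMs of quiet.
  notifyChange(): void {
    if (this.cancelPending !== null) this.cancelPending();
    this.cancelPending = this.schedule(() => {
      this.cancelPending = null;
      this.save();
    }, this.debounceMs);
  }

  // Cancel any pending save, e.g. on shutdown.
  end(): void {
    if (this.cancelPending !== null) this.cancelPending();
    this.cancelPending = null;
  }
}
```

Injecting the scheduler is purely a convenience of this sketch so that the timing can be exercised without real timers.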

Changelog:

 - **[Experimental]**: Auto-save file cache via `config.watcher.unstable_autoSaveCache`
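
As an illustration, tuning this option from a Metro config might look like the sketch below. The `{enabled, debounceMs}` shape is an assumption based on the description above (a configurable debounce with a 5 second default); the Metro configuration docs are the authority on the exact shape.

```typescript
// Assumed shape of the experimental option; verify against the Metro config docs.
type AutoSaveCacheConfig = { enabled: boolean; debounceMs?: number };

const config: { watcher: { unstable_autoSaveCache: AutoSaveCacheConfig } } = {
  watcher: {
    unstable_autoSaveCache: {
      enabled: true,
      // Wait 5 seconds after the last change before re-saving the cache.
      debounceMs: 5000,
    },
  },
};

// In a real project this object would be exported from metro.config.js;
// here it is only printed.
console.log(JSON.stringify(config));
```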

Differential Revision: D69024198

@facebook-github-bot added the CLA Signed label Feb 3, 2025
@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D69024198

facebook-github-bot pushed a commit that referenced this pull request Feb 3, 2025
…nabled (#1434)

Summary: Pull Request resolved: #1434

Differential Revision: D69024198


Summary:
Change the internal `CacheManager` API to allow cache managers to lazily request data snapshots, rather than being provided with them directly.

This is:
 - Safe, thanks to previous work ensuring that cacheable data remains synchronously internally consistent (e.g. D67825634, D67762435).
 - More efficient for cache managers that have a no-op `write()`.
 - Motivated by the coming auto-save feature, where we'll explicitly allow subsequent calls to `getSnapshot()` to get the latest state after changes.

Also, replace `CacheDelta` with `CacheManagerWriteOptions` containing `changedSinceCacheRead`. In practice, consumers only check `delta.size > 0`, so providing the full delta is unnecessarily leaky, potentially restrictive in future, and a bit cumbersome to mock, etc.
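
A rough sketch of what the lazy snapshot API could look like is below. The type shapes are simplified and hypothetical (`CacheData` here is just a path-to-hash map, and the two manager classes are invented for illustration); only `getSnapshot()`, `CacheManagerWriteOptions`, and `changedSinceCacheRead` are names taken from the summary.

```typescript
// Hypothetical, simplified types; Metro's real CacheManager types differ in detail.
type CacheData = { files: Map<string, string> }; // path -> hash, for illustration

type CacheManagerWriteOptions = {
  // True if anything changed since the cache was last read.
  changedSinceCacheRead: boolean;
};

interface CacheManager {
  write(getSnapshot: () => CacheData, opts: CacheManagerWriteOptions): void;
}

// A manager with a no-op write() never calls getSnapshot(),
// so no snapshot is ever built on its behalf.
class NoopCacheManager implements CacheManager {
  write(): void {}
}

// A manager that persists only when something changed,
// requesting the snapshot lazily at that point.
class InMemoryCacheManager implements CacheManager {
  saved: CacheData | null = null;

  write(getSnapshot: () => CacheData, opts: CacheManagerWriteOptions): void {
    if (opts.changedSinceCacheRead) {
      this.saved = getSnapshot();
    }
  }
}
```

Passing a boolean rather than the full delta keeps the contract small: a manager only learns whether a write is worthwhile, which is all that consumers were checking in practice.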

Changelog: Internal

Reviewed By: huntie

Differential Revision: D68960958
…rs, call end()

Summary:
Add an `end()` method to `CacheManager`s, consistent with various other Metro classes, to allow (and require) it to clean up any internal resources (timers, I/O operations) during the parent `FileMap.end()`.

Pass two new arguments to `CacheManager.write()`:
 - `eventSource` - an interface that may be used to (un)subscribe to change events that *may* (for cache managers choosing to implement this feature) re-save the cache on changes. We'll use this in the next diff to implement auto-save.
 - `onWriteError` to report errors triggered within listeners to `eventSource`.
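
The interplay of `eventSource`, `onWriteError`, and `end()` might be sketched like this. The shapes are hypothetical: `onChange` returning an unsubscribe function and the `ResavingCacheManager` class are assumptions made for illustration, not Metro's actual types.

```typescript
// Hypothetical, simplified shapes; names mirror the description, not Metro's exact types.
type Unsubscribe = () => void;

interface CacheManagerEventSource {
  // Subscribe to change events; returns a function that unsubscribes.
  onChange(listener: () => void): Unsubscribe;
}

type WriteOptions = {
  eventSource: CacheManagerEventSource;
  onWriteError: (error: Error) => void;
};

class ResavingCacheManager {
  private unsubscribe: Unsubscribe | null = null;
  private readonly persist: () => void;

  constructor(persist: () => void) {
    this.persist = persist;
  }

  write(opts: WriteOptions): void {
    // The initial write reports failure to the caller by throwing as usual.
    this.persist();
    // Subscribe once so that later changes trigger a re-save.
    if (this.unsubscribe === null) {
      this.unsubscribe = opts.eventSource.onChange(() => {
        try {
          this.persist();
        } catch (e) {
          // Errors inside the listener have no caller to throw to,
          // so they are reported through onWriteError instead.
          opts.onWriteError(e instanceof Error ? e : new Error(String(e)));
        }
      });
    }
  }

  // Release the subscription; intended to be called from the parent's end().
  end(): void {
    if (this.unsubscribe !== null) this.unsubscribe();
    this.unsubscribe = null;
  }
}
```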

Changelog: Internal

Reviewed By: huntie

Differential Revision: D69005523
Summary:
Implement auto-saving the file map cache `debounceMs` after a file `'change'` event, so that the on-disk cache reflects a recent state even if there have been many changes since Metro last started.

This avoids Metro processing changes once during a session and then again at the start of the next session - optimising next session startup.

With this, we can implement lazy file processing on startup (potentially *much* faster startup), hashing files only as we need them to build a bundle. The next (warm) Metro session will then immediately have all of those hashes available, so we still get optimal warm builds.

Differential Revision: D69006204

@facebook-github-bot

This pull request has been merged in 0d39866.

@robhogan robhogan deleted the export-D69024198 branch February 4, 2025 14:31
robhogan added a commit that referenced this pull request Feb 10, 2025
)

Summary:
## Stack
In this stack we're moving towards metro-file-map being able to *lazily* compute file metadata - in particular the SHA1 hash - only when required by the transformer.

More context in #1325 (comment)

## Implementing config `watcher.unstable_lazySha1`
This diff introduces a new opt-in config that
 - Disables eager computation of `sha1` for all watched files.
 - Adds support in `Transformer` to accept a callback that asynchronously returns SHA1, and optionally file content.
 - Maintains support for the old sync API, for anyone using `Transformer` directly. This will likely be dropped in a coming major.
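
To illustrate the idea of a callback that asynchronously returns a SHA1 (and optionally file content), here is a minimal sketch. `LazySha1Store`, `getOrComputeSha1`, and the injected `readFile` are hypothetical names; this is not the `Transformer` API itself, only the lazy compute-and-remember pattern it relies on.

```typescript
import { createHash } from "node:crypto";

// Hypothetical sketch: names and shapes are illustrative, not Metro's Transformer API.
type ReadFile = (path: string) => Promise<string>;
type Sha1Result = { sha1: string; content?: string };

class LazySha1Store {
  private readonly known = new Map<string, string>();
  private readonly readFile: ReadFile;

  constructor(readFile: ReadFile) {
    this.readFile = readFile;
  }

  // Returns the remembered hash if there is one; otherwise reads and hashes
  // the file now. Content is only returned when the file had to be read anyway.
  async getOrComputeSha1(path: string): Promise<Sha1Result> {
    const cached = this.known.get(path);
    if (cached !== undefined) {
      return { sha1: cached };
    }
    const content = await this.readFile(path);
    const sha1 = createHash("sha1").update(content).digest("hex");
    this.known.set(path, sha1);
    return { sha1, content };
  }
}
```

In this sketch, files that are never requested are never read or hashed. Persisting the remembered hashes, as the auto-saving cache does, would let a later session skip the read as well.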

Along with the already-landed, default-on [auto-saving cache](#1434), this should provide an order of magnitude[1] faster startup on large projects, with no compromise to warm build perf, and very little slowdown in cold builds in most cases[2].

[1] Metro needs to watch file subtrees, but typically only a small proportion of those files are used in a build. By hashing up front, we can spend up to several minutes hashing files that will never be used.

[2] Cold file caches with warm transform caches - typically only when using a remote cache - may be observably slower due to the need to read and hash a file that wouldn't otherwise need to be read, though this still only moves the cost from startup to build. For truly cold builds, this change adds SHA1 computation time to transform time, but requires no additional IO. SHA1 computation is typically much faster than Babel transformation, and we might consider faster algorithms in future (SHA1 is Eden-native).

Pull Request resolved: #1435

Changelog:
```
 - **[Experimental]**: Add `watcher.unstable_lazySha1` to defer SHA1 calculation until files are needed by the transformer
```

Differential Revision: D69373618
facebook-github-bot pushed a commit that referenced this pull request Feb 15, 2025
robhogan added a commit that referenced this pull request Feb 15, 2025
robhogan added a commit that referenced this pull request Feb 16, 2025
facebook-github-bot pushed a commit that referenced this pull request Feb 16, 2025
robhogan added a commit that referenced this pull request Feb 22, 2025
facebook-github-bot pushed a commit that referenced this pull request Feb 22, 2025
Reviewed By: GijsWeterings

Differential Revision: D69373618

fbshipit-source-id: 2c67e0710b678e491b3760f5d235b4767339d841