[compiler] useMemo calls directly induce memoization blocks #30177

mvitousek · 2024-07-02T01:02:11Z

Stack from ghstack (oldest at bottom):

[compiler][ez] Rename disableMemoizationForDebugging to just disableMemoization #30191
[compiler] Add wrapper functions to wrap change-detection storage and loading from the memo cache #30181
[compiler] Fewer assumptions about nonmutability when change detection enabled #30180
[compiler] Use dependencies from source for useMemo scopes #30179
[compiler] Drop useMemos in memoization disabled mode, don't bail from source-level memo blocks #30178
-> [compiler] useMemo calls directly induce memoization blocks #30177

Summary: To support the always-bailing-out and change-detection modes for the compiler, and to potentially support end-user codebases in some situations, we previously built a mode where user-level useMemos weren't dropped. This, however, results in codegen that is quite vastly different from the compiler's default mode, and which is likely to exercise different bugs.

This diff introduces a new mode that attempts to preserve user-level memoization in a way that is more compatible with the normal output of the compiler, dropping the literal useMemo calls and producing reactive scopes. The result of this is different from the existing @ensurePreserveMemoizationGuarantees in that the reactive scope is marked as originating from a useMemo, and cannot be merged with other memoization blocks, and that some operations are memoized that are not necessarily memoized today: specifically, obj.x and f(). This is to account for the fact that current useMemo calls may call non-idempotent functions inside useMemo--this is a violation of React's rules and is unsupported, but this mode attempts to support this behavior to make the compiler's behavior as close to user-level behavior as possible.

We build the user-level reactive scopes by simply adding all memoizable instructions between StartMemo and FinishMemo to their own reactive scope, possibly overwriting an existing scope. We do so before the scopes have been populated with dependencies or outputs so those passes can operate over these new scopes as normal.

Summary: To support the always-bailing-out and change-detection modes for the compiler, and to potentially support end-user codebases in some situations, we previously built a mode where user-level useMemos weren't dropped. This, however, results in codegen that is quite vastly different from the compiler's default mode, and which is likely to exercise different bugs. This diff introduces a new mode that attempts to preserve user-level memoization in a way that is more compatible with the normal output of the compiler, dropping the literal useMemo calls and producing reactive scopes. The result of this is different from the existing @ensurePreserveMemoizationGuarantees in that the reactive scope is marked as originating from a useMemo, and cannot be merged with other memoization blocks, and that some operations are memoized that are not necessarily memoized today: specifically, `obj.x` and `f()`. This is to account for the fact that current useMemo calls may call non-idempotent functions inside useMemo--this is a violation of React's rules and is unsupported, but this mode attempts to support this behavior to make the compiler's behavior as close to user-level behavior as possible. We build the user-level reactive scopes by simply adding all memoizable instructions between `StartMemo` and `FinishMemo` to their own reactive scope, possibly overwriting an existing scope. We do so before the scopes have been populated with dependencies or outputs so those passes can operate over these new scopes as normal. [ghstack-poisoned]

vercel · 2024-07-02T01:02:13Z

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name	Status	Preview	Comments	Updated (UTC)
react-compiler-playground	✅ Ready (Inspect)	Visit Preview	💬 Add feedback	Jul 2, 2024 5:09pm

react-sizebot · 2024-07-02T01:07:04Z

Comparing: 100dfd7...adc8705

Critical size changes

Includes critical production bundles, as well as any change greater than 2%:

Name	+/-	Base	Current	+/- gzip	Base gzip	Current gzip
oss-stable/react-dom/cjs/react-dom.production.js	=	6.68 kB	6.68 kB	+0.05%	1.83 kB	1.83 kB
oss-stable/react-dom/cjs/react-dom-client.production.js	=	497.99 kB	497.99 kB	=	89.27 kB	89.27 kB
oss-experimental/react-dom/cjs/react-dom.production.js	=	6.69 kB	6.69 kB	+0.05%	1.83 kB	1.83 kB
oss-experimental/react-dom/cjs/react-dom-client.production.js	=	502.81 kB	502.81 kB	=	89.97 kB	89.97 kB
facebook-www/ReactDOM-prod.classic.js	=	597.08 kB	597.08 kB	=	105.33 kB	105.33 kB
facebook-www/ReactDOM-prod.modern.js	=	571.42 kB	571.42 kB	=	101.27 kB	101.27 kB
test_utils/ReactAllWarnings.js	Deleted	62.88 kB	0.00 kB	Deleted	15.69 kB	0.00 kB

Significant size changes

Includes any change greater than 0.2%:

Expand to show

Name	+/-	Base	Current	+/- gzip	Base gzip	Current gzip
test_utils/ReactAllWarnings.js	Deleted	62.88 kB	0.00 kB	Deleted	15.69 kB	0.00 kB

Generated by 🚫 dangerJS against adc8705

Summary: To support the always-bailing-out and change-detection modes for the compiler, and to potentially support end-user codebases in some situations, we previously built a mode where user-level useMemos weren't dropped. This, however, results in codegen that is quite vastly different from the compiler's default mode, and which is likely to exercise different bugs. This diff introduces a new mode that attempts to preserve user-level memoization in a way that is more compatible with the normal output of the compiler, dropping the literal useMemo calls and producing reactive scopes. The result of this is different from the existing ensurePreserveMemoizationGuarantees in that the reactive scope is marked as originating from a useMemo, and cannot be merged with other memoization blocks, and that some operations are memoized that are not necessarily memoized today: specifically, `obj.x` and `f()`. This is to account for the fact that current useMemo calls may call non-idempotent functions inside useMemo--this is a violation of React's rules and is unsupported, but this mode attempts to support this behavior to make the compiler's behavior as close to user-level behavior as possible. We build the user-level reactive scopes by simply adding all memoizable instructions between `StartMemo` and `FinishMemo` to their own reactive scope, possibly overwriting an existing scope. We do so before the scopes have been populated with dependencies or outputs so those passes can operate over these new scopes as normal. [ghstack-poisoned]

mofeiZ · 2024-07-08T23:46:20Z

compiler/packages/babel-plugin-react-compiler/src/ReactiveScopes/MemoizeExistingUseMemos.ts

+    } else if (instruction.value.kind === "FinishMemoize") {
+      currentScope = null;
+    } else if (currentScope != null) {
+      const operands = collectMutableOperands(fn, instruction, true);


This logic overall makes sense but I'm a bit confused here -- why do we want to collect mutable or allocating instructions here instead of assigning a scope to all lvalues / operands. Similarly, when would we not want to create a reactive scope to represent the manual memo (i.e. the lazy scope creation logic)?

I can't imagine anyone writing this code, but I think this mode should probably preserve its useMemo semantics (playground link)

// @enablePreserveExistingManualUseMemo let myVar = 2; function increment() { "use no forget"; myVar += 1; } function useFoo() { useEffect(increment); const res = useMemo(() => myVar, []); return res; }

mofeiZ

Overall makes sense!

Short summary of some offline discussions: this mode will let us enable the compiler in "ultra safe" mode for initial testing and validation. This mode preserves all existing memo and maybe doesn't even insert new memo. This means:

This means we can start compiling in runtime validation before/without starting rollout for devX and testing. For current rollouts, both compiler on+off builds can now get runtime validation like hook guards or change detection.
This also lets us use a much bigger set of tests to sanity check the compiler. If we're preserving existing memo and adding no new memo, the compiler should essentially be the identity transform (and nothing should break).

As a followup, we probably should freeze / end the mutable range of useMemo dependencies and declarations (see playground example)

function useFoo() {
  const myVar = {};
  const res = useMemo(() => myVar.length, [myVar]);
  
  // myVar should be inferred as a read here
  maybeMutate(myVar);
  return res;
}

Note that this is not just important for avoiding bailouts but also to make sure these useMemos get their own (~~non-nested~~; non-overlapping) reactive scope ranges. It might also be worth writing a pass that runs after all the scope merging passes to ensure that source scope ranges never overlap with other scope ranges (Disregard, nesting source scopes within other scopes is correct as the source memo is still retained)

Here is another example why we might want to end mutable ranges of scope dependencies before the scope begins. The implementation in this PR derives the mutable range of the newly created scope to be {min(scopeOperands.map(o => o.mutableRange.start), max(scopeOperands.map(o => o.mutableRange.end)} (without freezing deps).

This means that the newly created scopes may begin before the useMemo. If we expect to use this mode as an identity transform (i.e. creating only source scopes), this doesn't quite work because the new memo scopes will include additional instructions.
If the newly created scopes overlap with other scopes (overlap := overlapping but not nested mutable ranges), MergeOverlappingScopes must merge it to produce valid scope blocks.

function useFoo(a) {
  // here, x gets assigned scope 0 with range (0, 2) by InferReactiveScopeVars.
0:  const x = { a }; 
  // and y gets assigned scope 1 with range (0, 2) by MemoizeExistingUseMemos.
  // note that the resulting output has scopes 0 and 1 merged because they have
  // identical mutable ranges
1:  const y = useMemo(() => foo(x), []);
2:  return y;
}

As a sidenote, the PromoteUsedTemporaries pass (can't think of any others right now but perhaps @gsathya can correct) relies on dependencies to be valid to produce valid programs. This should be fine as useMemo operands shouldn't consume rvalues ("unnamed" identifiers) from prior instructions. We should also just rewrite this logic anyways (see bug repro), but wanted to note that manually manipulating scopes might make other passes / existing invariants more complex to reason about

mofeiZ · 2024-07-09T00:11:58Z

compiler/packages/babel-plugin-react-compiler/src/HIR/HIR.ts

@@ -1429,6 +1429,8 @@ export type ReactiveScope = {
  merged: Set<ScopeId>;

  loc: SourceLocation;
+
+  source: boolean;


We probably should check source in all passes that act on scopes.

// flattenScopesWithHooksOrUse function useFoo() { // Assume that useARandomFunction is not a hook const res = useMemo(() => useARandomFunction(), []); return res; }

// flattenReactiveLoops function useFoo() { const res = [] for (let i = 0; i < 5; i++) { res.push(useMemo(() => foo(), [])); } return res; }

(etc)

josephsavona

overall makes sense, but i have similar questions to @mofeiZ about the details of assigning the scopes.

josephsavona · 2024-07-11T03:06:58Z

compiler/packages/babel-plugin-react-compiler/src/HIR/HIR.ts

@@ -1429,6 +1429,8 @@ export type ReactiveScope = {
  merged: Set<ScopeId>;

  loc: SourceLocation;
+
+  source: boolean;


This name feels a bit too abbreviated to be immediately obvious what it means (when looking at a usage site in particular). Maybe fromManualMemoization or something descriptive?

josephsavona · 2024-07-11T03:14:39Z

compiler/packages/babel-plugin-react-compiler/src/ReactiveScopes/MemoizeExistingUseMemos.ts

+  let currentScope = scope;
+  for (const instruction of block.instructions) {
+    if (instruction.value.kind === "StartMemoize") {
+      currentScope = { kind: "pending", id: nextId() };


let's add an assertion that currentScope == null, since useMemo can't be nested

josephsavona · 2024-07-11T03:23:59Z

.../babel-plugin-react-compiler/src/ReactiveScopes/MergeReactiveScopesThatInvalidateTogether.ts

+  if (current.scope.source !== next.scope.source) {
+    return false;
+  }


hmm, this seems like it would allow merging consecutive manual memoization units, is that what we want? i would have expected we'd want to leave them alone, return false if source is true for current/next

josephsavona · 2024-07-11T03:28:24Z

compiler/packages/babel-plugin-react-compiler/src/ReactiveScopes/MemoizeExistingUseMemos.ts

+let ctr = 0;
+function nextId(): number {
+  return ctr++;
+}


can we make this not a module-global, so that the count resets? Or even better, it seems like we should be able to just get env.nextScopeId as soon as we encounter a StartMemoize instruction, since we know that will map 1:1 with the scopes we create?

@mofeiZ

Summary: This work enhances the don't-drop-memoization mode to not memoize the arguments to useMemo and useCallback (or technically, to memoize them in the same block as the hook call -- but since hooks can't be memoized, this is equivalent to never memoizing them). The advantage here is that we preserve the source-level code structure around memoization callbacks more, and avoid the compiler bug of having function expression dependencies be unconditional when their actual semantics are conditional. This conceptually replaces some of the work of #30177: some of that was motivated by avoiding the deps bug, and the rest of the motivation was to better exercise the compiler's semantics, which was really happening as much as I'd hoped, as @mofeiZ explained to me in that PR and offline! [ghstack-poisoned]

…emos, don't memoize arguments" Summary: This work enhances the don't-drop-memoization mode to not memoize the arguments to useMemo and useCallback (or technically, to memoize them in the same block as the hook call -- but since hooks can't be memoized, this is equivalent to never memoizing them). The advantage here is that we preserve the source-level code structure around memoization callbacks more, and avoid the compiler bug of having function expression dependencies be unconditional when their actual semantics are conditional. This conceptually replaces some of the work of #30177: some of that was motivated by avoiding the deps bug, and the rest of the motivation was to better exercise the compiler's semantics, which was really happening as much as I'd hoped, as mofeiZ explained to me in that PR and offline! [ghstack-poisoned]

…ize arguments" Summary: This work enhances the don't-drop-memoization mode to not memoize the arguments to useMemo and useCallback (or technically, to memoize them in the same block as the hook call -- but since hooks can't be memoized, this is equivalent to never memoizing them). The advantage here is that we preserve the source-level code structure around memoization callbacks more, and avoid the compiler bug of having function expression dependencies be unconditional when their actual semantics are conditional. This conceptually replaces some of the work of #30177: some of that was motivated by avoiding the deps bug, and the rest of the motivation was to better exercise the compiler's semantics, which was really happening as much as I'd hoped, as mofeiZ explained to me in that PR and offline! [ghstack-poisoned]