
meta_framegap.lua deopt crash #1571

Closed
vext01 opened this issue Jan 29, 2025 · 5 comments · Fixed by #1582 · May be fixed by #1593
vext01 commented Jan 29, 2025

With today's yklua (e99ff2) and yk (3dc428), an OpenResty test causes a deopt crash:

$ YK_OPT=0 YKD_SERIALISE_COMPILATION=1 ~/research/yklua/src/lua luajit2-test-suite/test/misc/meta_framegap.lua
thread '<unnamed>' panicked at ykrt/src/compile/jitc_yk/codegen/x64/deopt.rs:40:30:
no entry found for key

To get the test file, you can use the branch from this PR

(Once it's merged, you can just use main)

ltratt commented Jan 30, 2025

Now that is a very interesting error: most obviously, it shouldn't ever be possible to reference a guard that doesn't exist, but the most likely way this sort of bug would occur is that we pick up the wrong guard. The fact that we hit a guard that doesn't exist will make fixing this much easier. Does the bug happen deterministically? If so, this is a wonderful shrinkray candidate!

vext01 commented Jan 30, 2025

It's very deterministic, so I have high hopes of reducing it and tracking down the cause.

vext01 commented Jan 31, 2025

More info on this crash.

It is 100% deterministic with YK_OPT=0 YKD_SERIALISE_COMPILATION=1 which helps.

There are no side-traces involved. There are two traces executed, call them T1 and T2.

T1 has 83 deopt exits. T2 has only 11.

T1 is executed once at the beginning, then only T2 is executed after that (many times).

Here is a reduced test case:

A = setmetatable({}, {
	__add = function(A, B)
		if B > 200 then
			for B = 1, B do
			end
		end
	end,
})
function C(A, B)
	do
		return A + B
	end
end
for B = 1, 300 do
	C(A, B)
end

Prints in __yk_guardcheck() suggest the crash is caused by looking up guard index 56 in a trace (T1) whose guard indices are 0..10:

guard index 56
ctr has these guard indices: [4, 0, 7, 10, 1, 2, 8, 9, 3]

Looking at the jit-asm confirms 56 is well out of bounds, but there is another compiled trace for which that would be a valid index (i.e. T2). This made me question whether the thread's idea of the "currently running trace" is correct. Is it using a guard index for T2 when running T1?
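For what it's worth, the panic message is exactly what Rust's standard `HashMap` `Index` impl emits when a key is absent. A minimal, hypothetical sketch (invented names, not yk's real types) of how indexing one trace's deopt table with a guard id belonging to another trace blows up:

```rust
use std::collections::HashMap;

// Hypothetical sketch: each compiled trace owns a map from guard index to
// deopt metadata. Indexing that map with a guard id that belongs to a
// *different* trace panics with the message seen in the crash.
struct CompiledTrace {
    deopt_info: HashMap<usize, &'static str>,
}

impl CompiledTrace {
    fn deopt(&self, gidx: usize) -> &'static str {
        // `HashMap`'s `Index` impl panics with "no entry found for key"
        // if `gidx` is absent from the map.
        self.deopt_info[&gidx]
    }
}

fn main() {
    let t1 = CompiledTrace {
        deopt_info: [(0, "g0"), (8, "g8"), (10, "g10")].into_iter().collect(),
    };
    assert_eq!(t1.deopt(8), "g8"); // a guard id this trace owns: fine
    // `t1.deopt(56)` would panic: 56 is only valid for some other trace.
}
```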

Putting more prints around and pumping up the logging, I can see something else fishy. A common pattern in the output is:

...
yk-jit-event: enter-jit-code
set a running trace: 0x7fa1851d9490
root is None
__yk_guardcheck()
ctr addr: 0x7fa1851d9490
guard index 8
ctr has these guard indices: [4, 0, 7, 10, 1, 2, 8, 9, 3]
yk-jit-event: deoptimise
replace stack
control point
yk-jit-event: enter-jit-code
...

That all makes sense to me.

But when we crash, we deviate from this pattern:

...
yk-jit-event: enter-jit-code
set a running trace: 0x7fa1851d9490
root is None
__yk_guardcheck()
ctr addr: 0x7fa1851d9490
guard index 8
ctr has these guard indices: [4, 0, 7, 10, 1, 2, 8, 9, 3]
yk-jit-event: deoptimise
replace stack
__yk_guardcheck()      <--------------- How does this happen????
ctr addr: 0x7fa1851d9490
guard index 56
ctr has these guard indices: [4, 0, 7, 10, 1, 2, 8, 9, 3]

thread '<unnamed>' panicked at ykrt/src/compile/jitc_yk/codegen/x64/deopt.rs:44:30:
no entry found for key
...

After replace stack we somehow end up in __yk_guardcheck() (which you'd only expect to be called from JITted code), but we've not seen a control point or enter-jit-code since we last deopted, which doesn't match my mental model of how the system should work. It looks to me like the system should be in the interpreting state.

A further reduced input is:

A = setmetatable({}, {
	__add = function(A, B)
		if B > 00 then
			for B = 0, B do
			end
		end
	end,
})
function C(A, B)
	do
		return A + B
	end
end
for B = 0, 200 do
	C(A, B)
end

This one more often than not gives a different error:

thread '<unnamed>' panicked at ykrt/src/compile/jitc_yk/trace_builder.rs:433:69:
no entry found for key

(It must also give the original error from before, or it wouldn't have passed the interestingness test. The test tries 500 times to get the deopt.rs:40 panic, requiring it to show only once for the input to count as interesting.)

Diff between those two inputs:

--- meta_framegap.lua	Fri Jan 31 15:45:00 2025
+++ meta_framegap_werid.lua	Fri Jan 31 15:56:38 2025
@@ -1,7 +1,7 @@
 A = setmetatable({}, {
 	__add = function(A, B)
-		if B > 200 then
-			for B = 1, B do
+		if B > 00 then
+			for B = 0, B do
 			end
 		end
 	end,
@@ -11,6 +11,6 @@
 		return A + B
 	end
 end
-for B = 1, 300 do
+for B = 0, 200 do
 	C(A, B)
 end

vext01 commented Feb 3, 2025

Here's what's happening:

A trace, T1, is compiled and executed, which sets the tstate to Executing. The trace contains a call to luaV_execute (this is likely Lua interpreting a call to a Lua function). We then compile another trace inside this nested luaV_execute and execute it, which sets the tstate to Executing again. The stack now looks something like this:

  • luaV_execute (1)
  • trace_1
  • luaV_execute (2)
  • trace_2

We then deopt from trace_2 back into luaV_execute (2). Note that it appears we DO NOT set tstate back to Interpreting (should we?): the state remains Executing. We then return from luaV_execute (2) back into trace_1. At this point the running trace recorded in tstate = Executing is still trace_2, because we haven't updated tstate. Then a guard fails in trace_1, which looks up the running trace, gets trace_2, and bad things happen.

Question 1) Should we allow compiling traces from within luaV_execute calls that are called from trace?

Question 2) If we do, and our feeling is we do, then how do we make sure that the running trace is always up to date? A simple fix would be to keep a vector, pushing to it when entering a trace and popping when deopting. We probably also want to set tstate back to Interpreting at deopt time.
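The "vector of running traces" idea could be sketched like this (a hedged sketch with invented names; plain `usize` handles stand in for real trace references):

```rust
// Sketch of tracking the running trace as a stack rather than a single slot.
struct ThreadState {
    // Innermost running trace is the top of the stack; empty = interpreting.
    running: Vec<usize>,
}

impl ThreadState {
    fn enter_trace(&mut self, trace: usize) {
        self.running.push(trace);
    }
    fn deopt(&mut self) -> Option<usize> {
        // Popping restores the *outer* trace as "currently running", so a
        // later guard failure in trace_1 no longer sees trace_2.
        self.running.pop()
    }
    fn running_trace(&self) -> Option<usize> {
        self.running.last().copied()
    }
}

fn main() {
    let mut ts = ThreadState { running: Vec::new() };
    ts.enter_trace(1); // trace_1 starts
    ts.enter_trace(2); // nested trace_2 starts inside luaV_execute (2)
    ts.deopt();        // trace_2 deopts...
    // ...and trace_1 is correctly the running trace again:
    assert_eq!(ts.running_trace(), Some(1));
}
```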

vext01 commented Feb 4, 2025

Small test case for this:

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <yk.h>
#include <yk_testing.h>

void f(YkMT *mt, int who, YkLocation *loc1, YkLocation *loc2, int i) {
  fprintf(stderr, "enter\n");
  while (i > 0) {
    yk_mt_control_point(mt, loc1);
    if (who) {
      if (i == 1) {
        fprintf(stderr, "a\n");
      }
      if (i == 2) {
        fprintf(stderr, "b\n");
      }
      if (i == 3) {
        fprintf(stderr, "c\n");
      }
    }
    fprintf(stderr, "%d\n", i);
    i -= 1;
    if (loc2 != NULL) {
      f(mt, 0, loc2, NULL, i);
    }
  }
  fprintf(stderr, "exit\n");
}

int main(int argc, char **argv) {
  YkMT *mt = yk_mt_new(NULL);
  yk_mt_hot_threshold_set(mt, 0);
  YkLocation loc1 = yk_location_new();
  YkLocation loc2 = yk_location_new();
  int i = 6;
  NOOPT_VAL(loc1);
  NOOPT_VAL(loc2);
  NOOPT_VAL(i);
  f(mt, 1, &loc1, &loc2, i);
  yk_location_drop(loc1);
  yk_location_drop(loc2);
  yk_mt_shutdown(mt);
  return (EXIT_SUCCESS);
}

ltratt added a commit to ltratt/yk that referenced this issue Feb 4, 2025
Previously we -- well, this one is entirely my fault, so "I"! -- tracked
per-thread meta-tracing state as a single `MTThreadState` that we updated as
necessary. This doesn't work when we have nested execution, tracing and
the like, as the bug in ykjit#1571 highlighted.

This commit moves us to a stack of `MTThreadState`s. The basic idea is
that the stack always has at least one element: `Interpreting`. As we go
through other states, we push / pop as appropriate. The test below (from
Edd and Lukas) fails on `master` but is fixed by this commit.

The implementation is a bit more awkward than one might hope as naive implementations either:

  1. Spread lots of knowledge about the stack around the code. That's a
     disaster waiting to happen.
  2. Run into borrow checker problems.

This commit gets around this in two phases:

  1. We pass closures to `MTThread` which can peek at the current
     `MTThreadState` (which isn't `Copy`!).
  2. Those closures return the "things" we need to update outside that
     context.

This is a bit awkward, and perhaps there's a better API, but this one is
at least safe.

Co-authored-by: Edd Barrett <[email protected]>
Co-authored-by: Lukas Diekmann <[email protected]>
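The closure-based API the commit message describes could look roughly like this (an illustrative sketch; all details beyond the names `MTThread`/`MTThreadState` are invented and the real yk code differs):

```rust
use std::cell::RefCell;

// Sketch: a stack of per-thread states whose bottom entry is always
// `Interpreting`, peeked at via closures so no reference to the
// non-`Copy` state escapes the borrow.
#[derive(Debug, PartialEq)]
enum MTThreadState {
    Interpreting,
    Executing { trace: usize },
}

struct MTThread {
    states: RefCell<Vec<MTThreadState>>,
}

impl MTThread {
    fn new() -> Self {
        MTThread { states: RefCell::new(vec![MTThreadState::Interpreting]) }
    }

    // Callers pass a closure that peeks at the current (top) state; the
    // closure returns the "things" the caller needs, outside the borrow.
    fn with_top<R>(&self, f: impl FnOnce(&MTThreadState) -> R) -> R {
        let states = self.states.borrow();
        f(states.last().unwrap())
    }

    fn push(&self, s: MTThreadState) {
        self.states.borrow_mut().push(s);
    }

    fn pop(&self) {
        let mut states = self.states.borrow_mut();
        // Never pop the base `Interpreting` entry.
        if states.len() > 1 {
            states.pop();
        }
    }
}

fn main() {
    let mtt = MTThread::new();
    mtt.push(MTThreadState::Executing { trace: 1 });
    let t = mtt.with_top(|s| match s {
        MTThreadState::Executing { trace } => Some(*trace),
        MTThreadState::Interpreting => None,
    });
    assert_eq!(t, Some(1));
    mtt.pop();
    assert!(mtt.with_top(|s| *s == MTThreadState::Interpreting));
}
```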
ltratt added a commit to ltratt/yk that referenced this issue Feb 5, 2025