x/mobile: stack traces are wrong on iOS and Android #22716
Comments
According to @eliasnaur at #20392 (comment) it could be due to missing frame pointers. There is a CL to add them to arm64 at https://go-review.googlesource.com/c/go/+/61511, however, it doesn't seem to solve the issue. See https://groups.google.com/d/topic/golang-dev/v1TxCJNemPY/discussion |
I spent some time looking at this, on Android this time. As expected, however, frame pointers do make the local variables visible to the debugger. |
In the case of a CGo crash, however, the stack trace seems ok, although truncated to |
Reading // systemstack runs fn on a system stack.
// If systemstack is called from the per-OS-thread (g0) stack, or
// if systemstack is called from the signal handling (gsignal) stack,
// systemstack calls fn directly and returns.
// Otherwise, systemstack is being called from the limited stack <----------
// of an ordinary goroutine. In this case, systemstack switches
// to the per-OS-thread stack, calls fn, and switches back.
// It is common to use a func literal as the argument, in order
// to share inputs and outputs with the code around the call
// to system stack:
//
//	... set up y ...
//	systemstack(func() {
//		x = bigcall(y)
//	})
//	... use x ...
// |
I'm not a Go runtime developer, but one way to move this issue forward is to make it more concrete:
|
Yes, a self-contained case is important for post mortem of the wrong stack traces. |
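A self-contained case could look something like the sketch below. It is an assumption about what a useful reproducer might contain, not the program anyone in this thread actually used: the names `derefNil` and `goStackAtPanic` and the `REPRO_CRASH` environment variable are hypothetical, and the `gomobile bind` packaging step is left out. It first prints the stack as the Go runtime sees it at the panic (via `recover` and `debug.Stack`), then, if asked, crashes for real with `debug.SetTraceback("crash")` so the debugger or crash reporter output can be compared against that reference.

```go
package main

import (
	"fmt"
	"os"
	"runtime/debug"
)

// derefNil dereferences p; passing nil triggers a runtime panic.
//
//go:noinline
func derefNil(p *int) int { return *p }

// goStackAtPanic runs fn and, if fn panics, returns the goroutine stack
// captured while the panic is still in flight. It returns "" if fn
// does not panic.
func goStackAtPanic(fn func()) (stack string) {
	defer func() {
		if r := recover(); r != nil {
			stack = string(debug.Stack())
		}
	}()
	fn()
	return ""
}

func main() {
	// Reference: the stack the Go runtime itself reports at the panic.
	fmt.Fprint(os.Stderr, goStackAtPanic(func() { derefNil(nil) }))

	// REPRO_CRASH is a made-up switch for this sketch: when set, panic
	// on a goroutine without recovering so the process dies with a
	// signal that a debugger or crash reporter can catch.
	if os.Getenv("REPRO_CRASH") == "1" {
		debug.SetTraceback("crash")
		done := make(chan struct{})
		go func() {
			derefNil(nil)
			close(done)
		}()
		<-done
	}
}
```

Running it once without the variable and once with `REPRO_CRASH=1` under lldb gives two stacks for the same fault, which makes it easier to say exactly which frames the debugger is losing.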
It occurred to me that there is a small chance that the crash logger is somehow confused by the DWARF information the Go compiler generates. Have you tried to crash log a binary compiled with -ldflags="-w" on a GOARCH with framepointers (amd64)? |
hi, can you try the latest version of my patch (https://go-review.googlesource.com/c/go/+/61511) on darwin/arm64? |
So I've spent more time and indeed, the DWARF on
Indeed, this may be confusing the unwinder. However, as tested by @eliasnaur, removing it doesn't seem to improve the situation, with or without frame pointers, at least on Darwin. |
DWARF generation is completely disabled on darwin/arm and darwin/arm64. There's a *ld.FlagW = true // disable DWARF generation in cmd/link/internal/arm*/obj.go for objabi.Hdarwin. Removing that line for arm64 doesn't improve the Xcode backtrace in my informal crash test, however. |
The stack is incomplete / incorrect because it seems to be modified between the start of the panic ( On Android, using the lldb debugger:
From there the stack is correct. We can see the user code panicking in frame 1.
Now the stack doesn't show the user code anymore. With a simple program that starts a goroutine and panics on
Also the problem that shows every time on
We can see that the program even switched threads between the start of the panic and the raising of the signal. |
Thanks, @gwik. Other than the platform difference (linux vs android), the other difference that comes to mind is the buildmode. Have you reproduced the wrong stack trace on android/arm64 (or android/arm) with a regular static binary outside the JVM? Conversely, are stack traces correct on linux/amd64 with a Go c-shared library loaded into a C main program? |
@eliasnaur I'll do a check this afternoon (Paris time). Could you test by replacing the nil pointer with a good old |
@eliasnaur thanks for making this happen, this is great news. |
@aclements you did CL 93658 where a systemstack() is wrapping the dieFromSignal you see in the stack trace above. Is there a way to avoid that so the backtrace instead shows the correct call sequence leading to the panic()? |
One might consider the value of actually printing the full stack trace to stderr in that scenario. Perhaps a new From my user perspective, I'd be fine not having the stack trace printed at all in that scenario if it makes the LLDB backtraces good. Ideally, a |
Change https://golang.org/cl/110065 mentions this issue: |
CL 93658 moved stack trace printing inside a systemstack call to sidestep complexity in case the runtime is in an inconsistent state. Unfortunately, debuggers generating backtraces for a Go panic will be confused and come up with a technically correct but useless stack. This CL moves just the crash performing - typically a SIGABRT signal - outside the systemstack call to improve backtraces.

Unfortunately, the crash function now needs to be marked nosplit and that triggers the no split stackoverflow check. To work around that, split fatalpanic in two: fatalthrow for runtime.throw and fatalpanic for runtime.gopanic. Only Go panics really need crashes on the right stack and there is enough stack for gopanic.

Example program:

    package main

    import "runtime/debug"

    func main() {
        debug.SetTraceback("crash")
        crash()
    }

    func crash() {
        panic("panic!")
    }

Before:

    (lldb) bt
    * thread #1, name = 'simple', stop reason = signal SIGABRT
      * frame #0: 0x000000000044ffe4 simple`runtime.raise at <autogenerated>:1
        frame #1: 0x0000000000438cfb simple`runtime.dieFromSignal(sig=<unavailable>) at signal_unix.go:424
        frame #2: 0x0000000000438ec9 simple`runtime.crash at signal_unix.go:525
        frame #3: 0x00000000004268f5 simple`runtime.dopanic_m(gp=<unavailable>, pc=<unavailable>, sp=<unavailable>) at panic.go:758
        frame #4: 0x000000000044bead simple`runtime.fatalpanic.func1 at panic.go:657
        frame #5: 0x000000000044d066 simple`runtime.systemstack at <autogenerated>:1
        frame #6: 0x000000000042a980 simple at proc.go:1094
        frame #7: 0x0000000000438ec9 simple`runtime.crash at signal_unix.go:525
        frame #8: 0x00000000004268f5 simple`runtime.dopanic_m(gp=<unavailable>, pc=<unavailable>, sp=<unavailable>) at panic.go:758
        frame #9: 0x000000000044bead simple`runtime.fatalpanic.func1 at panic.go:657
        frame #10: 0x000000000044d066 simple`runtime.systemstack at <autogenerated>:1
        frame #11: 0x000000000042a980 simple at proc.go:1094
        frame #12: 0x00000000004268f5 simple`runtime.dopanic_m(gp=<unavailable>, pc=<unavailable>, sp=<unavailable>) at panic.go:758
        frame #13: 0x000000000044bead simple`runtime.fatalpanic.func1 at panic.go:657
        frame #14: 0x000000000044d066 simple`runtime.systemstack at <autogenerated>:1
        frame #15: 0x000000000042a980 simple at proc.go:1094
        frame #16: 0x000000000044bead simple`runtime.fatalpanic.func1 at panic.go:657
        frame #17: 0x000000000044d066 simple`runtime.systemstack at <autogenerated>:1

After:

    (lldb) bt
    * thread #7, stop reason = signal SIGABRT
      * frame #0: 0x0000000000450024 simple`runtime.raise at <autogenerated>:1
        frame #1: 0x0000000000438d1b simple`runtime.dieFromSignal(sig=<unavailable>) at signal_unix.go:424
        frame #2: 0x0000000000438ee9 simple`runtime.crash at signal_unix.go:525
        frame #3: 0x00000000004264e3 simple`runtime.fatalpanic(msgs=<unavailable>) at panic.go:664
        frame #4: 0x0000000000425f1b simple`runtime.gopanic(e=<unavailable>) at panic.go:537
        frame #5: 0x0000000000470c62 simple`main.crash at simple.go:11
        frame #6: 0x0000000000470c00 simple`main.main at simple.go:6
        frame #7: 0x0000000000427be7 simple`runtime.main at proc.go:198
        frame #8: 0x000000000044ef91 simple`runtime.goexit at <autogenerated>:1

Updates #22716

Change-Id: Ib5fa35c13662c1dac2f1eac8b59c4a5824b98d92
Reviewed-on: https://go-review.googlesource.com/110065
Run-TryBot: Elias Naur <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
Reviewed-by: Austin Clements <[email protected]>
Change https://golang.org/cl/170451 mentions this issue: |
…Store Passing test that shows Apple's symbols utility can now read DWARF data in go.o, after the fix in CL174538 Updates #31022 #22716 #31459 Change-Id: I56c3517ad6d0a9f39537182f63cef56bb198aa83 Reviewed-on: https://go-review.googlesource.com/c/go/+/170451 Reviewed-by: Than McIntosh <[email protected]>
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env)?
What did you do?
Bind an iOS framework using gomobile bind and create a panic.
What did you expect to see?
The real stack trace.
What did you see instead?
As commented in #20392 (comment):