Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

repeatedly calling jwk.(Cache).Register() with WithWaitReady(false) may cause a deadlock #1250

Closed
shasderias opened this issue Dec 12, 2024 · 5 comments
Assignees
Labels

Comments

@shasderias
Copy link

shasderias commented Dec 12, 2024

Describe the bug

Calling jwk.(Cache).Register() many times with the jwk.WithWaitReady(false) option may result in a deadlock.

Please attach the output of go version

go version go1.23.4 darwin/arm64

(also reproducible on 1.23.1)

To Reproduce / Expected behavior

Repro Repo

Run the following test.

package main

import (
	"context"
	"fmt"
	"testing"

	"github.com/lestrrat-go/httprc/v3"
	"github.com/lestrrat-go/jwx/v3/jwk"
)

func TestRepro(t *testing.T) {
	ctx := context.Background()

	c, err := jwk.NewCache(ctx, httprc.NewClient())
	if err != nil {
		panic(err)
	}

	for i := 0; i < 50; i++ {
		u := fmt.Sprintf("https://www.googleapis.com/oauth2/v3/certs/%d", i)
		fmt.Println("registering", u)
		if err := c.Register(ctx, u, jwk.WithWaitReady(false)); err != nil {
			panic(err)
		}
	}
}

Expected: the program to terminate.
Actual: the program deadlocks (see log below).

The issue is also reproducible with:

  • a simple go test -v run from the command line;
  • real URLs for endpoints that return valid JWKS files.

The issue does not appear to reproduce with go run, although it may be the case that the problem is still present, just less likely to occur.

The program does not always deadlock in the same iteration of the for loop.

GOROOT=<myHomeDir>go/pkg/mod/golang.org/[email protected] #gosetup
GOPATH=<myHomeDir>go #gosetup
<myHomeDir>go/pkg/mod/golang.org/[email protected]/bin/go test -c -o <myHomeDir>Library/Caches/JetBrains/IntelliJIdea2024.3/tmp/GoLand/___TestRepro_in_jwk_deadlock_repro.test jwk_deadlock_repro #gosetup
<myHomeDir>go/pkg/mod/golang.org/[email protected]/bin/go tool test2json -t <myHomeDir>Library/Caches/JetBrains/IntelliJIdea2024.3/tmp/GoLand/___TestRepro_in_jwk_deadlock_repro.test -test.v=test2json -test.paniconexit0 -test.run ^\QTestRepro\E$ #gosetup
=== RUN   TestRepro
registering https://www.googleapis.com/oauth2/v3/certs/0
registering https://www.googleapis.com/oauth2/v3/certs/1
registering https://www.googleapis.com/oauth2/v3/certs/2
registering https://www.googleapis.com/oauth2/v3/certs/3
registering https://www.googleapis.com/oauth2/v3/certs/4
registering https://www.googleapis.com/oauth2/v3/certs/5
registering https://www.googleapis.com/oauth2/v3/certs/6
registering https://www.googleapis.com/oauth2/v3/certs/7
registering https://www.googleapis.com/oauth2/v3/certs/8
registering https://www.googleapis.com/oauth2/v3/certs/9
registering https://www.googleapis.com/oauth2/v3/certs/10
registering https://www.googleapis.com/oauth2/v3/certs/11
registering https://www.googleapis.com/oauth2/v3/certs/12
registering https://www.googleapis.com/oauth2/v3/certs/13
registering https://www.googleapis.com/oauth2/v3/certs/14
registering https://www.googleapis.com/oauth2/v3/certs/15
registering https://www.googleapis.com/oauth2/v3/certs/16
registering https://www.googleapis.com/oauth2/v3/certs/17
registering https://www.googleapis.com/oauth2/v3/certs/18
registering https://www.googleapis.com/oauth2/v3/certs/19
registering https://www.googleapis.com/oauth2/v3/certs/20
registering https://www.googleapis.com/oauth2/v3/certs/21
registering https://www.googleapis.com/oauth2/v3/certs/22
fatal error: all goroutines are asleep - deadlock!

goroutine 1 [chan receive]:
testing.(*T).Run(0x14000110680, {0x1050075b6?, 0x14000153b48?}, 0x105148c20)
	<myHomeDir>go/pkg/mod/golang.org/[email protected]/src/testing/testing.go:1751 +0x328
testing.runTests.func1(0x14000110680)
	<myHomeDir>go/pkg/mod/golang.org/[email protected]/src/testing/testing.go:2168 +0x40
testing.tRunner(0x14000110680, 0x14000153c68)
	<myHomeDir>go/pkg/mod/golang.org/[email protected]/src/testing/testing.go:1690 +0xe4
testing.runTests(0x1400000c2a0, {0x105363e50, 0x1, 0x1}, {0xc800000000000000?, 0xc8e7685b4c65a659?, 0x0?})
	<myHomeDir>go/pkg/mod/golang.org/[email protected]/src/testing/testing.go:2166 +0x3ac
testing.(*M).Run(0x1400007ebe0)
	<myHomeDir>go/pkg/mod/golang.org/[email protected]/src/testing/testing.go:2034 +0x588
main.main()
	_testmain.go:45 +0x90

goroutine 7 [select]:
github.com/lestrrat-go/httprc/v3.sendBackend[...]({0x10514ee30, 0x1053a2da0}, 0x14000120460, {0x14000121420, {0x1051510c0, 0x1400012f9e0}, {0x0, 0x0}}, 0x14000121420)
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/[email protected]/controller.go:102 +0x100
github.com/lestrrat-go/httprc/v3.(*controller).Add(0x14000120850, {0x10514ee30, 0x1053a2da0}, {0x1051510c0, 0x1400012f9e0}, {0x14000028e40?, 0x1400006e280?, 0x0?})
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/[email protected]/controller.go:154 +0x198
github.com/lestrrat-go/jwx/v3/jwk.(*Cache).Register(0x140000285c0, {0x10514ee30, 0x1053a2da0}, {0x1400001cb70, 0x2d}, {0x14000104f38?, 0xb9fb12f6?, 0x10537c540?})
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/jwx/[email protected]/jwk/cache.go:152 +0x4d0
jwk_deadlock_repro.TestRepro(0x14000110820?)
	<myHomeDir>dev/jwk_deadlock_repro/repro_test.go:23 +0x1bc
testing.tRunner(0x14000110820, 0x105148c20)
	<myHomeDir>go/pkg/mod/golang.org/[email protected]/src/testing/testing.go:1690 +0xe4
created by testing.(*T).Run in goroutine 1
	<myHomeDir>go/pkg/mod/golang.org/[email protected]/src/testing/testing.go:1743 +0x314

goroutine 8 [select]:
github.com/lestrrat-go/httprc/v3.worker.Run({{0x10514bdf8, 0x10537b3c0}, 0x14000120460, 0x140001204d0, 0x14000120540, {0x10514c180, 0x1053a2da0}, {0x10514bdd8, 0x1053a2da0}}, {0x10514eed8?, ...}, ...)
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/[email protected]/worker.go:32 +0x2ac
created by github.com/lestrrat-go/httprc/v3.(*Client).Start in goroutine 7
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/[email protected]/client.go:155 +0x644

goroutine 9 [select]:
github.com/lestrrat-go/httprc/v3.worker.Run({{0x10514bdf8, 0x10537b3c0}, 0x14000120460, 0x140001204d0, 0x14000120540, {0x10514c180, 0x1053a2da0}, {0x10514bdd8, 0x1053a2da0}}, {0x10514eed8?, ...}, ...)
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/[email protected]/worker.go:32 +0x2ac
created by github.com/lestrrat-go/httprc/v3.(*Client).Start in goroutine 7
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/[email protected]/client.go:155 +0x644

goroutine 10 [select]:
github.com/lestrrat-go/httprc/v3.worker.Run({{0x10514bdf8, 0x10537b3c0}, 0x14000120460, 0x140001204d0, 0x14000120540, {0x10514c180, 0x1053a2da0}, {0x10514bdd8, 0x1053a2da0}}, {0x10514eed8?, ...}, ...)
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/[email protected]/worker.go:32 +0x2ac
created by github.com/lestrrat-go/httprc/v3.(*Client).Start in goroutine 7
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/[email protected]/client.go:155 +0x644

goroutine 11 [select]:
github.com/lestrrat-go/httprc/v3.worker.Run({{0x10514bdf8, 0x10537b3c0}, 0x14000120460, 0x140001204d0, 0x14000120540, {0x10514c180, 0x1053a2da0}, {0x10514bdd8, 0x1053a2da0}}, {0x10514eed8?, ...}, ...)
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/[email protected]/worker.go:32 +0x2ac
created by github.com/lestrrat-go/httprc/v3.(*Client).Start in goroutine 7
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/[email protected]/client.go:155 +0x644

goroutine 12 [select]:
github.com/lestrrat-go/httprc/v3.worker.Run({{0x10514bdf8, 0x10537b3c0}, 0x14000120460, 0x140001204d0, 0x14000120540, {0x10514c180, 0x1053a2da0}, {0x10514bdd8, 0x1053a2da0}}, {0x10514eed8?, ...}, ...)
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/[email protected]/worker.go:32 +0x2ac
created by github.com/lestrrat-go/httprc/v3.(*Client).Start in goroutine 7
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/[email protected]/client.go:155 +0x644

goroutine 13 [select]:
github.com/lestrrat-go/httprc/v3.sendWorker({0x10514eed8, 0x140000101e0}, 0x140001204d0, {0x1051510c0, 0x1400012e900})
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/[email protected]/backend.go:113 +0x94
github.com/lestrrat-go/httprc/v3.(*controller).loop(0x14000120850, {0x10514eed8, 0x140000101e0}, 0x1400000e4e0)
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/[email protected]/backend.go:162 +0x350
created by github.com/lestrrat-go/httprc/v3.(*Client).Start in goroutine 7
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/[email protected]/client.go:175 +0x8d8

goroutine 14 [semacquire]:
sync.runtime_Semacquire(0x0?)
	<myHomeDir>go/pkg/mod/golang.org/[email protected]/src/runtime/sema.go:71 +0x2c
sync.(*WaitGroup).Wait(0x1400000e4e0)
	<myHomeDir>go/pkg/mod/golang.org/[email protected]/src/sync/waitgroup.go:118 +0x74
github.com/lestrrat-go/httprc/v3.(*Client).Start.func4(...)
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/[email protected]/client.go:178
created by github.com/lestrrat-go/httprc/v3.(*Client).Start in goroutine 7
	<myHomeDir>go/pkg/mod/github.com/lestrrat-go/httprc/[email protected]/client.go:177 +0x940


Process finished with the exit code 1
@lestrrat
Copy link
Collaborator

Thanks, I'll look into it tomorrow.

@lestrrat
Copy link
Collaborator

@shasderias Hmmm, I'm not sure if this is blocking or if it's just ... slow. In my testing I did observe things timing out, but after adding some print logs, the call to AddResource seem to all be returning eventually, in order.

And if I give it enough workers, let's just say 50 (via httprc.WithWorkers(50)) it seems to pass immediately.

I acknowledge there's something to be desired, but I have a feeling it's just bad performance. (which I'll work on, but just want to make sure if we're seeing the same symptoms)

@shasderias
Copy link
Author

I'm a little short on time now, but my gut feeling is:

  • this is a true deadlock as in, I think there's a genuine race condition that has a chance of causing a program to halt;
  • this is probably not (just) a performance problem;
  • I'm not surprised that adding print statements make the problem goes away, as terminal I/O takes a comparatively "long time", which can mask the deadlock;
  • the preceding point is also true for adding workers; one goroutine for each resource monitored = no chance for the goroutines to "fight" over jobs.

I've also managed to repro on go run and on a Windows machine on 1.23.1.

I'll have a bit more time after the 19th, will try to validate the above/investigate further then if it's still necessary.

Copy link
Contributor

This issue is stale because it has been open 14 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale label Dec 30, 2024
Copy link
Contributor

github-actions bot commented Jan 6, 2025

This issue was closed because it has been stalled for 7 days with no activity. This does not mean your issue is rejected, but rather it is done to hide it from the view of the maintains for the time being. Feel free to reopen if you have new comments

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jan 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants