-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Upgrade from 2.4.0 to 2.4.2, trusted clusters failed to reconnect #1723
Comments
@russjones see this one and the #1733 |
@aashley These issues are occurring for you while using |
Yes, and tried to login with the web interface in a new browser |
was on the phone for last comment, most of my testing is with tsh on the console, included clearing the .tsh folder locally. To validate I tried the web interface as well as thats what most of my guys use and if it worked there I'd put up with the console issues for the other fixes, but it failed there as well. |
@aashley I've tried reproducing this issue and not had any luck. I also went through the diff between v2.4.0 and v2.4.2 and nothing stood out relating to the error you are seeing because that part of the code was largely not changed. Two questions:
|
no proxy for any of the remote clusters. Approximately 50 remote clusters. |
@aashley What backend are you using? |
haven't configured anything so the default, bolt. |
We have run some tests and tried to reproduce your issue. The fundamental problem is caching backend that is not goroutine friendly, and in case of many trusted clusters it overwrites the values corrupting local state. I think I have fixed this problem in 2.5.0. Is it easy for you to try out 2.5.0 configuration (2.5.0 stable is coming out on Monday)? Otherwise if it's hard to try out, I will think of some alternative solution, sorry for your troubles with this setup in OSS. |
We tried the upgrade to 2.4.3 and that appears to of resolved the issue. All our nodes are running 2.4.3 now and connecting successfully. |
k, closing this |
What happened: Upgraded from 2.4.0 to 2.4.2 previously connecting trusted clusters could no longer connect. Got the following errors in the master logs:
Feb 26 07:26:18 teleport teleport[31744]: WARN [PROXY:SER] this claims to be signed as authDomain %!v(MISSING), but no matching signing keys found remote:180.181.245.242:35200 user:7f79bd80-6eb8-4290-9b8d-069ce2f700aa.od-server-00141 reversetunnel/srv.go:514
Feb 26 07:26:18 teleport teleport[31744]: ERRO [PROXY:SER] failed to retrieve trusted keys, err: parsing time "2018-02-27T03:26:17.15149118ZZ": extra text: Z reversetunnel/srv.go:416
Feb 26 07:26:18 teleport teleport[31744]: WARN [PROXY:SER] failed authenticate host, err: ssh: certificate signed by unrecognized authority remote:210.10.211.14:58960 user:87b691e3-fb39-411f-93b9-64617275cec3.od-server-00182 reversetunnel/srv.go:502
Upgraded the remote cluster to 2.4.2 with same result. Only 2 of the remote clusters reconnected successfully. All others failed.
Reverted master to 2.4.0, remote clusters a mix of 2.4.0 and 2.4.2 all connecting correctly.
What you expected to happen: 2.4.0 remote nodes connect successfully.
How to reproduce it (as minimally and precisely as possible):
Not sure yet, upgraded my production cluster from 2.4.0 to 2.4.2. Cluster was originally installed with 2.0, looks like most of the nodes getting the above errors where originally installed pre 2.3.0. The others where timing out but nothing more detailed.
Environment:
teleport version
):Teleport v2.4.2 git:v2.4.2-0-g079d345 and Teleport v2.4.0 git:v2.4.0-0-ge9d6645tsh version
):Teleport v2.4.2 git:v2.4.2-0-g079d345 and Teleport v2.4.0 git:v2.4.0-0-ge9d6645Browser environment
Relevant Debug Logs If Applicable
The text was updated successfully, but these errors were encountered: