Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Offline workflows can get into a bad state: failed to delete refresh token #1669

Closed
klarose opened this issue Mar 13, 2020 · 1 comment · Fixed by #1670
Closed

Offline workflows can get into a bad state: failed to delete refresh token #1669

klarose opened this issue Mar 13, 2020 · 1 comment · Fixed by #1670

Comments

@klarose
Copy link
Contributor

klarose commented Mar 13, 2020

We've been running into an issue occasionally with offline workflows. We aren't 100% sure what is causing it, but I can conceive of at least one case where it is possible.

The symptom is that when a user tries to log in, the attempt fails with the following error:

{"level":"error","msg":"failed to get refresh token: not found","time":"2020-03-12T20:52:12Z"}

This persists until the user's OfflineSession is deleted from the backend storage. This is a fairly serious bug because it requires manual intervention the correct.

The issue seems to occur when the OfflineSession falls out of sync with the RefreshToken table. In particular, if the refresh token corresponding to the ID stored for the client in the offline storage is not present, then attempts to log in will fail in this block of code:

			if oldTokenRef, ok := session.Refresh[tokenRef.ClientID]; ok {
				// Delete old refresh token from storage.
				if err := s.storage.DeleteRefresh(oldTokenRef.ID); err != nil {
					s.logger.Errorf("failed to delete refresh token: %v", err)
					s.tokenErrHelper(w, errServerError, "", http.StatusInternalServerError)
					deleteToken = true
					return
				}
			}

One way this could happen is if dex crashes or is killed between when the above code executes and the code which updates the OfflineSession executes. In that case, the OfflineSession will still point to the refresh token we just deleted, so future calls to the code block will fail, leading to the auth request to fail.

@klarose
Copy link
Contributor Author

klarose commented Mar 13, 2020

A simple fix to this is to just not fail if we fail to delete the refresh token due to it not being there. If it's not there, what is the problem? I'll submit a PR for this momentarily.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant