[BACKPORT] Hash token values for storage (#41792) #42220
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This commit changes how access tokens and refresh tokens are stored
in the tokens index.
Access token values are now hashed before being stored in the id
field of the
user_token
and before becoming part of the tokendocument id. Refresh token values are hashed before being stored
in the token field of the
refresh_token
. The tokens are hashedwithout a salt value since these are v4 UUID values that have
enough entropy themselves. Both rainbow table attacks and offline
brute force attacks are impractical.
As a side effect of this change and in order to support multiple
concurrent refreshes as introduced in #39631, upon refreshing an
<access token, refresh token> pair, the superseding access token
and refresh tokens values are stored in the superseded token doc,
encrypted with a key that is derived from the superseded refresh
token. As such, subsequent requests to refresh the same token in
the predefined time window will return the same superseding access
token and refresh token values, without hitting the tokens index
(as this only stores hashes of the token values). AES in GCM
mode is used for encrypting the token values and the key
derivation from the superseded refresh token uses a small number
of iterations as it needs to be quick.
For backwards compatibility reasons, the new behavior is only
enabled when all nodes in a cluster are in the required version
so that old nodes can cope with the token values in a mixed
cluster during a rolling upgrade.
This is a backport of #41792