-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CRDB-28040 : JWKS fetch from jwks_uri #117054
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we probably want a cluster setting to turn this on, at least if there's no caching - for deployments that expect to keep their JWKS cluster settings up to date, we don't want to DoS their identity server if someone spams invalid tokens. If we had caching it would probably be fine to do without a cluster setting. (and if we had caching do we even want the hard-coded JWKS config or should it just always be loaded automatically?)
Reviewed 3 of 5 files at r1, 5 of 5 files at r2, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @BabuSrithar and @dhartunian)
pkg/ccl/jwtauthccl/authentication_jwt.go
line 148 at r2 (raw file):
unverifiedToken, err := jwt.Parse(tokenBytes) if err != nil { log.Errorf(ctx, "token parse failed %v", err.Error())
In general, we don't like to both log and return errors. Elsewhere in this file we just do return errors.Newf(...)
, so I would remove all of the log.Errorf
calls.
pkg/ccl/jwtauthccl/authentication_jwt.go
line 172 at r2 (raw file):
keyExists := false if authenticator.mu.conf.jwks.Len() > 0 { jwsMessage, err := jws.Parse(tokenBytes)
This looks redundant - isn't jwsMessage
the same as unverifiedToken
?
pkg/ccl/jwtauthccl/authentication_jwt.go
line 173 at r2 (raw file):
if authenticator.mu.conf.jwks.Len() > 0 { jwsMessage, err := jws.Parse(tokenBytes) if err != nil || jwsMessage.Signatures() == nil || len(jwsMessage.Signatures()) == 0 {
In Go, len(nil)
is 0, so you don't need a separate nil
check.
pkg/ccl/jwtauthccl/authentication_jwt.go
line 190 at r2 (raw file):
// Now that both the issuer and key-id are matched, parse the token again to validate the signature. parsedToken, err := jwt.Parse(tokenBytes, jwt.WithKeySet(jwkSet), jwt.WithValidate(true), jwt.InferAlgorithmFromKey(true))
Did you look at the WithVerifyAuto
option? It looks like it might give us a way to plug our HTTP fetcher into the library itself so we don't have to do our own key matching. (and we only have to parse the input once). There's also a jwk.Cache
object that may be able to handle caching for us although I haven't looked at it closely enough to see whether it can support our combination of a pre-configured key set and the remote URL.
https://pkg.go.dev/github.com/lestrrat-go/jwx/[email protected]/jwt#WithVerifyAuto
pkg/ccl/testccl/authccl/testdata/jwt
line 24 at r2 (raw file):
# see authentication_jwt_test.go for examples of how to generate these tokens. jwt_cluster_setting issuers=issuer
Don't we still want a test that verifies that with jwt_cluster_setting enabled=true
but no issuers, we get an error?
pkg/ccl/testccl/authccl/testdata/jwt
line 48 at r2 (raw file):
# see authentication_jwt_test.go for examples of how to generate these tokens. jwt_cluster_setting issuers=["issuer","issuer2"]
As above, I think we still want a failure test here, in which the cluster setting specifies issuer but the token is from issuer2.
Previously, bdarnell (Ben Darnell) wrote…
Ah, ok. My intent was to log detailed errors for debugging provide only minimal error message to client, so that we avoid leaking security configs. I can move to "return errors.WithDetailf(...)" which seems to fit this requirement. Would that be ok ? |
I would prefer caching over another cluster setting here. We already have many settings just for JWT auth. Even after having caching, we might have to still keep JWKS config to ensure backward compatibility - right ? |
Previously, bdarnell (Ben Darnell) wrote…
They are different. |
dc0c5e5
to
f68918c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell and @dhartunian)
pkg/ccl/jwtauthccl/authentication_jwt.go
line 148 at r2 (raw file):
Previously, BabuSrithar (BabuSrithar) wrote…
Ah, ok. My intent was to log detailed errors for debugging provide only minimal error message to client, so that we avoid leaking security configs. I can move to "return errors.WithDetailf(...)" which seems to fit this requirement. Would that be ok ?
Made the change to errors.WithDetailf(...)
pkg/ccl/jwtauthccl/authentication_jwt.go
line 173 at r2 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
In Go,
len(nil)
is 0, so you don't need a separatenil
check.
Done.
Previously, bdarnell (Ben Darnell) wrote…
I can add that. Related question : I found this whole file little redundant, given the test cases we have in Do you know if we need to maintain both these ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @BabuSrithar and @dhartunian)
pkg/ccl/jwtauthccl/authentication_jwt.go
line 148 at r2 (raw file):
Previously, BabuSrithar (BabuSrithar) wrote…
Made the change to
errors.WithDetailf(...)
Hmm, I think error details are hidden from our own error telemetry but they're still exposed to the client. If that's the concern then we may still want to swallow these errors, but I don't think logging at every point that generates an error is the right way to do that. It would be better to just use errors.WithDetailf
here and have something at a higher layer catch all errors from the authenticator to decide how to log or return them. (This would also cover our other kinds of authenticators, and may already be done).
pkg/ccl/jwtauthccl/authentication_jwt.go
line 172 at r2 (raw file):
Previously, BabuSrithar (BabuSrithar) wrote…
They are different.
unverifiedToken
is Token type andjwsMessage
is Message type.
Token type does not contain signatures.
Ah, I misread that one is jwt.Parse
and the other is jws.Parse
pkg/ccl/testccl/authccl/testdata/jwt
line 24 at r2 (raw file):
Previously, BabuSrithar (BabuSrithar) wrote…
I can add that.
Related question : I found this whole file little redundant, given the test cases we have in
pkg/ccl/jwtauthccl/authentication_jwt_test.go
Do you know if we need to maintain both these ?
I'm not sure. I'm reluctant to delete any tests that are already written since it's unclear whether we'd still have full coverage if we deleted anything, but I don't think we need to add new tests in both places. Normally I'd prefer the data-driven test but they're hard to read and write for anything dealing with security tokens so it's probably better to make authentication_jwt_test.go our main test here.
One reason to keep at least a few data-driven tests is to ensure that we remain compatible with the old tokens hard-coded in this file. Otherwise we could change both the generator and the consumer so that the Go tests would pass but we wouldn't actually be compatible.
f68918c
to
2d4a091
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell and @dhartunian)
pkg/ccl/jwtauthccl/authentication_jwt.go
line 148 at r2 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
Hmm, I think error details are hidden from our own error telemetry but they're still exposed to the client. If that's the concern then we may still want to swallow these errors, but I don't think logging at every point that generates an error is the right way to do that. It would be better to just use
errors.WithDetailf
here and have something at a higher layer catch all errors from the authenticator to decide how to log or return them. (This would also cover our other kinds of authenticators, and may already be done).
Have removed newly added errors.WithDetailf
and logs. Just returning what make sense to return to the client.
pkg/ccl/jwtauthccl/authentication_jwt.go
line 172 at r2 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
Ah, I misread that one is
jwt.Parse
and the other isjws.Parse
Done.
pkg/ccl/jwtauthccl/authentication_jwt.go
line 190 at r2 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
Did you look at the
WithVerifyAuto
option? It looks like it might give us a way to plug our HTTP fetcher into the library itself so we don't have to do our own key matching. (and we only have to parse the input once). There's also ajwk.Cache
object that may be able to handle caching for us although I haven't looked at it closely enough to see whether it can support our combination of a pre-configured key set and the remote URL.https://pkg.go.dev/github.com/lestrrat-go/jwx/[email protected]/jwt#WithVerifyAuto
We are using lestrrat-go/jwx/v1
The AutoVerify of this version seem to only supports auto fetch of jwks from jku field of JWT. The JWTs we issue do not have jku set. I am seeing if we can use other With* options to avoid double parsing / custom key matching.
https://pkg.go.dev/github.com/lestrrat-go/jwx/jws#VerifyAuto
Currently it only supports verification via \
jku` which will be fetched using the object specified in jws.JWKSetFetcher. Note that URLs in `jku` can only have https scheme.`
pkg/ccl/testccl/authccl/testdata/jwt
line 24 at r2 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
I'm not sure. I'm reluctant to delete any tests that are already written since it's unclear whether we'd still have full coverage if we deleted anything, but I don't think we need to add new tests in both places. Normally I'd prefer the data-driven test but they're hard to read and write for anything dealing with security tokens so it's probably better to make authentication_jwt_test.go our main test here.
One reason to keep at least a few data-driven tests is to ensure that we remain compatible with the old tokens hard-coded in this file. Otherwise we could change both the generator and the consumer so that the Go tests would pass but we wouldn't actually be compatible.
Have added this case to authentication_jwt_test.go > TestIssuerCheck
pkg/ccl/testccl/authccl/testdata/jwt
line 48 at r2 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
As above, I think we still want a failure test here, in which the cluster setting specifies issuer but the token is from issuer2.
This case is already there in authentication_jwt_test.go > TestIssuerCheck
2d4a091
to
9ceb3a2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 5 files at r4, 4 of 4 files at r5, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @BabuSrithar and @dhartunian)
pkg/ccl/jwtauthccl/authentication_jwt.go
line 190 at r2 (raw file):
Previously, BabuSrithar (BabuSrithar) wrote…
We are using lestrrat-go/jwx/v1
The AutoVerify of this version seem to only supports auto fetch of jwks from jku field of JWT. The JWTs we issue do not have jku set. I am seeing if we can use other With* options to avoid double parsing / custom key matching.
https://pkg.go.dev/github.com/lestrrat-go/jwx/jws#VerifyAuto
Currently it only supports verification via \
jku` which will be fetched using the object specified in jws.JWKSetFetcher. Note that URLs in `jku` can only have https scheme.`
OK, we can leave that until we upgrade to v2. Maybe we should start to add the jku field, or maybe there are other options we can combine.
pkg/ccl/jwtauthccl/authentication_jwt.go
line 178 at r5 (raw file):
_, keyExists = authenticator.mu.conf.jwks.LookupKeyID(tokenKeyId) } // if key not found in jwks, fetch remote jwks from issuerUrl and try again.
I'm not sure it makes sense to have both a statically-configured key set and a dynamic fallback. I think users will either want to set the static key in the cluster setting and only use that key, or they want it dynamically refreshed and then it doesn't make sense to have a maybe-stale key in the setting. It's probably better to have this as two different modes instead of a fallback.
Previously, bdarnell (Ben Darnell) wrote…
The original request had the ask to fallback. I think it makes sense in the current scenario where the jwks is periodically set by customers and the fallback will cover the windows where the setting is stale. When the setting is still valid, it avoid a network call in auth flow, it might help with latency. This will also be no-op for customers when they upgrade and does not need any more settings. But I dont have strong opinion. If you feel strongly about it, I can remove the fallback and make it either or, in which case we need to introduce an explicit setting. |
Previously, bdarnell (Ben Darnell) wrote…
Done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @BabuSrithar and @dhartunian)
pkg/ccl/jwtauthccl/authentication_jwt.go
line 178 at r5 (raw file):
Previously, BabuSrithar (BabuSrithar) wrote…
The original request had the ask to fallback. I think it makes sense in the current scenario where the jwks is periodically set by customers and the fallback will cover the windows where the setting is stale. When the setting is still valid, it avoid a network call in auth flow, it might help with latency. This will also be no-op for customers when they upgrade and does not need any more settings. But I dont have strong opinion. If you feel strongly about it, I can remove the fallback and make it either or, in which case we need to introduce an explicit setting.
The original request asked for a fallback but I think they'd change their mind if they thought about it. In the fallback mode, you don't have a hard break if your key expires, but you do have a sudden decrease in performance, which for many users is just as bad (leading to metastable failure). I think it's better to make the remote fetch the normal case (which will make any performance impact visible).
Additionally it's simply easier to use since you just have to enable/disable instead of pasting in a big blob (and periodically update it). This is also more compatible with the options available in the v2 library.
We can keep the static JWKS setting for backwards compatibility, but users will need to opt in to the remote fetch to have any automatic refresh.
Previously, bdarnell (Ben Darnell) wrote…
Ok. I have added a new systems setting called "jwks_auto_fetch_enabled" for JWT Auth. |
Previously, BabuSrithar (BabuSrithar) wrote…
*systems setting = cluster setting |
92cf1c7
to
75c4c76
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @BabuSrithar and @dhartunian)
pkg/ccl/jwtauthccl/authentication_jwt.go
line 178 at r5 (raw file):
Previously, BabuSrithar (BabuSrithar) wrote…
*systems setting = cluster setting
In v2, you need to use jwk.Whitelist
in addition to the VerifyAuto
setting. Instead of a boolean for auto-fetch, let's make a setting for jwks_uri_allow_list
and you enable auto-fetch by setting a non-empty allow list.
pkg/ccl/jwtauthccl/settings.go
line 82 at r7 (raw file):
JWKSAutoFetchEnabledSettingName, "enables or disables automatic fetching of JWKs from the issuer's well-known endpoint. "+ "If this is enabled, the JWKSSetting will be ignored.",
JWKSSetting is an internal name but this is a user-facing help string. It should use the public setting name server.jwt_authentication.jwks
.
75c4c76
to
58a06ff
Compare
Previously, bdarnell (Ben Darnell) wrote…
Thanks for catching this. Fixed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 5 files at r6, 1 of 2 files at r7, 1 of 1 files at r8, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @BabuSrithar, @bdarnell, and @dhartunian)
This commit adds capability to fetch remote JWKS from issuer's jwks_uri endpoint. This will satisfy the requirement to have an ability to automatically fetch the new JWK when the existing JWK is rotated - without human intervention or custom scripts. Changes include 1. The existing order of token signature verification first and rest of claims next is modified to get issuer first and then the token signature verification. This change is requied to determine the issuer for which the jwks has to be fetched remotely. 2. Introduction of a new cluster setting called `server.jwt_authentication.jwks_auto_fetch.enabled` 3. Depending on the value of `server.jwt_authentication.jwks_auto_fetch.enabled` use JWKS configured through cluster setting or remotely fetch JWKS from jwks_uri of the issuer 4. Modification to exiting test cases to match the new order of verification steps. The change is backward compatible and not changes required in existing deployments and JWT Auth usage.
58a06ff
to
900408e
Compare
Previously, bdarnell (Ben Darnell) wrote…
I missed this comment somehow. Just noticed it. We already have issuers whitelist setting ( |
Previously, BabuSrithar (BabuSrithar) wrote…
https://cockroachlabs.atlassian.net/browse/CRDB-35990 to track v2 upgrade. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @BabuSrithar and @dhartunian)
pkg/ccl/jwtauthccl/authentication_jwt.go
line 178 at r5 (raw file):
Previously, BabuSrithar (BabuSrithar) wrote…
I missed this comment somehow. Just noticed it.
We already have issuers whitelist setting (
server.jwt_authentication.issuers
). We construct jwks_uri from issuer's openid-configuration only after verifying the issuer first. I thinkjwks_uri_allow_list
will be redundant toissuers
whitelist?
They're close but not quite the same. While the issuer's openid-discovery document is on its own domain, jwks_uri may point to a different domain and therefore we can't necessarily derive a JWKS URI whitelist from our issuer whitelist. But maybe that's OK and we can treat it as redundant: we can automatically allow any URI that is linked by our trusted issuer. (so when we use VerifyAuto in v2 we can just pass a whitelist that accepts anything). The boolean is OK with me, then.
Previously, bdarnell (Ben Darnell) wrote…
Yes, in V2 we can pass accept all whitelist and the JWKS URI if derived from a trusted source already. |
bors r+ |
Build succeeded: |
blathers backport 23.2 |
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from 900408e to blathers/backport-release-23.2-117054: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 23.2 failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
@BabuSrithar 👋 The commit message in this PR lacks a Can you please confirm:
|
@mikeCRL , sorry that I did not include This ticket does need doc update. We have a doc jira ticket created (https://cockroachlabs.atlassian.net/browse/DOC-9694) Regarding the doc text : Here is my note for Doc / PM - https://cockroachlabs.atlassian.net/browse/CRDB-28040?focusedCommentId=209665 |
@BabuSrithar Thank you. Is this OK for a release note? Any edits? Introduced a new cluster setting, |
@mikeCRL , that note looks good to me. |
Fixes CRDB-28040 ( JWKS fetch from jwks_uri )
This commit adds capability to fetch remote JWKS from issuer's jwks_uri endpoint. This will satisfy the requirement to have an ability to automatically fetch the new JWK when the existing JWK is rotated - without human intervention or custom scripts.
Changes include
The existing order of token signature verification first and rest of claims next is modified to get issuer first and then the token signature verification. This change is requied to determine the issuer for which the jwks has to be fetched remotely.
Introduction of a new cluster setting called
server.jwt_authentication.jwks_auto_fetch.enabled
Depending on the value of
server.jwt_authentication.jwks_auto_fetch.enabled
use JWKS configured through cluster setting or remotely fetch JWKS from jwks_uri of the issuerModification to exiting test cases to match the new order of verification steps.
The change is backward compatible and not changes required in existing deployments and JWT Auth usage.
Release note text: Clusters using Cluster Single Sign-on (SSO) with JSON web tokens (JWTs) can now optionally fetch signing keys from configured issuers instead of configuring static signing keys for each issuer. When the new cluster setting
server.jwt_authentication.jwks_auto_fetch.enabled
is set totrue
, signing keys are automatically fetched from the issuer using metadata published in its OpenID configuration. In this case, static signing keys inserver.jwt_authentication.jwks
are ignored. When automatic fetching is enabled, there may be a slight increase in network latency for each JWT authentication request, proportional to the latency between the cluster and the issuer's endpoint.