oauth2: Support for Pairwise Subject Identifier Type #950

Closed
michalwojciechowski opened this issue Jul 25, 2018 · 20 comments

@michalwojciechowski

Do you want to request a feature or report a bug?
A new feature.

What is the expected behavior?
It would be great to see Hydra support the Pairwise Subject Identifier Type for clients that define subject_types_supported = pairwise only. Ideally we would like to see all the features listed at http://openid.net/specs/openid-connect-core-1_0.html#SubjectIDTypes, with the following caveat:

  • The algorithm described at http://openid.net/specs/openid-connect-core-1_0.html#PairwiseAlg is out of Hydra's scope and remains our responsibility. Hydra would then only be responsible for receiving the local account ID (subject) together with a client-specific subject, and for keeping the relation between the two. Can you please confirm that this would be the expected behavior?

As a POC we tried to manually modify the subject after the authentication flow is done. Unfortunately Hydra does not support this and fails with the following error message: "Subject from payload does not match subject from previous authentication".

OpenID Certification does not cover this particular feature besides checking the presence of subject_types_supported (re #689).

Thank you in advance @arekkas!

Which version of the software is affected?
v1.0.0-beta.7 along with all the previous ones

@aeneasr
Member

aeneasr commented Jul 25, 2018

That's a sensible request. I think there are two ways to support this (we can combine both):

  1. We use sector_identifier_uri (for clients with sector IDs) and redirect_uri (clients without sector IDs) as well as the subject value and compute an SHA-256 hash to be used as subject using the system secret as salt. This would be something encouraged and defined in the specification.
  2. We provide a way to specify another parameter during consent acceptance called subject_pairwise which allows the consent endpoint to specify the subject pairwise ID.
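A minimal sketch of option 1, assuming the hash input is the host of the sector identifier URI (or of the redirect URI when no sector ID is registered), followed by the local subject and the system secret as salt, as OIDC Core 1.0 section 8.1 suggests. The function names, the concatenation order, and the hex encoding are illustrative assumptions, not Hydra's actual implementation.

```python
import hashlib
from urllib.parse import urlparse


def sector_identifier(sector_identifier_uri, redirect_uri):
    """Return the host that acts as the sector identifier.

    Falls back to the redirect URI when the client registered no
    sector_identifier_uri (assumption based on OIDC Core 1.0, section 8.1).
    """
    uri = sector_identifier_uri or redirect_uri
    host = urlparse(uri).hostname
    if not host:
        raise ValueError(f"cannot derive a sector identifier from {uri!r}")
    return host


def pairwise_subject(subject, sector_id, salt):
    """Hash sector identifier, local subject, and salt into an opaque sub value."""
    digest = hashlib.sha256()
    digest.update(sector_id.encode("utf-8"))
    digest.update(subject.encode("utf-8"))
    digest.update(salt)  # salt is bytes, e.g. derived from the system secret
    return digest.hexdigest()


# Hypothetical values: the same user receives a different sub per sector.
salt = b"system-secret"
sub_a = pairwise_subject(
    "user-123", sector_identifier(None, "https://app-a.example/callback"), salt
)
sub_b = pairwise_subject(
    "user-123",
    sector_identifier("https://app-b.example/sector.json", "https://app-b.example/cb"),
    salt,
)
```

Two clients in different sectors cannot correlate `sub_a` and `sub_b`, while repeated logins at the same client keep yielding the same value.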

As we're implementing the OIDC Dynamic Client Registration, clients need to specify subject_type=pairwise during registration in order to use this feature. I am currently unsure if the subject_pairwise key should be disabled/ignored for clients that have subject_type=public or if setting subject_pairwise forces the pairwise subject type, regardless of the client's subject_type.

This might actually be something we'd like to configure using an environment variable, which could be something along the lines of OIDC_SUPPORTED_SUBJECT_TYPES=pairwise, OIDC_SUPPORTED_SUBJECT_TYPES=pairwise,public, OIDC_SUPPORTED_SUBJECT_TYPES=public. That way we can enforce clients to use only the pairwise algorithm which increases data privacy, while still being able to support both public,pairwise as well as only public.
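The configuration idea could be sketched as follows. The variable name OIDC_SUPPORTED_SUBJECT_TYPES comes from the comment above; the default of `public`, the helper names, and the error handling are assumptions made for illustration.

```python
import os

VALID_SUBJECT_TYPES = {"public", "pairwise"}


def supported_subject_types(environ=None):
    """Parse OIDC_SUPPORTED_SUBJECT_TYPES into a list of subject types.

    Defaulting to "public" when the variable is unset is an assumption.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("OIDC_SUPPORTED_SUBJECT_TYPES", "public")
    types = [item.strip() for item in raw.split(",") if item.strip()]
    unknown = set(types) - VALID_SUBJECT_TYPES
    if unknown:
        raise ValueError(f"unsupported subject types: {sorted(unknown)}")
    return types


def validate_client_subject_type(requested, supported):
    """Reject a client registration whose subject_type the server does not allow."""
    if requested not in supported:
        raise ValueError(
            f"subject_type {requested!r} is not enabled; allowed: {supported}"
        )
    return requested
```

With `OIDC_SUPPORTED_SUBJECT_TYPES=pairwise`, a registration that asks for `subject_type=public` would be rejected, which is how the server could force every client onto the pairwise algorithm.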

What do you think?

@aeneasr
Member

aeneasr commented Jul 25, 2018

Note to self: This should work with id_token_hint as well - where id_token_hint will have a different subject than stored in the cookie, for example. This would imply that the subject_pairwise value must be set during authentication in order to properly hydrate the auth session. This would also imply that if subject_pairwise is set, authentication with id_token_hint will break if a different client initiates the authorize code flow. It will break because the algorithm computing the subject value is opaque to hydra. In turn, this would work where hydra generates the pairwise subject identifier.

@michalwojciechowski
Author

This is exactly what I was going to suggest, good job!

I fully agree, both ways should be available to us, especially since we might need full control over the creation of the "external subject". This is the option I would prefer to follow first, hoping that Hydra would neither validate nor enforce the given value (Hydra shouldn't know anything about the algorithm we are using, right?).

When it comes to supporting both types (public, pairwise) at the same time, I can't find a valid use case for now, but perhaps it may come in handy in the future? If you decide to support both types at the same time per client, we would need a flag to determine which subject_type is used during the authentication phase.

I don't mind becoming a beta tester of this feature. Thanks!

@damian0o

damian0o commented Jul 26, 2018

Both solutions sound good to me.
As your note stated, if the first option is used and Hydra knows nothing about the underlying algorithm, some functionality could break within Hydra. On the other hand, if the second option is followed, services that receive a sub sent by a client to fulfill some requirement (without an access token) could not match it to the resource owner. It would be great to have an option to get the "external subject" from Hydra in that case.

@aeneasr
Member

aeneasr commented Jul 26, 2018

It would be great to have an option to get the "external subject" from Hydra in that case.

That's a good point, we'll add this to the things to think about when implementing this.

When it comes to supporting both types (public, pairwise) at the same time, I can't find a valid use case for now, but perhaps it may come in handy in the future? If you decide to support both types at the same time per client, we would need a flag to determine which subject_type is used during the authentication phase.

If both types are supported by the server, then the proper algorithm is chosen by checking the subject_type set by the OAuth2 Client during registration.
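As a rough illustration of that selection, assuming a client record that carries `subject_type` and a precomputed `sector_id` (both field names are hypothetical), the server could dispatch on the registered type like this:

```python
import hashlib


def public_subject(subject, sector_id, salt):
    """Public subject type: every client sees the same identifier."""
    return subject


def pairwise_subject(subject, sector_id, salt):
    """Pairwise subject type: identifier differs per sector (illustrative hash)."""
    payload = sector_id.encode("utf-8") + subject.encode("utf-8") + salt
    return hashlib.sha256(payload).hexdigest()


# Maps the subject_type chosen at client registration to an algorithm.
ALGORITHMS = {"public": public_subject, "pairwise": pairwise_subject}


def subject_for_client(client, subject, salt):
    """Pick the algorithm from the client's registered subject_type."""
    algorithm = ALGORITHMS[client["subject_type"]]
    return algorithm(subject, client["sector_id"], salt)
```

No per-request flag is needed in this design, because the choice is fixed when the client registers.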

aeneasr pushed a commit that referenced this issue Aug 6, 2018
This patch introduces field `subject_type` to OAuth 2.0 Clients. See #950

Signed-off-by: arekkas <[email protected]>
@michalwojciechowski
Author

@arekkas I've just taken a sneak peek at the referenced commits, and it looks solid! I'm wondering which option you are going to implement first. Supporting the "externally provided" subject passed directly to Hydra?

@aeneasr
Member

aeneasr commented Aug 6, 2018

The idea is to have both. One of the ideas is to have a resolver that associates the pairwise ID with the "real ID". It's a bit complicated though as many use cases (e.g. id_token_hint) need to be supported.

@aeneasr
Member

aeneasr commented Aug 6, 2018

This is kinda tricky to implement. The idea would be the following:

  1. Subject IDs in the session cookie, the consent and login payloads are always the "raw" subject ID.
  2. Subject IDs in the ID Token, Access and Refresh Token are the pairwise subject ID.

This means that on consecutive login flows you will always receive the "raw"/"true" subject ID. So the "internal" view will be untouched while the "external" / public view would be different depending on the subject algorithm.

@michalwojciechowski another idea is to add an endpoint like GET /oauth2/consent/pairwise/<pairwise-id> (naming tbd) which is capable of resolving the pairwise ID to the true subject ID.
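Because a SHA-256 based pairwise ID cannot be reversed, such a resolver would need to remember the mapping at the time the ID is issued. The sketch below assumes an in-memory dictionary as storage; the class and method names are hypothetical, and a real endpoint would sit behind authentication and use a persistent store.

```python
import hashlib


class PairwiseResolver:
    """Remembers which true subject each issued pairwise ID belongs to."""

    def __init__(self, salt):
        self._salt = salt
        self._by_pairwise = {}  # pairwise ID -> true subject ID

    def issue(self, subject, sector_id):
        """Compute the pairwise ID and record it for later resolution."""
        payload = sector_id.encode("utf-8") + subject.encode("utf-8") + self._salt
        pairwise_id = hashlib.sha256(payload).hexdigest()
        self._by_pairwise[pairwise_id] = subject
        return pairwise_id

    def resolve(self, pairwise_id):
        """Return the true subject ID, or None if the pairwise ID is unknown."""
        return self._by_pairwise.get(pairwise_id)
```

A handler for something like `GET /oauth2/consent/pairwise/<pairwise-id>` could call `resolve` and answer with 404 when it returns `None`.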

Another idea is to return the true subject ID on token introspection.

Do you really need to be able to have full control over setting the subject ID? As far as I understood the thinking behind it was to be able to resolve the subject ID. This could be achieved with the ideas mentioned above.

@aeneasr
Member

aeneasr commented Aug 6, 2018

The sub field is only impacted by this algorithm when the ID Token or the /userinfo endpoint is used. The algorithm does not impact OAuth 2.0 Token Introspection, as that spec is separate from OIDC Core 1.0.

This in turn implies that only the ID Token and the userinfo payloads have to adhere to the pairwise algorithm. This in turn makes the process for the access/refresh tokens completely uninteresting.

I am assuming that you need to be able to resolve the sub field in the following scenario:

  1. 3rd party developer requests ID Token via OAuth2/OIDC
  2. 3rd party developer gets ID Token with pairwise sub
  3. 3rd party developer gets sub value through /userinfo or by parsing the ID Token
  4. 3rd party developer uses something like curl -X GET -H "Authorization: <access-token>" /some-api/<value-from-sub-field>
  5. You as the provider need to map that pairwise subject ID to the real subject ID

Am I assuming this correctly?

@aeneasr
Member

aeneasr commented Aug 6, 2018

One issue persists though. Assuming we are issuing a JSON Web Token as access token - the sub field will be visible to the client. In this case, the sub field of the access/refresh token has to be altered as well.

@aeneasr
Member

aeneasr commented Aug 6, 2018

I checked how other libraries solve this. node-oidc-provider has a different subject ID in the introspection response too. I think they are wrong though, the introspection is really to fetch metadata about a subject from the PROVIDER side, not the consumer (client). Thus it doesn't make sense to have the pairwise subject in the introspection response. This gets muddier with JWTs where the sub is actually transparent to the clients (that's why I don't like JWT for access tokens...) but I think it's legitimate to (at least for now) disable JWTs with pairwise config. It's a bit dirty but then again JWTs as access tokens are dirty and it removes development overhead which makes hitting the deadline more realistic.

@aeneasr
Member

aeneasr commented Aug 7, 2018

@michalwojciechowski @damian0o could you please comment on #950 (comment) ? Otherwise I'll move on with the implementation laid out in #950 (comment) #950 (comment)

aeneasr pushed a commit that referenced this issue Aug 7, 2018
This patch introduces field `subject_type` to OAuth 2.0 Clients. See #950

Signed-off-by: arekkas <[email protected]>
@damian0o

damian0o commented Aug 8, 2018

I believe an additional flow is the one where a 3rd party developer calls the token introspection endpoint and gets the full access token. The subjects in the access token and the ID token should match.

@aeneasr
Member

aeneasr commented Aug 8, 2018

I disagree. OAuth 2.0 Token Introspection has a clear definition of who's calling:

This specification defines a method for a protected resource * to query
an OAuth 2.0 authorization server to determine the active state of an
OAuth 2.0 token and to determine meta-information about this token.
OAuth 2.0 deployments can use this method to convey information about
the authorization context of the token from the authorization server
to the protected resource.

* protected resource = resource on the resource provider (your first-party API)

In OAuth 2.0 [RFC6749], the contents of tokens are opaque to clients.
This means that the client does not need to know anything about the
content or structure of the token itself, if there is any. However,
there is still a large amount of metadata that may be attached to a
token, such as its current validity, approved scopes, and information
about the context in which the token was issued. These pieces of
information are often vital to protected resources making
authorization decisions based on the tokens being presented. Since
OAuth 2.0 does not define a protocol for the resource server to learn
meta-information about a token that it has received from an
authorization server, several different approaches have been
developed to bridge this gap. These include using structured token
formats such as JWT [RFC7519] or proprietary inter-service
communication mechanisms (such as shared databases and protected
enterprise service buses) that convey token information.

This specification defines a protocol that allows authorized
protected resources to query the authorization server to determine
the set of metadata for a given token that was presented to them by
an OAuth 2.0 client.

Additionally, a protected
resource can use the mechanism described in this specification to
introspect the token in a particular authorization decision context
and ascertain the relevant metadata about the token to make this
authorization decision appropriately.

To me, it is very clear what the intention of OAuth 2.0 Token Introspection is. It is for your first-party resource provider (basically your API or API Gateway) to check if the access token is valid and if so, what metadata is associated with it.

For the resource server it makes no sense to have an obfuscated user id (sub). The obfuscated user id is only relevant for the outside view (namely 3rd party clients using your OIDC server to authenticate users).

@aeneasr
Member

aeneasr commented Aug 8, 2018

By the way, this is also the reason why there is no public OAuth 2.0 Token Introspection endpoint at Google, Microsoft, Amazon, Dropbox, Facebook, ... (you name it).

@damian0o

damian0o commented Aug 8, 2018

Well played, mentioning Google, Microsoft, and the others. I believe some of our clients can also turn into 3rd party resource providers, and in that case we would like to preserve the pairwise algorithm in access tokens as well.

@aeneasr
Member

aeneasr commented Aug 8, 2018

If a 3rd party wants to expose their APIs to other 3rd parties (or 4th parties? 😃) they should have their own authorization infrastructure in place:

  1. A user clicks on "sign in with verimi" in application "mybank"
  2. User authenticates at the idp
  3. Application "mybank" gets an access, refresh, and ID token
  4. "mybank" validates ID token and verifies the claims. Here is the obfuscated user id. "mybank" now sets up internal metadata in a session cookie or whatever. The user is authenticated.
  5. "mybank" uses the access token at verimi to perform e.g. updates to the user profile or whatever

So in the case that "mybank" now exposes resource providers (let's just call that "mybank API" for now):

  1. A user clicks on "sign in with mybank" in application "3rd-party-app-mybank"
  2. User is redirected to mybank
  3. User authenticates via "sign in with verimi" (this is step 1 from above)
  4. User performs the flow etc until step 5 from above
  5. User is now authenticated at mybank
  6. "mybank" sends "3rd-party-app-mybank" access, refresh, id tokens
  7. "3rd-party-app-mybank" uses access token to do whatever on "mybank API" - for example see the last 10 transactions

From my understanding, there is no need for introspection here anywhere. Authentication is done via OIDC. Authorization is done (at the first party) via introspection. Authorization for the third party / relying party / client is done in the client and completely independent and isolated from your authorization infrastructure.

@aeneasr
Member

aeneasr commented Aug 8, 2018

As discussed on the phone we'll add an additional field to the OAuth 2.0 Token Introspection obfuscated_sub (open for name suggestions) which contains the obfuscated user id. This way, your first-party server can obtain both cleartext and obfuscated IDs!
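To make the agreed shape concrete, here is a sketch of an introspection response that carries both identifiers. The field name `obfuscated_sub` is taken from the comment above and was still open for renaming; the layout of the stored token record, the hash construction, and the choice to emit the extra field only for pairwise clients are assumptions.

```python
import hashlib


def introspection_response(token, salt):
    """Build an RFC 7662 style response for a first-party resource server.

    `token` is a hypothetical stored token record with the keys
    active, client_id, scope, subject, subject_type, and sector_id.
    """
    if not token.get("active"):
        # RFC 7662: inactive tokens reveal nothing beyond their state.
        return {"active": False}

    response = {
        "active": True,
        "client_id": token["client_id"],
        "scope": token["scope"],
        "sub": token["subject"],  # cleartext ID for the resource server
    }
    if token["subject_type"] == "pairwise":
        payload = (
            token["sector_id"].encode("utf-8")
            + token["subject"].encode("utf-8")
            + salt
        )
        # The value the third-party client sees in the ID Token.
        response["obfuscated_sub"] = hashlib.sha256(payload).hexdigest()
    return response
```

The resource server can then match a request that carries the obfuscated ID against `obfuscated_sub` while still authorizing on the cleartext `sub`.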

aeneasr pushed a commit that referenced this issue Aug 10, 2018
This patch introduces field `subject_type` to OAuth 2.0 Clients. See #950

Signed-off-by: arekkas <[email protected]>
aeneasr pushed a commit that referenced this issue Aug 10, 2018
This patch introduces the OpenID Connect pairwise Subject Identifier Algorithm.

Closes #950

Signed-off-by: arekkas <[email protected]>
@damian0o

Just a side note on the access token subject field: how should this work with OAUTH2_ACCESS_TOKEN_STRATEGY=jwt?

@aeneasr
Member

aeneasr commented Aug 20, 2018

The JWT strategy does not support the pairwise algorithm at the moment. This is one of the problems with JSON Web Tokens as access tokens that I have been pointing out for a while (transparent access token metadata), and now it has a real effect. There is currently no way of making this work with the JWT strategy because the access token state is regarded as internal. Internal resources, however, must be able to resolve the user ID from the access token, which is not possible with a one-way hashing algorithm like the one we're using (SHA-256). So there is really no way to make this work.

tl;dr there is currently no way to make this work with JWTs natively in ORY Hydra. If you try to use both (JWT + Pairwise) then OAuth2 flows will fail.
