Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: allow custom gql-on-ws protocol for subscriptions #1348

Merged
merged 1 commit into from
Jul 12, 2021

Conversation

asmeikal
Copy link
Contributor

@asmeikal asmeikal commented Jan 26, 2021

This could close #1341 for the moment, and provide a way to use graphql-ws instead of subscriptions-transport-ws without breaking existing NestJS services that already use subscriptions.

PR Checklist

Please check if your PR fulfills the following requirements:

PR Type

What kind of change does this PR introduce?

[ ] Bugfix
[x] Feature
[ ] Code style update (formatting, local variables)
[ ] Refactoring (no functional changes, no api changes)
[ ] Build related changes
[ ] CI related changes
[ ] Other... Please describe:

What is the current behavior?

Issue Number: #1341

As described in the issue, the GraphQL module currently relies on subscriptions-transport-ws, which has several issues and is being deprecated in favor of graphql-ws. Apollo will change the gql-on-ws protocol in the future, but I don't believe subscriptions-transport-ws to be production ready, and cannot wait for Apollo to switch protocols nor renounce to NestJS.

What is the new behavior?

It's now possible to specify a protocol when passing subscriptions options to the GraphQL module.
The new protocol is graphql-ws, and has a few differences in the signature of the callback that the user can provide.
The signatures match the ones documented in graphql-ws.

Does this PR introduce a breaking change?

[ ] Yes
[x] No

Other information

The GraphqlWsSubscriptionService is taken from the example implementation provided by graphql-ws. The purpose of this class is to connect the WebSocket server, the graphql-ws server, and Apollo. The main difference with the example implementation is in how errors are handled.

The GraphqlWsException class can be used to close the WebSocket connection with a specific code and reason, as suggested in the graphql-ws documentation to implement custom authentication protocols (see here). The onConnect callback can use this exception to signal errors to the client, and the client may behave differently based on the error, e.g., it may refresh the authentication token if it is expired. The tests cover this part.

The graphql-ws server is highly customizable via callbacks, that are called by the protocol in different phases of the subscription. Most of these are quite advanced and "low level", and allow manipulation of the ExecutionArgs or of the protocol message, so I only included the onConnect, onDisconnect, and onClose ones in the GraphQL module subscriptions configuration.

A catch: while the new library is called graphql-ws, its WebSocket sub-protocol is called graphql-transport-ws, and the WebSocket sub-protocol of subscriptions-transport-ws is called graphql-ws! 😞 Maybe the protocol option passed to the subscriptions configuration should be named after the actual WebSocket sub-protocol, and not after the library, to avoid confusion.

@Be-P
Copy link

Be-P commented Jan 27, 2021

I was reading the PR and it looks good to me, It would be great to be able to switch to the graphql-ws library implementation.

The subscriptions configuration now accepts a custom `protocol`,
to change the gql-on-ws protocol from the default `subscriptions-transport-ws.
@asmeikal
Copy link
Contributor Author

Hi @kamilmysliwiec, I've finally managed to do some thorough testing of this PR with the subscription services I'm working on, and they seem to work without problems (after upgrading the clients). I have tested this with an Electron application using React, and with two React Native applications. I was able to migrate the authentication (performed in the onConnect callback) with little effort, and everything else keeps working without changes. I've rebased the PR with this repo's master branch. I'm available to do some additional changes if necessary.

@aurimukas
Copy link

Hey !
Any news for merging PR to default repo ??

@jmcdo29
Copy link
Member

jmcdo29 commented May 16, 2021

Apologies for the delay. Currently we're focused on getting v8 out the door. Once it's published we will work on merging in PRs again. Kamil has been very busy with getting this going.

@aurimukas
Copy link

Apologies for the delay. Currently we're focused on getting v8 out the door. Once it's published we will work on merging in PRs again. Kamil has been very busy with getting this going.

Thanks for the answer.

@kamilmysliwiec kamilmysliwiec changed the base branch from master to 9.0.0 July 12, 2021 13:14
@kamilmysliwiec kamilmysliwiec merged commit 492a1e8 into nestjs:9.0.0 Jul 12, 2021
@kamilmysliwiec
Copy link
Member

LGTM!

@ddehghan
Copy link

ddehghan commented Jul 13, 2021

Hi, Given that Apollo 3.0 has removed subscriptions-transport-ws completely, we are stuck. We would like to upgrade to graphql-ws but cant wait until nestjs 9.0 to be release. When do you expect that to go live? probably a year from now?

any chance you can backport this to 8.0?

@kamilmysliwiec
Copy link
Member

I'll release v9 (exclusively for @nestjs/graphql) sooner to not keep everyone locked with v2.

@asmeikal
Copy link
Contributor Author

Hi @kamilmysliwiec! Thanks for merging this!

graphql-ws has received a major upgrade recently (see enisdenjo/graphql-ws#117) that makes the version used in this PR not compatible with the latest version of the client. It should be enough to upgrade the dependency.

When we migrated from subscriptions-transport-ws to graphql-ws we ended up with something slightly different from this PR: we had to support both protocols, and ended up implementing something like this https://github.com/enisdenjo/graphql-ws#ws-backwards-compat. Maybe it's possible to do something similar here too? What is your roadmap for v9?

@kamilmysliwiec
Copy link
Member

kamilmysliwiec commented Jul 13, 2021

@asmeikal this sounds great! Would you like to create a PR for this (both dep upgrade + backward compatibility)?

I'm hoping to get v9 ready asap to unblock people that want to use Apollo v3. I'll release it separately from other packages :)

UPDATE: I've done some groundwork for the backward compatibility already (9.0.0 branch). I had to remove a dedicated error class added in this PR @asmeikel, let me know if you think there's a better way & approach! I gotta add more tests too

@asmeikal
Copy link
Contributor Author

Hi @kamilmysliwiec, graphql-ws suggests to rely on websocket error codes to handle application level errors. The 1011 error code currently used on the 9.0.0 branch is a fatal error code (see here for a complete list), and the client would stop reconnecting when receiving it. The purpose of the custom exception was to allow custom error handling on the application level, e.g., to signal that the authorization token is expired and must be refreshed, without touching the socket directly.

Do you still need help with integrating the new graphql-ws version? If not, I can start testing it with our app!

@kamilmysliwiec
Copy link
Member

The purpose of the custom exception was to allow custom error handling on the application level, e.g., to signal that the authorization token is expired and must be refreshed, without touching the socket directly

I removed the error handling layer because we no longer maintain the ws server internally but instead, we reuse the existing one (with useServer) and so there are no error and onMessage listeners (with try..catch blocks as in this PR). I figured that setting the error code manually (for example, context.extra.socket.close(4000, 'Missing authorization');) is sufficient but let me know if you think we should automatically wrap hooks like onConnect and hardcode error handling logic in there.

If not, I can start testing it with our app!

If you can give it a shot, that would be awesome!

@ddehghan
Copy link

OK we tested the 9.0.0-Next.2 package. it works. :-)

The main issue we have is the path is hardcoded to '/graphql' . we need that to be configurable so that we can map it to something different so that on our load balancer we can set different timeout for each path.

Haven't tested the exceptions yet. But will give that a try today

@kamilmysliwiec
Copy link
Member

kamilmysliwiec commented Jul 22, 2021

Thanks for the feedback @ddehghan! Would you like to create a PR to address this?

@ddehghan
Copy link

I did some more investigation:

Actually it seems that '/graphql' path has no effect only the domain portion is actually used.. The Client can request any url on the port and it gets connected. so there must be a bug somewhere else.

Because any url works we are able to set a different timeout of that load balancer so it works for us at the moment. web socket does support path. But I think the way they do it is they give you a call back on connection with the url and you are supposed to handle that. I don't think graphql-ws has implemented that so passing the path is a no-op.

@asmeikal
Copy link
Contributor Author

The purpose of the custom exception was to allow custom error handling on the application level, e.g., to signal that the authorization token is expired and must be refreshed, without touching the socket directly

I removed the error handling layer because we no longer maintain the ws server internally but instead, we reuse the existing one (with useServer) and so there are no error and onMessage listeners (with try..catch blocks as in this PR). I figured that setting the error code manually (for example, context.extra.socket.close(4000, 'Missing authorization');) is sufficient but let me know if you think we should automatically wrap hooks like onConnect and hardcode error handling logic in there.

If not, I can start testing it with our app!

If you can give it a shot, that would be awesome!

Hi there, got around to testing the 9.0.0 branch!

About the ws server, the useServer provided by graphql-ws is intended to be an example implementation (see here) that can and should be updated based on the requirements of the integration. The initial implementation in this PR did change the example implementation to only allow ws connections on /graphql, as per @ddehghan first comment. I believe that the changes in this commit 5641df0 caused ws connections to any path to be passed to the ws server.

About the compatibility with subscription-transport-ws, I think some parts are missing: here the subscription server should receive some onConnect/onDisconnect callbacks (to handle e.g. authentication), but they are missing. The big problem here is that they are typed really differently from the graphql-ws ones. Since this will be included in a major release, maybe the right approach would be to provide two keys in GqlModuleOptions: one for graphql-ws configuration and one for subscription-transport-ws configuration, and start only the servers for which the configuration was provided (neither, both, or just one of them). I can try and submit a PR for this change, if the interface I proposed is ok for you @kamilmysliwiec!

Last thing: about the error messages, I have strong opinions against using the socket directly. :) I believe that a custom WebSocket exception with a code for the WebSocket CloseEvent is the right choice, for a few reasons:

  • the custom exception could be handled in different ways by graphql-ws and subscription-transport-ws in a transparent way for the developer that needs to support both protocols: graphql-ws would send the CloseEvent code to the client (which is what graphql-ws suggests), and subscription-transport-ws would send the exception message or some other string error code to the client
  • if the developer has to use the socket directly, retrocompatibility with subscription-transport-ws is more difficult: the auth logic inside the onConnect would need to be heavily wrapped to support both protocols
  • it's more similar to other exceptions that Nest gives to the developer, like the HttpException that contains a status code

@kamilmysliwiec
Copy link
Member

I believe that the changes in this commit 5641df0 caused ws connections to any path to be passed to the ws server.

Great catch! Would you like to include a fix in your PR or do you want me to do so?

About the compatibility with subscription-transport-ws, I think some parts are missing: here the subscription server should receive some onConnect/onDisconnect callbacks (to handle e.g. authentication), but they are missing

🕐 todo

I can try and submit a PR for this change, if the interface I proposed is ok for you @kamilmysliwiec!

Sounds good!

Last thing: about the error messages, I have strong opinions against using the socket directly. :)

Totally agree with you. I'm against using the socket directly as well, I'm just afraid of putting too much magic/hardcoding custom logic & auto-wrapping callbacks in this package. As implementing such an error layer doesn't really require changes in the GraphQLModule and could be implemented on the project's side (by library consumers), I figured not having this available out-of-the-box shouldn't be a blocker.

Would you like to create 2 separate PRs to the v9 branch in which the latter one would contain errors layer-specific changes (so we can look at both separately)? 🙌

@asmeikal
Copy link
Contributor Author

Hi @kamilmysliwiec, I can try and put together two PRs for these changes! I hope to get at least one done by tomorrow!

@kamilmysliwiec
Copy link
Member

Amazing, thank you @asmeikal!

@simplenotezy
Copy link

simplenotezy commented Mar 31, 2022

Not sure if this is the right place to ask, but we just migrated from Nest 7 -> 8 and the subscriptions on the mobile apps are no longer working - possibly due to updating to @nestjs/graphql v10 and the deprecation of subscriptions-transport-ws.

Currently our subscriptions are no longer working. I've tried following https://docs.nestjs.com/graphql/subscriptions#enable-subscriptions-with-apollo-driver and added:

      subscriptions: {
        'subscriptions-transport-ws': true,
      },

However, subscriptions will still fail. I might be missing something.

I've also found https://www.apollographql.com/docs/apollo-server/migration/#reenabling-subscriptions but I am not sure if this is relevant for Nest.js projects that use the @nestjs/graphql package, given the backwards compatibility you made with this PR @asmeikal

Any pointers in the right direction would be greatly appreciated!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[discussion]: Custom GraphQL over WebSocket protocol when enabling subscriptions
7 participants