Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not allow sending messages after sending a close frame #74

Merged
merged 3 commits into from
Sep 20, 2019

Conversation

najamelan
Copy link
Contributor

@najamelan najamelan commented Sep 12, 2019

The application MUST NOT send any more data frames after sending a Close frame.

src: The websocket RFC - section 5.5.1 Close

Tungstenite did not comply here. This PR fixes this. There is an integration test provided that guarantees that it works.

Closes: snapview/tokio-tungstenite#67

Note that this builds upon #72, there are only 3 commits in this PR, the others are from #72.

WARNING: This is a BREAKING CHANGE

@@ -231,7 +231,9 @@ impl WebSocketContext {
Stream: Read + Write,
{
// Do not write to already closed connections.
self.state.check_active()?;
if !self.state.is_active() {
return Err(Error::AlreadyClosed);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AlreadyClosed has a different semantics at the moment. The idea was that ConnectionClosed is returned only once when the connection gets closed, whereas AlreadyClosed is the error which you receive iff you try to push messages to the WebSocket after getting ConnectionClosed (i.e. an attempt to work with the socket in a terminated state). According to @agalakhov that might be useful to identify bugs, i.e. if you get AlreadyClosed error in your code, you can treat it as a bug in your application (it tries to push frames to the already closed connection, even after being informed about that).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might make sense to introduce a new error, something like SendingWhileClosing(Message). In this case we clearly inform user that it's prohibited to send new messages after initiating connection closure and also return a Message back.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the implementation at this place should looks something like this:

self.state.check_active()?;
if !self.state.can_write() {
    return Err(Error::SendingWhileClosing(message))
}

what do you think?

Copy link
Contributor Author

@najamelan najamelan Sep 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AlreadyClosed has a different semantics at the moment. The idea was that ConnectionClosed is returned only once when the connection gets closed, whereas AlreadyClosed is the error which you receive iff you try to push messages to the WebSocket after getting ConnectionClosed (i.e. an attempt to work with the socket in a terminated state)

Tbh as a user of the library, this isn't clear at all. None of the methods on WebSocket have a comprehensive documentation about what errors might be returned, what they mean and what you should do about them. Even more so when using through tokio-tungstenite which adds another layer of indirection.

The docs for tungstenite::error say about AlreadyClosed:

Trying to work with already closed connection

Trying to write after receiving Message::Close or trying to read after receiving Error::ConnectionClosed causes this.

I would say that sending a close and then trying to send data is pretty much "Trying to work with already closed connection" given that the sender part of the connection is clearly considered closed according to the RFC. If it seems relevant that the user can recover their message in such case, maybe another error is needed in general for all failing writes, to give the message back. I personally have not yet had use for that.

It's possible to introduce more specific error types, but it might just make it more confusing. If we stick to AlreadyClosed, we could rephrase the second line of the documentation to include this scenario. I think it's also a bit confusing, because it says "Trying to write after receiving Message::Close".

Since tungstenite will automatically acknowledge the close from the remote, this is really also trying to write after sending a close, just the user didn't send it from their code. Technically it would be fine to send after receiving but before acknowledging the remote close, but tungstenite API does not allow that (which is fine I think).

"trying to read after receiving Error::ConnectionClosed": do you receive this error in the reader part or when trying to write? That's also a bit confusing. From doing a search in the source code, it turns out that:

  • write_message can return this, but only if the role is server
  • read_message can return this, but only if WebSocketState::ClosedByPeer | WebSocketState::CloseAcknowledged
  • send_one_frame will also return this if IoErrorKind::ConnectionReset happens.
  • that last one is called from write_pending, which get's called on both read and write. So it's pretty daunting for a user to figure out when what might happen actually, that's why it would be nice if it was documented. In my experience, if things get this complex, it's pretty daunting as the author of code to keep track what exactly happens when, so documenting+testing is as much for your own sanity as for that of the users.

In terms of the usefulness of ConnectionClosed vs AlreadyClosed, I have a hard time picturing all use cases. I use it through tokio-tungstenite and I can say that I just map both to returning None in a stream, and into std::io::NotConnected on the sink...

It questions the error handling in tungstenite as a whole. I imagine the idea behind ConnectionClosed is that receiving an error in a sender task is a good way to find out that a connection is closed. On the other hand, there are afaict 3 scenarios for closing:

  1. I call close (don't need an error to know that I closed it)
  2. remote closed (I receive CloseFrame on the reader end)
  3. fatal error (underlying network + protocol?)

So the main situation here is 2, because 3 probably returns the fatal error instead of ConnectionClosed and 1. is obvious, just don't send after closing it yourself. In 2 it is indeed annoying that the "information" about the connection being closed is in the reader part, which might be a separate async task. It means you somehow have to tell your writer task to stop, requiring some sort of synchronization.

The thing is, it really depends on the app design. If my business logic is more tied to the receiving end, maybe my writer task just forwards all incoming messages from a channel to the websocket, but then when I receive a close frame, I just close the channel which will stop the loop of my writer and it will close.

But what if the channel for the writer is mpsc. Maybe several components of the software might be sending messages out. In that case it might be unpractical to tell them all to drop the channel, and maybe detecting ConnectionClosed on send might be more convenient. Note that in this case though the extra AlreadyClosed doesn't seem to add much value either.

In some way, the two cases seem to be:

  • WriteClosed: you initiated the close handshake (by calling close or manually sending a CloseFrame) or we received the remote initiated the close handshake and tungstenite already acknowledged it
  • ReadClosed: the remote send a CloseFrame, and thus no other messages will ever come in from them.

So with all that, I don't have a definite solution to propose. Hopefully the feedback helps the thought process on your end.

Just as a sidenote, another way to detect events might be out of band. I tend to find it a nicer API. Let the object be observable and send an event on (fatal) errors and connection close, so every component in a software can listen for those events if they care and act on them. It might not be a good fit for tungstenite because it is quite low level, but it might be nice for tokio-tungstenite. I wrote the pharos crate for this. It's light on features for now but I will add things like filtering events with a predicate if you only care about some of them.

Copy link
Contributor Author

@najamelan najamelan Sep 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at it again, I feel the correct error type to return here might be Protocol? Given that ConnectionClosed and AlreadyClosed don't return the message to the user either.

What do you think?

Copy link
Contributor Author

@najamelan najamelan Sep 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that looks good for me.

I updated the PR with this proposition

If remote initiated close and we have not yet received a frame from remote...

I meant if we have received it, but our response is still sitting in the send queue. It is sound as long as all user facing API's consider it as if we had already sent it out, since any new user messages will end up behind it in the queue. However all internal code must be careful to not consider the close handshake finished, since it isn't really and we should never set state to terminated and return ConnectionClosed before it's actually sent out. That works correctly now, but people modifying the code in the future should be aware of this, and it's not really documented anywhere.

As far as the Protocol error now appearing in write_message, it might break client code because they might be sending messages in a loop and matching for SendQueueFull and ConnectionClosed, however now they might get Protocol. So it's a breaking change and I would mention this in release notes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test is failing. I'm working on it, don't merge.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, tests pass.

There is a scenario to be aware of:

  1. Remote initiate close handshake
  2. we receive it and cue a response
  3. we try to send -> protocol error
  4. client code here must still do calls to write_pending until the close response is out. If the connection get's dropped right after the protocol error, the other side will receive a Protocol( connection reset ) error.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a scenario to be aware of:

The step (3) in such client code seems to be a bug on the client side, right? (the client tries to send a message after receiving a close message).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thing is you might have sender and reader tasks running concurrently, so unless you have a channel between both in order to immediately intercept the writer, the error messages returned on the writer are meaningful for detecting connection closes and such.

I haven't looked at the code using tungstenite in the wild but I would assume people use the errors for that information to avoid the extra synchronization.

@najamelan najamelan force-pushed the bugfix/no_send_after_close branch from 0a667e6 to 1ee3f34 Compare September 19, 2019 18:32
Copy link
Member

@daniel-abramov daniel-abramov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

FYI @agalakhov : this concludes important changes done by @najamelan in the last couple PRs. I wrote a short summary for you:

  1. Previously the user could initiate a close (by sending a close frame to the remote) and then tungstenite-rs did not prohibit this user to still call write_message() and send further messages (bug, according to RFC it must be prohibited). Now, an attempt to send after close will result in protocol error.
  2. The following scenario: a peer sends a close message to the remote, the remote sends some message(s) to the peer at exactly the same time (i.e. before it gets the close message from the peer). Result: this incoming message is completely lost on peer's side. This is not a protocol violation and the RFC is not strict about that. Our previous implementation just ignored such frames (although these are valid), the current implementation will still return such frame to the peer, so the peer gets that message from the server (more logical behavior). Note: not to confuse with another scenario of sending messages after close: if the remote sends messages after acknowledging close, we of course get an error on our side that the remote violates the protocol (previously such things were simply ignored).

@daniel-abramov daniel-abramov merged commit 0da592a into snapview:master Sep 20, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants