Do not allow sending messages after sending a close frame #74

najamelan · 2019-09-12T01:20:32Z

The application MUST NOT send any more data frames after sending a Close frame.

src: The websocket RFC - section 5.5.1 Close

Tungstenite did not comply here. This PR fixes this. There is an integration test provided that guarantees that it works.

Closes: snapview/tokio-tungstenite#67

Note that this builds upon #72, there are only 3 commits in this PR, the others are from #72.

WARNING: This is a BREAKING CHANGE

daniel-abramov · 2019-09-12T14:56:08Z

src/protocol/mod.rs

@@ -231,7 +231,9 @@ impl WebSocketContext {
        Stream: Read + Write,
    {
        // Do not write to already closed connections.
-        self.state.check_active()?;
+        if !self.state.is_active() {
+            return Err(Error::AlreadyClosed);


AlreadyClosed has a different semantics at the moment. The idea was that ConnectionClosed is returned only once when the connection gets closed, whereas AlreadyClosed is the error which you receive iff you try to push messages to the WebSocket after getting ConnectionClosed (i.e. an attempt to work with the socket in a terminated state). According to @agalakhov that might be useful to identify bugs, i.e. if you get AlreadyClosed error in your code, you can treat it as a bug in your application (it tries to push frames to the already closed connection, even after being informed about that).

It might make sense to introduce a new error, something like SendingWhileClosing(Message). In this case we clearly inform user that it's prohibited to send new messages after initiating connection closure and also return a Message back.

I think the implementation at this place should looks something like this:

self.state.check_active()?; if !self.state.can_write() { return Err(Error::SendingWhileClosing(message)) }

what do you think?

AlreadyClosed has a different semantics at the moment. The idea was that ConnectionClosed is returned only once when the connection gets closed, whereas AlreadyClosed is the error which you receive iff you try to push messages to the WebSocket after getting ConnectionClosed (i.e. an attempt to work with the socket in a terminated state)

Tbh as a user of the library, this isn't clear at all. None of the methods on WebSocket have a comprehensive documentation about what errors might be returned, what they mean and what you should do about them. Even more so when using through tokio-tungstenite which adds another layer of indirection.

The docs for tungstenite::error say about AlreadyClosed:

Trying to work with already closed connection

Trying to write after receiving Message::Close or trying to read after receiving Error::ConnectionClosed causes this.

I would say that sending a close and then trying to send data is pretty much "Trying to work with already closed connection" given that the sender part of the connection is clearly considered closed according to the RFC. If it seems relevant that the user can recover their message in such case, maybe another error is needed in general for all failing writes, to give the message back. I personally have not yet had use for that.

It's possible to introduce more specific error types, but it might just make it more confusing. If we stick to AlreadyClosed, we could rephrase the second line of the documentation to include this scenario. I think it's also a bit confusing, because it says "Trying to write after receiving Message::Close".

Since tungstenite will automatically acknowledge the close from the remote, this is really also trying to write after sending a close, just the user didn't send it from their code. Technically it would be fine to send after receiving but before acknowledging the remote close, but tungstenite API does not allow that (which is fine I think).

"trying to read after receiving Error::ConnectionClosed": do you receive this error in the reader part or when trying to write? That's also a bit confusing. From doing a search in the source code, it turns out that:

write_message can return this, but only if the role is server

read_message can return this, but only if WebSocketState::ClosedByPeer | WebSocketState::CloseAcknowledged

send_one_frame will also return this if IoErrorKind::ConnectionReset happens.

that last one is called from write_pending, which get's called on both read and write. So it's pretty daunting for a user to figure out when what might happen actually, that's why it would be nice if it was documented. In my experience, if things get this complex, it's pretty daunting as the author of code to keep track what exactly happens when, so documenting+testing is as much for your own sanity as for that of the users.

In terms of the usefulness of ConnectionClosed vs AlreadyClosed, I have a hard time picturing all use cases. I use it through tokio-tungstenite and I can say that I just map both to returning None in a stream, and into std::io::NotConnected on the sink...

It questions the error handling in tungstenite as a whole. I imagine the idea behind ConnectionClosed is that receiving an error in a sender task is a good way to find out that a connection is closed. On the other hand, there are afaict 3 scenarios for closing:

I call close (don't need an error to know that I closed it)

remote closed (I receive CloseFrame on the reader end)

fatal error (underlying network + protocol?)

So the main situation here is 2, because 3 probably returns the fatal error instead of ConnectionClosed and 1. is obvious, just don't send after closing it yourself. In 2 it is indeed annoying that the "information" about the connection being closed is in the reader part, which might be a separate async task. It means you somehow have to tell your writer task to stop, requiring some sort of synchronization.

The thing is, it really depends on the app design. If my business logic is more tied to the receiving end, maybe my writer task just forwards all incoming messages from a channel to the websocket, but then when I receive a close frame, I just close the channel which will stop the loop of my writer and it will close.

But what if the channel for the writer is mpsc. Maybe several components of the software might be sending messages out. In that case it might be unpractical to tell them all to drop the channel, and maybe detecting ConnectionClosed on send might be more convenient. Note that in this case though the extra AlreadyClosed doesn't seem to add much value either.

In some way, the two cases seem to be:

WriteClosed: you initiated the close handshake (by calling close or manually sending a CloseFrame) or we received the remote initiated the close handshake and tungstenite already acknowledged it

ReadClosed: the remote send a CloseFrame, and thus no other messages will ever come in from them.

So with all that, I don't have a definite solution to propose. Hopefully the feedback helps the thought process on your end.

Just as a sidenote, another way to detect events might be out of band. I tend to find it a nicer API. Let the object be observable and send an event on (fatal) errors and connection close, so every component in a software can listen for those events if they care and act on them. It might not be a good fit for tungstenite because it is quite low level, but it might be nice for tokio-tungstenite. I wrote the pharos crate for this. It's light on features for now but I will add things like filtering events with a predicate if you only care about some of them.

Looking at it again, I feel the correct error type to return here might be Protocol? Given that ConnectionClosed and AlreadyClosed don't return the message to the user either.

What do you think?

Yes, that looks good for me.

I updated the PR with this proposition

If remote initiated close and we have not yet received a frame from remote...

I meant if we have received it, but our response is still sitting in the send queue. It is sound as long as all user facing API's consider it as if we had already sent it out, since any new user messages will end up behind it in the queue. However all internal code must be careful to not consider the close handshake finished, since it isn't really and we should never set state to terminated and return ConnectionClosed before it's actually sent out. That works correctly now, but people modifying the code in the future should be aware of this, and it's not really documented anywhere.

As far as the Protocol error now appearing in write_message, it might break client code because they might be sending messages in a loop and matching for SendQueueFull and ConnectionClosed, however now they might get Protocol. So it's a breaking change and I would mention this in release notes.

The test is failing. I'm working on it, don't merge.

Ok, tests pass.

There is a scenario to be aware of:

Remote initiate close handshake

we receive it and cue a response

we try to send -> protocol error

client code here must still do calls to write_pending until the close response is out. If the connection get's dropped right after the protocol error, the other side will receive a Protocol( connection reset ) error.

There is a scenario to be aware of:

The step (3) in such client code seems to be a bug on the client side, right? (the client tries to send a message after receiving a close message).

The thing is you might have sender and reader tasks running concurrently, so unless you have a channel between both in order to immediately intercept the writer, the error messages returned on the writer are meaningful for detecting connection closes and such.

I haven't looked at the code using tungstenite in the wild but I would assume people use the errors for that information to avoid the extra synchronization.

src/protocol/mod.rs

…l error. The websocket RFC explicitly states this is not allowed.

daniel-abramov

Thanks.

FYI @agalakhov : this concludes important changes done by @najamelan in the last couple PRs. I wrote a short summary for you:

Previously the user could initiate a close (by sending a close frame to the remote) and then tungstenite-rs did not prohibit this user to still call write_message() and send further messages (bug, according to RFC it must be prohibited). Now, an attempt to send after close will result in protocol error.
The following scenario: a peer sends a close message to the remote, the remote sends some message(s) to the peer at exactly the same time (i.e. before it gets the close message from the peer). Result: this incoming message is completely lost on peer's side. This is not a protocol violation and the RFC is not strict about that. Our previous implementation just ignored such frames (although these are valid), the current implementation will still return such frame to the peer, so the peer gets that message from the server (more logical behavior). Note: not to confuse with another scenario of sending messages after close: if the remote sends messages after acknowledging close, we of course get an error on our side that the remote violates the protocol (previously such things were simply ignored).

najamelan mentioned this pull request Sep 12, 2019

(tokio-)tungstenite allows sending data after the connection is closed snapview/tokio-tungstenite#67

Closed

najamelan force-pushed the bugfix/no_send_after_close branch from 87f825b to fc48942 Compare September 12, 2019 14:38

daniel-abramov requested changes Sep 12, 2019

View reviewed changes

najamelan force-pushed the bugfix/no_send_after_close branch 3 times, most recently from 97ec131 to e3f77f4 Compare September 12, 2019 20:21

najamelan mentioned this pull request Sep 13, 2019

Don't swallow ping/pong while waiting for close acknowledgement #75

Merged

najamelan force-pushed the bugfix/no_send_after_close branch from e3f77f4 to 2558579 Compare September 13, 2019 23:22

This was referenced Sep 14, 2019

Review closing the connection #77

Closed

WIP: Add some documentation to WebSocket and Error. #79

Merged

najamelan force-pushed the bugfix/no_send_after_close branch from 2558579 to 0a667e6 Compare September 19, 2019 18:04

najamelan added 3 commits September 19, 2019 20:32

integration test for no_send_after_close

2004309

No longer accept to send after close

2918eec

Consider remote sending a frame after sending a close frame a protoco…

1ee3f34

…l error. The websocket RFC explicitly states this is not allowed.

najamelan force-pushed the bugfix/no_send_after_close branch from 0a667e6 to 1ee3f34 Compare September 19, 2019 18:32

najamelan requested a review from daniel-abramov September 19, 2019 18:38

daniel-abramov approved these changes Sep 20, 2019

View reviewed changes

daniel-abramov merged commit 0da592a into snapview:master Sep 20, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Do not allow sending messages after sending a close frame #74

Do not allow sending messages after sending a close frame #74

najamelan commented Sep 12, 2019 •

edited

Loading

daniel-abramov Sep 12, 2019

daniel-abramov Sep 12, 2019

daniel-abramov Sep 12, 2019

najamelan Sep 12, 2019 •

edited

Loading

najamelan Sep 13, 2019 •

edited

Loading

najamelan Sep 19, 2019 •

edited

Loading

najamelan Sep 19, 2019

najamelan Sep 19, 2019

daniel-abramov Sep 20, 2019

najamelan Sep 20, 2019

daniel-abramov left a comment

Do not allow sending messages after sending a close frame #74

Do not allow sending messages after sending a close frame #74

Conversation

najamelan commented Sep 12, 2019 • edited Loading

daniel-abramov Sep 12, 2019

Choose a reason for hiding this comment

daniel-abramov Sep 12, 2019

Choose a reason for hiding this comment

daniel-abramov Sep 12, 2019

Choose a reason for hiding this comment

najamelan Sep 12, 2019 • edited Loading

Choose a reason for hiding this comment

najamelan Sep 13, 2019 • edited Loading

Choose a reason for hiding this comment

najamelan Sep 19, 2019 • edited Loading

Choose a reason for hiding this comment

najamelan Sep 19, 2019

Choose a reason for hiding this comment

najamelan Sep 19, 2019

Choose a reason for hiding this comment

daniel-abramov Sep 20, 2019

Choose a reason for hiding this comment

najamelan Sep 20, 2019

Choose a reason for hiding this comment

daniel-abramov left a comment

Choose a reason for hiding this comment

najamelan commented Sep 12, 2019 •

edited

Loading

najamelan Sep 12, 2019 •

edited

Loading

najamelan Sep 13, 2019 •

edited

Loading

najamelan Sep 19, 2019 •

edited

Loading