[QUIC] Design of Shutdown for QuicStreams in System.Net.Quic #756

jkotalik · 2019-12-11T01:14:55Z

Problem

So today, we have an API for Quic which looks like the following:

    public sealed class QuicStream : System.IO.Stream
    {
        internal QuicStream() { }
        public override bool CanSeek => throw null;
        public override long Length => throw null;
        public override long Seek(long offset, System.IO.SeekOrigin origin) => throw null;
        public override void SetLength(long value) => throw null;
        public override long Position { get => throw null; set => throw null; }
        public override bool CanRead => throw null;
        public override bool CanWrite => throw null;
        public override void Flush() => throw null;
        public override int Read(byte[] buffer, int offset, int count) => throw null;
        public override void Write(byte[] buffer, int offset, int count) => throw null;
        public long StreamId => throw null;
        public void AbortRead() => throw null;
        public ValueTask ShutdownWriteAsync(System.Threading.CancellationToken cancellationToken = default) => throw null;
    }

Today, QuicStream derives from Stream, which is really nice for interop purposes with other things, allowing people to just read and write to and from the stream without thinking what the underlying implementation is.

When calling Dispose on the stream, that currently triggers an abortive shutdown sequence on the stream rather than graceful. To trigger a graceful shutdown, one needs to call ShutdownWriteAsync, which sends a FIN to the peer and waits for an ACK. However, people that would normally be using a stream wouldn't have access to ShutdownWriteAsync, hence all actions would be abortive by default.

Current thoughts

Ideally for the workflow we'd like to enable is something like:

using (Stream stream = connection.AcceptAsync()) // this is actually a quic stream
{
    Memory<byte> memory = new Memory<byte>(new byte[10]);
    int res = await stream.ReadAsync(memory);
    await stream.WriteAsync(memory);
}

The big question here is how do we make it so we can send a shutdown for a final write with a stream API outside of Dispose?

The first pivot would be whether we'd like QuicStream to derive from stream at all. If we'd like it to derive from System.IO.Stream, I think it would have to make Dispose/DisposeAsync do a graceful shutdown by default in that case, and expose Abortive versions of these functions on the QuicStream itself. That way, in the default case, using the stream APIs "just works". However, doing a graceful shutdown of a stream in Dispose has downsides. We would be doing a network call in Dispose, which would block. DisposeAsync also can't be canceled as well.

So let's talk about a world where QuicStream didn't derive from System.IO.Stream and we had a few liberties with the API. Potentially, we could change a few things:

Make WriteAsync take a flag which says if this is the final write.
Allow Dispose to be abortive.

Abortive Dispose wouldn't block at all, which is nice. However, we do need to be careful about how to supply an error code when aborting a stream. The error code that is used for abort is supposed to be "application specific and provided", meaning we'd need a default error code.

cc @dotnet/http3 and @nibanks I haven't really reached a conclusion based on this yet. I'd be interested in your thoughts.

scalablecory · 2019-12-11T02:09:01Z

Q: an abortive close also needs I/O, doesn't it? Is it managed behind the scenes, or does it also need to be async?

If we implement Stream -- and we probably should, because so many things can use it -- then Dispose and DisposeAsync should be graceful to match the expected behavior of Stream. We just need to live with that extra I/O, I think :/.

Lets look at NetworkStream.Dispose:

It performs a graceful shutdown, so your sample code works great.
It doesn't block, because of OS magic, unless you set linger options.
If you're calling it during an exception-triggered unwind and your protocol doesn't have ways to distinguish a graceful shutdown from a busted protocol, you're going to have a bad time.
- You need to set ownsSocket=false and manage shutdown yourself to get around this.

Spitballing... if we really want to keep QuicStream deriving from Stream, what we could do is enable an 'advanced' behavior like so:

using QuicStream stream = CreateUnidirectionalStream(abortive = true);

// if exception happens, we don't ever Shutdown() and so Dispose will be abortive.
DoSomethingThatCanThrow();

stream.Shutdown(SD_BOTH); // graceful shutdown, now Dispose is locked in to no-op.

jkotalik · 2019-12-11T02:23:14Z

an abortive close also needs I/O, doesn't it? Is it managed behind the scenes, or does it also need to be async?

It does IO, but you don't need to await as it's abortive and it's managed behind the scenes.

We just need to live with that extra I/O, I think :/.

So there could be another option which could make it so we don't synchronously block in the Dispose call. Let's say Dispose didn't actually complete the graceful shutdown and kept most of the state, and rather just started it. On the callback to the graceful shutdown completing, it would finish the callback and call stream close.

I really don't like this option because it could lose writes if the connection is closed immediately after calling Dispose. Closing the connection doesn't wait for all streams to gracefully close as well.

Spitballing... if we really want to keep QuicStream deriving from Stream, what we could do is enable an 'advanced' behavior like so:

My feeling is no matter what we'd have to expose another way to do async graceful shutdown on QuicStream. We can track the current state of QuicStream to check whether to noop in Dispose or not.

nibanks · 2019-12-11T02:23:20Z

IMO, the most important thing that needs to be ironed out is the dispose path here. Lots of focus seems to be on the write path, whether it should be graceful or abortive, but not much on the read path.

A very important point with QUIC is that all aborts (in the read or write direction) must have an application layer error code. Therefore they cannot happen in response to a simple Dispose call, since it takes no error code as argument. If this must is ever violated (say, by closing the stream without explicitly aborting first, or waiting on the graceful close to complete) then the entire connection must be immediately torn down.

Therefore the API must be designed so that it should be very hard to accidentally code up something that either doesn't abort both directions or doesn't gracefully complete them. I don't have a good solution to this that fits inside the Stream interface.

jkotalik · 2019-12-11T02:30:12Z

A very important point with QUIC is that all aborts (in the read or write direction) must have an application layer error code. Therefore they cannot happen in response to a simple Dispose call, since it takes no error code as argument. If this must is ever violated (say, by closing the stream without explicitly aborting first, or waiting on the graceful close to complete) then the entire connection must be immediately torn down.

Yeah, even in the case where Dispose is graceful by default, there are still conditions where Dispose may be in an abortive state (timeout, didn't read the entire read side, etc.). We'd probably need to supply a default error code that is application specific in that case: maybe as a options when creating a QuicStream on the client or creating the Listener on the Server.

geoffkizer · 2019-12-11T20:27:06Z

The first pivot would be whether we'd like QuicStream to derive from stream at all.

We want QuicStream to derive from Stream. There's a ton of code that works in terms of Stream, and user understand how Streams work.

But even if we decided that QuicStream didn't actually derive from Stream, we would just turn around and write a wrapper over QuicStream that implements Stream, which would induce all of these questions anyway. In other words, not deriving from Stream doesn't actually solve anything here.

geoffkizer · 2019-12-11T21:05:22Z

A very important point with QUIC is that all aborts (in the read or write direction) must have an application layer error code. Therefore they cannot happen in response to a simple Dispose call, since it takes no error code as argument.

I agree that this is an important point to consider.

In particular, I think we will need AbortRead(int errorCode) that causes the read side to abort and send a STOP_SENDING frame, and AbortWrite(int errorCode) that causes the write side to abort and send a RESET_STREAM frame.

That said, we're going to need to do something in Dispose if either the read side or write side hasn't completed yet. That means picking some error code and sending this with the STOP_SENDING frame. As @jkotalik suggested above, we can make this configurable if that makes sense, but even then there will need to be some default, likely 0. I think that's fine; if users need more than that, they should call AbortRead or AbortWrite themselves.

nibanks · 2019-12-11T21:31:18Z

but even then there will need to be some default, likely 0. I think that's fine

I cannot stress enough how absolutely not fine that is. The QUIC spec uses no uncertain terms that the application protocol must be the one to specify this value.

As a transport QUIC has no knowledge of this error code space. 0 might mean "everything was successful! Please continue purchasing that Ferrari." It might mean "the world has ended, please delete the user's account." Since this error code will be delivered to the peer, and the peer will see it explicitly as "your peer aborted the stream and chose to use error code X" the app layer must be the one to choose.

So I see only one of two options if Dispose is somehow called while the stream is still in use:

Just call StreamClose and MsQuic will terminate the entire connection.
Have a default value that the app must set before it got this far. Perhaps it's part of a constructor somewhere.

geoffkizer · 2019-12-11T21:55:03Z

@nibanks Is there some reason QUIC hasn't just defined 0 as a universal error code meaning "unknown" or something like that? This seems useful.

nibanks · 2019-12-11T21:57:59Z

The entire error space is explicitly application owned. QUIC has no insights to the values. There are no default values. Feel free to open an issue on the QUIC spec. I opened one related to this, and any changes were shot down: quicwg/base-drafts#3291

geoffkizer · 2019-12-11T22:10:17Z

Thanks @nibanks. I commented on the linked issue.

scalablecory · 2019-12-11T23:41:56Z

not deriving from Stream doesn't actually solve anything here.

I'm beginning to worry that, given QUIC's needs for special handling, exposing Stream as its first-class API might cause incorrect usage expectations.

MikeBishop · 2019-12-17T14:30:58Z

Thinking about this from an application's point of view and looking at your interface, this seems to be an artifact of duplicating stream state tracking between the app and the transport stack. The root problem is that they can get out of sync, and garbage collection / the Dispose() method is supposed to reconcile the differences. Maybe the right solution is to have a different reconciliation method.

Suppose that the QuicConnection object exposed a collection of all open streams. No open stream is ever inaccessible until the connection is gone, and the connection object / stack becomes the arbiter of truth for stream state. Then you never need to auto-close streams separate from the connection. If the connection gets disposed, it takes all the streams with it.

Not to mention that being able to access that collection, filtered by stream type, permits some useful interactions:

Find a stream by ID if it's open
Check whether a stream is open
Determine how many concurrent streams are open
Compare stream objects to collections held by the app to identify idle streams and take some action (like closing them gracefully or returning them to a pool of available streams)

jkotalik · 2019-12-19T18:02:01Z

No open stream is ever inaccessible until the connection is gone, and the connection object / stack becomes the arbiter of truth for stream state. Then you never need to auto-close streams separate from the connection. If the connection gets disposed, it takes all the streams with it.

This seems like it could be dangerous. What happens if a peer opens and closes a ton of streams (obeying the limits for the count of unidirectional and bidirectional streams)? If we don't remove state for an individual stream, we wouldn't clean it up until the connection is closed. So I think we need to logically know when a stream is "done".

I do think we will eventually have a lookup mechanism for streams on a given connection, however I do think we need to remove these entries.

jkotalik · 2019-12-19T18:11:27Z

Related to my opening issue, we have decided to do the following for QuicStream.

public ValueTask WriteAsync(ReadOnlyMemory<byte> data, bool endStream, System.Threading.CancellationToken cancellationToken = default) => throw null;

public ValueTask ShutdownWriteCompleted(System.Threading.CancellationToken cancellationToken = default) => throw null;

If endStream = true, the call to WriteAsync will start a graceful shutdown sequence. Awaiting this task doesn't await the final ACK though, instead you need to await ShutdownWriteCompleted.

MikeBishop · 2019-12-19T19:55:43Z

This seems like it could be dangerous. What happens if a peer opens and closes a ton of streams (obeying the limits for the count of unidirectional and bidirectional streams)? If we don't remove state for an individual stream, we wouldn't clean it up until the connection is closed. So I think we need to logically know when a stream is "done".

I agree, you absolutely need to know when a stream is done. The peer can only generate streams up to the limit provided by the local endpoint, so it's obviously bounded (depending on what logic you're using to issue more Stream ID credit). Thinking about the receive stream state machine, the collection should contain anything that has opened but hasn't reached one of the two terminal states (Data Read or Reset Read).

karelz · 2020-02-20T23:43:33Z

Triage: Looks like we have something in place already. Maybe not final, but likely good enough for 5.0.

scalablecory · 2020-09-23T21:22:15Z

I think this should be something like:

class QuicStream
{
   // graceful shutdown to mimic how a user would expect a Stream to function.
   public ValueTask DisposeAsync()
   {
       ShutdownWrites();
       await ShutdownWritesComplete;
   }

   public void Dispose()
   {
      DisposeAsync().AsTask().GetAwaiter().GetResult();
   }

    // equiv to Write()+ShutdownWrites() but will use a single frame/packet.
    public ValueTask WriteAsync(...., bool shutdownWrites);

    // equiv to shutdown(SD_SEND);
    // should cause any subsequent writes to throw... something.
    public void ShutdownWrites();

    // abortive -- causing any unsent data to be discarded.
    // one direction can be aborted while leaving the other direction open.
    // should cause any outstanding ops to throw StreamAbortedException.
    public void Abort(Direction{Read,Write,Both}, long errorCode);

    // find some better name here.
    // causes MsQuic to immediately, without blocking or waiting for an ack from peer, dispose of any resources.
    // StreamComplete will be set immediately.
    public void AbortAndDispose();

    // when peer has acknowledged we've shutdown or aborted our own writes.
    // Nick, for a graceful shutdown, this should mean all the written data has been ACKed as well right? the FIN packet being received out of order by peer won't cause this to trigger until all data has been received?
    // corresponds to QUIC_STREAM_EVENT_SEND_SHUTDOWN_COMPLETE.
    public Task ShutdownWritesComplete { get; }

    // peer has acknowledged we've shutdown, and we have acknowledged peer has shutdown. stream is ready for disposal.
    // corresponds to QUIC_STREAM_EVENT_SHUTDOWN_COMPLETE.
    public Task StreamComplete { get; }
}

@nibanks any thoughts here?

geoffkizer · 2021-04-08T20:19:22Z

IDisposable2

Possibly. That opens up a whole other can of worms, though.

scalablecory · 2021-04-20T16:53:34Z

class QuicStream
{
    // Note: removing shutdown feature as it was part of #43290.
    // public ValueTask CompleteWritesAync();

   public ValueTask CloseAsync(CancellationToken cancellationToken);
   public ValueTask DisposeAsync() => CloseAsync(default);

    // abortive -- causing any unsent data to be discarded.
    // one direction can be aborted while leaving the other direction open.
    // should cause any outstanding ops to throw StreamAbortedException.
    public void Abort(long errorCode, QuicAbortDirection direction = QuicAbortDirection.Both);
}

[Flags]
enum QuicAbortDirection
{
   Read = 1,
   Write = 2,
   Both = 3
}

Usage example:

using QuicStream stream = ...;

try
{
   // make reads and writes here, and finally:
   await stream.CloseAsync(cancellationToken);
}
catch(OperationCanceledException)
{
   stream.Abort(errorCode);
}

ManickaP · 2021-07-15T17:28:42Z

Triage: we have the functionality to achieve graceful and graceless closure/dispose of the stream. This is just changing/renaming the API. This is 7.0.

wegylexy · 2021-11-13T06:45:52Z

If not Stream, how about PipeReader/PipeWriter?

ManickaP · 2022-04-22T07:25:28Z

From https://github.com/dotnet/runtime/pull/68288/files/31623532e423818c011bf0ddc267216e7526f7a5#r855163759, we cannot ever assume any error code, they always have to come from an application layer. Either as a parameter or as some default settings. Problematic part:

runtime/src/libraries/System.Net.Quic/src/System/Net/Quic/Implementations/MsQuic/MsQuicStream.cs

Line 873 in a806647

StartShutdown(QUIC_STREAM_SHUTDOWN_FLAGS.ABORT_RECEIVE, 0xffffffff);

wegylexy · 2022-04-22T12:50:25Z

An application may want to specify a default error code for the default disposing behavior.

await using var myStream = myConnection.OpenBidirectionalStream(defaultError: 1);
try
{
    // do normal stuff
    // await shutdown normally here!
}
catch (MySpecialException ex)
{
    myStream.AbortRead(ex.Code);
}
// default to 1 if still open

Otherwise, the entire connection should be aborted as the application is leaking a stream out for possible finalizer thread blocking during garbage collection or other undefined consequences.

ManickaP · 2022-06-14T12:05:26Z

Closing, covered by #69675

[main] Update dependencies from dotnet/arcade - Bump to newer Arcade

Dotnet-GitSync-Bot added area-System.IO untriaged New issue has not been triaged by the area owner labels Dec 11, 2019

jkotalik added area-System.Net and removed area-System.IO labels Dec 11, 2019

jkotalik added this to the 5.0 milestone Dec 11, 2019

karelz modified the milestones: 5.0, Future Feb 20, 2020

karelz added enhancement Product code improvement that does NOT require public API changes/additions and removed untriaged New issue has not been triaged by the area owner labels Feb 21, 2020

karelz changed the title ~~Design of Shutdown for QuicStreams in System.Net.Quic~~ [QUIC] Design of Shutdown for QuicStreams in System.Net.Quic Mar 11, 2020

scalablecory modified the milestones: Future, 6.0.0 Sep 23, 2020

karelz assigned scalablecory Apr 27, 2021

This was referenced May 18, 2021

[QUIC] Optimize stream reads #42651

Closed

QuicStream shutdown #52929

Closed

karelz unassigned scalablecory Jun 24, 2021

wfurt mentioned this issue Jun 27, 2021

close quic streams on Dispose #54798

Merged

ManickaP modified the milestones: 6.0.0, 7.0.0 Jul 15, 2021

ManickaP mentioned this issue Jul 16, 2021

notes #55666

Closed

samsp-msft mentioned this issue Aug 17, 2021

[HTTP/3] Developers can build libraries for HTTP/3 to meet their client/server needs (.NET 7) #57600

Closed

28 tasks

ManickaP mentioned this issue Sep 20, 2021

[QUIC] Review comments #58608

Closed

karelz assigned ManickaP Nov 16, 2021

ManickaP mentioned this issue Apr 22, 2022

[QUIC] Adopted msquic generated interop #68288

Merged

ManickaP mentioned this issue May 23, 2022

[API Proposal]: [QUIC] QuicStream #69675

Closed

ManickaP closed this as completed Jun 14, 2022

bentoi mentioned this issue Jun 17, 2022

Make INetworkConnection Disposable and AsyncDisposable icerpc/icerpc-csharp#1400

Closed

radical pushed a commit to radical/runtime that referenced this issue Jul 7, 2022

[main] Update dependencies from dotnet/arcade (dotnet#756)

18ae4a1

[main] Update dependencies from dotnet/arcade - Bump to newer Arcade

ghost locked as resolved and limited conversation to collaborators Jul 14, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[QUIC] Design of Shutdown for QuicStreams in System.Net.Quic #756

[QUIC] Design of Shutdown for QuicStreams in System.Net.Quic #756

jkotalik commented Dec 11, 2019

scalablecory commented Dec 11, 2019 •

edited

Loading

jkotalik commented Dec 11, 2019

nibanks commented Dec 11, 2019

jkotalik commented Dec 11, 2019

geoffkizer commented Dec 11, 2019

geoffkizer commented Dec 11, 2019

nibanks commented Dec 11, 2019

geoffkizer commented Dec 11, 2019

nibanks commented Dec 11, 2019

geoffkizer commented Dec 11, 2019

scalablecory commented Dec 11, 2019

MikeBishop commented Dec 17, 2019 •

edited

Loading

jkotalik commented Dec 19, 2019

jkotalik commented Dec 19, 2019

MikeBishop commented Dec 19, 2019

karelz commented Feb 20, 2020

scalablecory commented Sep 23, 2020 •

edited

Loading

geoffkizer commented Apr 8, 2021

scalablecory commented Apr 20, 2021

ManickaP commented Jul 15, 2021

wegylexy commented Nov 13, 2021

ManickaP commented Apr 22, 2022

wegylexy commented Apr 22, 2022

ManickaP commented Jun 14, 2022

[QUIC] Design of Shutdown for QuicStreams in System.Net.Quic #756

[QUIC] Design of Shutdown for QuicStreams in System.Net.Quic #756

Comments

jkotalik commented Dec 11, 2019

Problem

Current thoughts

scalablecory commented Dec 11, 2019 • edited Loading

jkotalik commented Dec 11, 2019

nibanks commented Dec 11, 2019

jkotalik commented Dec 11, 2019

geoffkizer commented Dec 11, 2019

geoffkizer commented Dec 11, 2019

nibanks commented Dec 11, 2019

geoffkizer commented Dec 11, 2019

nibanks commented Dec 11, 2019

geoffkizer commented Dec 11, 2019

scalablecory commented Dec 11, 2019

MikeBishop commented Dec 17, 2019 • edited Loading

jkotalik commented Dec 19, 2019

jkotalik commented Dec 19, 2019

MikeBishop commented Dec 19, 2019

karelz commented Feb 20, 2020

scalablecory commented Sep 23, 2020 • edited Loading

geoffkizer commented Apr 8, 2021

scalablecory commented Apr 20, 2021

ManickaP commented Jul 15, 2021

wegylexy commented Nov 13, 2021

ManickaP commented Apr 22, 2022

wegylexy commented Apr 22, 2022

ManickaP commented Jun 14, 2022

scalablecory commented Dec 11, 2019 •

edited

Loading

MikeBishop commented Dec 17, 2019 •

edited

Loading

scalablecory commented Sep 23, 2020 •

edited

Loading