-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
improve connection scavenge logic by doing zero-byte read #50545
Conversation
Tagging subscribers to this area: @dotnet/ncl Issue DetailsCurrently, during our periodic scavenge of idle connections, we check whether the connection is still valid (i.e. has not received EOF or invalid data) by either (a) doing a socket poll, if we have access to the underlying socket; or (b) issuing a read-ahead, when we don't have access to the underlying socket. Approach (b) is problematic because it pins our read buffer for potentially a long period of time (on Windows), and prevents us from freeing this buffer back to the pool in the future. The negative impact of this approach can be seen here: #49431 Unfortunately, there are common cases where we don't have access to the underlying socket, and thus today have to fall back to approach (b). Most importantly, if a user supplies a ConnectCallback, then we never have access to the underlying socket, even if the ConnectCallback does trivial things like logging. This PR addresses the problem as follows: Instead of doing a poll or a read-ahead in these cases, just do a zero-byte read. For Streams that implement proper zero-byte read support (that is, the read does not complete until data is available), this will avoid unnecessary buffer pinning. It's also probably cheaper than polling, since we only need to pend one zero-byte read in this case vs potentially multiple calls to poll. Since we cannot depend on the Stream properly supporting zero-byte read semantics, we must issue an actual read-ahead when the zero-byte read completes. For non-compliant Streams, this results in the same behavior as previously: we will immediately pin the read buffer. For compliant Streams, the completion will only happen when either the connection is invalid (as occurs today) or when there is actual valid response data on the connection, in which case we would have issued a read-ahead when the connection is about to be reused anyway. Note that we also issue a read-ahead when we retrieve an idle connection from the pool for reuse. This logic is essentially unchanged, but refactored a bit, and no longer shared with the scavenge logic since we want different behavior in these two cases. Also, this PR changes the behavior in ReturnConnection (called when a connection is about to be returned to the pool) to not check for EOF or invalid data in the case where we are about to transfer the connection to a waiting request. Any invalid data would likely have been detected as part of response processing. And since we are about to reuse the connection anyway, the likelihood of detecting an invalid connection during the transfer is very small and highly timing-dependent. Note this only affects cases where a connection limit is set, since this is the only case in which we have waiting requests currently. Related to #49431 @scalablecory @stephentoub @wfurt @dotnet/ncl
|
/azp run runtime-libraries-coreclr outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some nits, but otherwise LGTM.
src/libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/Http2Connection.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/Http2Connection.cs
Outdated
Show resolved
Hide resolved
#pragma warning restore CA2012 | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit:
Could be:
// We may already have a read-ahead task if we did a previous scavenge and haven't used the connection since.
#pragma warning disable CA2012 // we're very careful to ensure the ValueTask is only consumed once, even though it's stored into a field
_readAheadTask ??= ReadAheadWithZeroByteReadAsync();
#pragma warning restore CA2012
if (connection.LifetimeExpired(nowTicks, _poolManager.Settings._pooledConnectionLifetime)) | ||
{ | ||
if (NetEventSource.Log.IsEnabled()) connection.Trace("Found usable connection in pool."); | ||
return new ValueTask<HttpConnection?>(connection); | ||
if (NetEventSource.Log.IsEnabled()) connection.Trace("Found expired connection in pool."); | ||
connection.Dispose(); | ||
continue; | ||
} | ||
else | ||
|
||
if (!connection.PrepareForReuse(async)) | ||
{ | ||
if (NetEventSource.Log.IsEnabled()) connection.Trace("Found invalid connection in pool."); | ||
connection.Dispose(); | ||
continue; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bodies of these if blocks are identical except for one word in the traced message. If that distinction isn't important, we could combine these:
if (connection.LifetimeExpired(nowTicks, _poolManager.Settings._pooledConnectionLifetime) ||
!connection.PrepareForReuse(async))
{
if (NetEventSource.Log.IsEnabled()) connection.Trace("Found expired or invalid connection in pool.");
connection.Dispose();
continue;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The distinction seems important to me.
src/libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/HttpConnectionPool.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/HttpConnection.cs
Outdated
Show resolved
Hide resolved
…andler/Http2Connection.cs Co-authored-by: Stephen Toub <[email protected]>
…andler/Http2Connection.cs Co-authored-by: Stephen Toub <[email protected]>
/azp run runtime-libraries-coreclr outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
…shim_mono # By Aaron Robinson (10) and others # Via GitHub * upstream/main: (108 commits) [mbr] Add Apple sample (dotnet#50740) make EstablishProxyTunnelAsync throw on failure status code from proxy (dotnet#50763) Improve RGB Min Max evaluation performance by using 2 or 3 comparison… (dotnet#50622) [mono] More domain cleanups (dotnet#50479) Fix Crossgen2 of PlatformDefaultMemberFunction methods and calls. (dotnet#50754) Disable EventSource generator in design-time builds (dotnet#50741) Fix X509 test failures on Android (dotnet#50301) Do not confuse fgDispBasicBlocks in fgMorphBlocks (dotnet#50703) Enforce 64KB event payload size limit on EventPipe (dotnet#50600) Reorganize CoreCLR native build to reduce CMake reconfigures when the build system is untouched (dotnet#49906) [mbr] Turn on hot reload for iOS, tvOS and MacCatalyst (dotnet#50458) improve connection scavenge logic by doing zero-byte read (dotnet#50545) Resolve call mdtokens when making tier 1 inline observations (dotnet#50675) Annotate APIs in System.Private.Xml (dotnet#49682) Support compiling against OpenSSL 3 headers Change Configuration.Json to use a regular Dictionary. (dotnet#50611) Remove unused BigNumFromBinary P/Invoke (dotnet#50670) Make Ninja the default CMake generator on Windows for the repo (dotnet#49715) [AppleAppBuilder] Entitlements to run tests on catalyst using the JIT (dotnet#50637) [mono] Fix delegate invokes to dynamic methods in mixed mode. (dotnet#50547) ... # Conflicts: # src/mono/dlls/mscordbi/CMakeLists.txt
Currently, during our periodic scavenge of idle connections, we check whether the connection is still valid (i.e. has not received EOF or invalid data) by either (a) doing a socket poll, if we have access to the underlying socket; or (b) issuing a read-ahead, when we don't have access to the underlying socket.
Approach (b) is problematic because it pins our read buffer for potentially a long period of time (on Windows), and prevents us from freeing this buffer back to the pool in the future. The negative impact of this approach can be seen here: #49431
Unfortunately, there are common cases where we don't have access to the underlying socket, and thus today have to fall back to approach (b). Most importantly, if a user supplies a ConnectCallback, then we never have access to the underlying socket, even if the ConnectCallback does trivial things like logging.
This PR addresses the problem as follows: Instead of doing a poll or a read-ahead in these cases, just do a zero-byte read. For Streams that implement proper zero-byte read support (that is, the read does not complete until data is available), this will avoid unnecessary buffer pinning. It's also probably cheaper than polling, since we only need to pend one zero-byte read in this case vs potentially multiple calls to poll.
Since we cannot depend on the Stream properly supporting zero-byte read semantics, we must issue an actual read-ahead when the zero-byte read completes. For non-compliant Streams, this results in the same behavior as previously: we will immediately pin the read buffer. For compliant Streams, the completion will only happen when either the connection is invalid (as occurs today) or when there is actual valid response data on the connection, in which case we would have issued a read-ahead when the connection is about to be reused anyway.
Note that we also issue a read-ahead when we retrieve an idle connection from the pool for reuse. This logic is essentially unchanged, but refactored a bit, and no longer shared with the scavenge logic since we want different behavior in these two cases.
Also, this PR changes the behavior in ReturnConnection (called when a connection is about to be returned to the pool) to not check for EOF or invalid data in the case where we are about to transfer the connection to a waiting request. Any invalid data would likely have been detected as part of response processing. And since we are about to reuse the connection anyway, the likelihood of detecting an invalid connection during the transfer is very small and highly timing-dependent. Note this only affects cases where a connection limit is set, since this is the only case in which we have waiting requests currently.
Related to #49431
@scalablecory @stephentoub @wfurt @dotnet/ncl