Remove some allocation at "hello world" startup #44469

Merged: 11 commits merged on Nov 11, 2020

@stephentoub (Member) commented Nov 10, 2020

Measuring dotnet helloworld.dll, which just does Console.WriteLine("hello world");, on my machine this reduces startup allocation by ~200 objects / ~12 KB and reduces wall-clock time by ~8% (from ~80ms to ~73ms). Wall-clock time was measured with Measure-Command { dotnet helloworld.dll }. Allocations were measured with the VS .NET allocation profiler.

@jkotas, please let me know if you think any of these aren't worthwhile or are problematic in some way and I can roll them back.

Contributes to #44598

cc: @brianrob, @DamianEdwards


For future reference, after this there's still quite a bit of allocation, but it's hard to get rid of a lot more without disabling whole swaths of functionality (e.g. disabling EventSource). Here's what remains:

| Type | Allocations | Bytes |
| --- | ---: | ---: |
| System.SByte[] | 216 | 9,456 |
| System.String | 43 | 34,040 |
| System.Object[] | 14 | 24,040 |
| System.Byte[] | 4 | 3,731 |
| System.Int32[] | 4 | 3,580 |
| System.Diagnostics.Tracing.EventSource.OverideEventProvider | 4 | 448 |
| Interop.Advapi32.EtwEnableCallback | 4 | 256 |
| System.Runtime.CompilerServices.GCHeapHash | 4 | 128 |
| System.Object | 3 | 72 |
| System.Text.EncoderReplacementFallback | 3 | 72 |
| System.Text.DecoderReplacementFallback | 3 | 72 |
| System.Char[] | 2 | 564 |
| System.Threading.ThreadAbortException | 2 | 256 |
| System.IntPtr[] | 2 | 208 |
| System.Text.UTF8Encoding.UTF8EncodingSealed | 2 | 96 |
| System.RuntimeType | 2 | 80 |
| System.String[] | 2 | 56 |
| System.Collections.Generic.NonRandomizedStringEqualityComparer.OrdinalComparer | 2 | 48 |
| System.Diagnostics.Tracing.TraceLoggingEventHandleTable | 2 | 48 |
| System.Diagnostics.Tracing.EtwEventProvider | 2 | 48 |
| System.Diagnostics.Tracing.EventPipeEventProvider | 2 | 48 |
| System.WeakReference<System.Diagnostics.Tracing.EventSource> | 2 | 48 |
| System.Diagnostics.Tracing.RuntimeEventSource | 1 | 368 |
| System.Collections.Generic.Dictionary<System.String, System.Object>.Entry[] | 1 | 192 |
| System.Diagnostics.Tracing.NativeRuntimeEventSource | 1 | 184 |
| System.Exception | 1 | 128 |
| System.OutOfMemoryException | 1 | 128 |
| System.StackOverflowException | 1 | 128 |
| System.ExecutionEngineException | 1 | 128 |
| System.IO.StreamWriter | 1 | 104 |
| System.Collections.Generic.Dictionary<System.String, System.Object> | 1 | 80 |
| System.Threading.Tasks.Task<System.Threading.Tasks.VoidTaskResult> | 1 | 72 |
| System.RuntimeFieldInfoStub | 1 | 64 |
| System.EventHandler | 1 | 64 |
| System.Text.OSEncoding | 1 | 64 |
| System.Threading.ContextCallback | 1 | 64 |
| System.AppDomain | 1 | 64 |
| System.Reflection.RuntimeAssembly | 1 | 48 |
| System.Text.OSEncoder | 1 | 48 |
| System.IO.TextWriter.SyncTextWriter | 1 | 48 |
| System.WeakReference<System.Diagnostics.Tracing.EventSource>[] | 1 | 40 |
| System.ConsolePal.WindowsConsoleStream | 1 | 40 |
| System.Threading.Tasks.TaskFactory | 1 | 40 |
| System.IO.TextWriter.NullTextWriter | 1 | 40 |
| System.Guid | 1 | 32 |
| System.Diagnostics.Tracing.ActivityTracker | 1 | 32 |
| System.Collections.Generic.List<System.WeakReference<System.Diagnostics.Tracing.EventSource>> | 1 | 32 |
| System.Collections.Generic.GenericEqualityComparer<System.String> | 1 | 24 |
| System.OrdinalCaseSensitiveComparer | 1 | 24 |
| System.Collections.Generic.NonRandomizedStringEqualityComparer.OrdinalIgnoreCaseComparer | 1 | 24 |
| System.OrdinalIgnoreCaseComparer | 1 | 24 |
| System.IO.Stream.NullStream | 1 | 24 |
| System.Threading.Tasks.Task.<>c | 1 | 24 |
| System.EventArgs | 1 | 24 |

A few notes on the remaining allocations:

  • The sbyte[]s are mainly from the JIT reporting inlining decisions; these have lived on the managed heap since coreclr#22285 ("Replace multi-loaderallocator hash implementation in MethodDescBackpatchInfo"). Most of the object[]s appear to come from the same place: reportInliningDecision uses GCHeapHash, which grows an array.
  • The strings are mainly consts as well as initialization key/value pairs passed into AppContext.Setup.
  • The byte[]s are from StreamWriter's byte[] buffer, Array.Empty, and EventSource "MetadataForString".
  • The char[]s are from StreamWriter's char[] buffer and TextWriter's newline characters array.
  • The int[]s are from Dictionary / HashHelpers as well as invariant culture data.
  • There is one IntPtr[] per EventSource instance, used for its TraceLoggingEventHandleTable.
  • The System.Object instances are the remaining lock objects.
  • The EtwEnableCallback / OverideEvent (sic) / EtwEventProvider / EventPipeEventProvider are from EventSource, incurring one for ETW (on Windows) and one for EventPipe multiplied by RuntimeEventSource and NativeRuntimeEventSource.
  • Encoding-related objects mostly come from Console's output encoding.
  • Exception objects are all singletons pre-allocated by the runtime.
  • There's a boxed Guid coming from native runtime code invoked by EventSource.Initialize.
  • API design of yore led to public readonly static fields for Stream.Null, StreamWriter.Null, and TextWriter.Null, causing Stream/StreamWriter/TextWriter instances to be created when these types are first used.
  • Creating a Dictionary causes it to access statics on NonRandomizedStringEqualityComparer, which forces into existence the various singletons on both that class and on StringComparer.
  • Task.CompletedTask forces Task's cctor to run, which also allocates TaskFactory, a ContextCallback for running Tasks with ExecutionContext, and a closure-related <>c object.
  • There are a couple more we could easily remove (AppDomain, EventArgs, EventHandler) if we were willing to couple AppContext.OnProcessExit with EventListener.DisposeOnShutdown: right now those are connected by EventSource registering an event handler with AppContext.ProcessExit.
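
To illustrate the Stream.Null / TextWriter.Null note above, here is a minimal sketch of why a public static readonly field forces an allocation as soon as the type is initialized, while a lazily initialized property does not. EagerWriter and LazyWriter are hypothetical stand-ins invented for this example; they are not the real BCL types, and the counters exist only to make the behavior observable.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for a type that exposes a "Null" singleton as a field.
public class EagerWriter
{
    // Counter used only to observe allocations in this sketch.
    public static int InstancesCreated;

    // The field initializer runs as part of type initialization, so the
    // instance exists before any static field of the type can be read,
    // whether or not anyone ever asks for Null.
    public static readonly EagerWriter Null = new EagerWriter();

    public EagerWriter() => Interlocked.Increment(ref InstancesCreated);
}

// Hypothetical stand-in that exposes the same singleton through a property.
public class LazyWriter
{
    public static int InstancesCreated;

    private static LazyWriter? s_null;

    // Nothing is allocated until the getter is first called. Under a race,
    // more than one instance may be constructed, but only one is published.
    public static LazyWriter Null
    {
        get
        {
            if (s_null is null)
            {
                Interlocked.CompareExchange(ref s_null, new LazyWriter(), null);
            }
            return s_null!;
        }
    }

    public LazyWriter() => Interlocked.Increment(ref InstancesCreated);
}
```

Reading EagerWriter.InstancesCreated already reports one instance, because reading any static field requires the type to be initialized first. LazyWriter.InstancesCreated stays at zero until LazyWriter.Null is read.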

@jkotas (Member) left a comment

Nice!

@danmoseley (Member) commented

> API design of yore led to public readonly static fields for Stream.Null, StreamWriter.Null, and TextWriter.Null, causing Stream/StreamWriter/TextWriter instances to be created when these types are first used.

If these aren't commonly needed, could they be lazily created instead?

@stephentoub (Member, Author) commented Nov 10, 2020

> If these aren't commonly needed, could they be lazily created instead?

There's the rub: they're public fields. 😦

@danmoseley (Member) commented

Oh, right. Ugh, public fields.

@stephentoub stephentoub added this to the 6.0.0 milestone Nov 10, 2020
@stephentoub stephentoub added the tenet-performance Performance related issue label Nov 10, 2020
@jkotas (Member) commented Nov 10, 2020

> if we were willing to couple AppContext.OnProcessExit with EventListener.DisposeOnShutdown

I would be ok with that. EventSource is tightly coupled with the rest of the runtime in multiple ways, so one more does not make a huge difference. Make sure to sprinkle it with !IsSupported to make it friendly to trimming.

@brianrob (Member) left a comment

LGTM! I especially like the EventSource.WriteEventString change, since it's only used for error conditions, but we pay for it all the time.

@stephentoub (Member, Author) commented

> I would be ok with that

Ok, I'll add that.

Removes ~80 allocations at startup.
It's only used on an error path.  We don't need to allocate it for each EventSource that's created.
SyncTextWriter already overrides FormatProvider, in which case the t.FormatProvider passed to the base will never be used, so this call is incurring a virtual dispatch for no benefit.  And NullTextWriter needn't access InvariantCulture and force it into existence if it isn't yet, as the formatting should never actually be used, and if it is, its FormatProvider override can supply the culture.
…ction with ALC

AssemblyLoadContext.OnProcessExit gets called by EventSource, which in turn forces s_allContexts into existence just so it can be locked while enumerating all active contexts. If there's been no interaction with AssemblyLoadContext, there won't be any contexts to enumerate, so delay-allocate the object.
Avoids the need to register with AppContext.ProcessExit, avoiding an EventHandler allocation, and avoids the need in the common case to fire AppContext.ProcessExit, which in turn avoids allocating an AppDomain and EventArgs if they weren't otherwise created, plus it avoids the delegate invocation.
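
The delay-allocation idea in the AssemblyLoadContext commit message above can be sketched roughly as follows. This is not the runtime's code: ContextRegistry, Register, and CountOnProcessExit are hypothetical names, and the Dictionary<long, string> payload is an assumption made only to keep the example small. The point is that the exit path checks for null instead of forcing the collection (which doubles as the lock object) into existence.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical registry; a sketch of delay-allocating a collection that is
// also used as a lock object.
public static class ContextRegistry
{
    // Stays null until something is actually registered.
    private static Dictionary<long, string>? s_allContexts;

    // True once the collection has been allocated; exposed for observation.
    public static bool IsAllocated => Volatile.Read(ref s_allContexts) is not null;

    // Allocates the collection on first use. If two threads race, only one
    // instance is published and both threads end up using that one.
    private static Dictionary<long, string> AllContexts
    {
        get
        {
            Dictionary<long, string>? all = Volatile.Read(ref s_allContexts);
            if (all is null)
            {
                Interlocked.CompareExchange(ref s_allContexts, new Dictionary<long, string>(), null);
                all = Volatile.Read(ref s_allContexts)!;
            }
            return all;
        }
    }

    public static void Register(long id, string name)
    {
        Dictionary<long, string> all = AllContexts;
        lock (all)
        {
            all[id] = name;
        }
    }

    // Exit path: if nothing was ever registered there is nothing to
    // enumerate, so return without allocating or locking anything.
    public static int CountOnProcessExit()
    {
        Dictionary<long, string>? all = Volatile.Read(ref s_allContexts);
        if (all is null)
        {
            return 0;
        }
        lock (all)
        {
            return all.Count;
        }
    }
}
```

In a process that never calls Register, CountOnProcessExit returns 0 and IsAllocated remains false, which is the common "hello world" case this change targets.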
@GSPP commented Nov 11, 2020

> API design of yore led to public readonly static fields for Stream.Null, StreamWriter.Null, and TextWriter.Null, causing Stream/StreamWriter/TextWriter instances to be created when these types are first used.

Is it feasible to turn them into properties in a major version of .NET? It seems that callers that recompile should keep working. I don't really see why anyone would access these using reflection.

@marek-safar (Contributor) commented

> System.String 43 34,040

This looks odd; that would make every string used during init almost 1 KB.

@stephentoub (Member, Author) commented Nov 11, 2020

> It seems that callers which recompile should keep working

If they recompile against the new surface area, yes. But it's a binary breaking change. So, for example, any netstandard2.0 library that was compiled against the field will break with a MissingFieldException if it runs against a binary that has it instead as a property.
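
As a small illustration of why this is binary breaking: a caller compiled against a static field refers to it in IL as a field (an ldsfld instruction), so the member has to remain a field for already-compiled callers to resolve it. The sketch below uses reflection to check the shape of the three Null members. NullMemberShape is a hypothetical helper written for this example, and the check assumes it runs on a runtime where these members are still fields, as described in this thread.

```csharp
using System;
using System.IO;
using System.Reflection;

// Hypothetical helper for inspecting how a member is exposed.
public static class NullMemberShape
{
    // Returns true if the type itself declares a public static readonly
    // field with the given name.
    public static bool IsPublicStaticReadonlyField(Type type, string name)
    {
        FieldInfo? field = type.GetField(
            name,
            BindingFlags.Public | BindingFlags.Static | BindingFlags.DeclaredOnly);
        return field is not null && field.IsInitOnly;
    }

    // Returns true if the type itself declares a public static property
    // with the given name.
    public static bool IsPublicStaticProperty(Type type, string name)
    {
        PropertyInfo? property = type.GetProperty(
            name,
            BindingFlags.Public | BindingFlags.Static | BindingFlags.DeclaredOnly);
        return property is not null;
    }
}
```

For example, NullMemberShape.IsPublicStaticReadonlyField(typeof(Stream), "Null") is expected to be true and IsPublicStaticProperty(typeof(Stream), "Null") false, and likewise for StreamWriter and TextWriter. Swapping the field for a property would flip both answers, which is the change an old binary cannot follow without recompiling.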

@stephentoub (Member, Author) commented

> This looks odd, that would make every string used during init almost 1kb

It's not that every string is 1K. It's that most of the strings are small, but then there's one 30K-ish string (created by the char* passed to AppContext.Setup) for the TPA list.
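
The arithmetic behind this exchange can be made explicit. The counts come from the table above (43 strings, 34,040 bytes); the 30,000-byte size for the TPA list string is an assumption standing in for "30K-ish", since the exact size is not given. StringAllocationMath is a hypothetical helper for this sketch.

```csharp
using System;

// Hypothetical helper showing how one large string skews the average.
public static class StringAllocationMath
{
    public const int StringCount = 43;                 // from the table
    public const int TotalBytes = 34_040;              // from the table
    public const int AssumedLargeStringBytes = 30_000; // assumption for "30K-ish"

    public static double AverageBytes(int totalBytes, int count) =>
        (double)totalBytes / count;

    // Average over all 43 strings: roughly 792 bytes each.
    public static double AverageOverall =>
        AverageBytes(TotalBytes, StringCount);

    // Average over the other 42 strings once the large one is set aside:
    // roughly 96 bytes each under the 30,000-byte assumption.
    public static double AverageWithoutLargest =>
        AverageBytes(TotalBytes - AssumedLargeStringBytes, StringCount - 1);
}
```

So the "almost 1 KB per string" reading comes from the overall average, while the typical string is closer to a hundred bytes once the single large string is excluded.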

@stephentoub stephentoub merged commit aa04371 into dotnet:master Nov 11, 2020
@stephentoub stephentoub deleted the startupalloc branch November 11, 2020 14:15
@ghost ghost locked as resolved and limited conversation to collaborators Dec 12, 2020
Labels: area-Meta, tenet-performance

6 participants