[NativeAOT] Best way to track down app size bloat? #1269

Open
svengeance opened this issue Jun 23, 2021 · 15 comments
Labels
area-NativeAOT-coreclr .NET runtime optimized for ahead of time compilation

Comments

@svengeance

svengeance commented Jun 23, 2021

Hey all,

I'm looking for the recommended approach to track down exactly what is contributing to the large size of my app. I'm sitting at roughly 12MB right now with all optimizations enabled. After getting rid of everything that makes C# feel like home (MS.Ext.Logging, MS.Ext.DI, etc.), I've got two dependencies at this time: FluentValidation and Serilog. I am likely going to axe FV, but I'd prefer not to write my own logger.

(It may sound far-reaching but size is of utmost importance here, and I'm trying to stay close to that 1.4MB hello world where I can)

<DebuggerSupport>false</DebuggerSupport>
<EnableUnsafeBinaryFormatterSerialization>false</EnableUnsafeBinaryFormatterSerialization>
<EnableUnsafeUTF7Encoding>false</EnableUnsafeUTF7Encoding>
<EventSourceSupport>false</EventSourceSupport>
<HttpActivityPropagationSupport>false</HttpActivityPropagationSupport>
<UseSystemResourceKeys>true</UseSystemResourceKeys>

<IlcDisableUnhandledExceptionExperience>true</IlcDisableUnhandledExceptionExperience>
<IlcGenerateStackTraceData>false</IlcGenerateStackTraceData>
<IlcInvariantGlobalization>true</IlcInvariantGlobalization>
<IlcOptimizationPreference>Size</IlcOptimizationPreference>

<IlcFoldIdenticalMethodBodies>true</IlcFoldIdenticalMethodBodies>

<IlcDisableReflection>true</IlcDisableReflection>

<IlcGenerateMapFile>true</IlcGenerateMapFile>
<IlcGenerateMetadataLog>true</IlcGenerateMetadataLog>
<IlcGenerateDgmlFile>true</IlcGenerateDgmlFile>

As you can see at the bottom, I've got the debug files enabled. However, I am unable to reasonably read through them: the dgml file never loads in any viewer I've tried (> 100MB/100k nodes), and it's hard to track the causal relationships in the map file.

To be honest, the only real direction I've been able to get from these files so far, in a small amount of time, is searching for a dependency package and seeing how many lines are related.
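The "search for a package and count the lines" approach can be scripted. This is only a minimal sketch: it treats the map file as plain text and makes no assumption about its actual schema, and the sample lines and the `count_mentions` helper are hypothetical, not output or tooling from ilc.

```python
from collections import Counter


def count_mentions(map_text, terms):
    """Count how many lines of the text mention each term (case-insensitive)."""
    counts = Counter({term: 0 for term in terms})
    for line in map_text.splitlines():
        lowered = line.lower()
        for term in terms:
            if term.lower() in lowered:
                counts[term] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines; a real run would read the generated map file.
    sample = "\n".join([
        "Serilog.Core.Logger.Write",
        "Serilog.Events.LogEvent..ctor",
        "FluentValidation.AbstractValidator.Validate",
        "System.String.Concat",
    ])
    counts = count_mentions(sample, ["Serilog", "FluentValidation"])
    for term, n in counts.most_common():
        print(f"{term}: {n} lines")
```

Line counts are a rough proxy at best, since one line may describe a tiny stub or a very large method body.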

If I must, I will find a way to parse these files and even produce some tooling around it. However, if any exists, I would appreciate some guidance.

Thanks all.

@jkotas
Member

jkotas commented Jun 23, 2021

https://github.com/dotnet/corert/tree/master/src/ILCompiler.DependencyAnalysisFramework/WhyDgml is a tool that allows you to find out why a given item is included in the image. We do not have packages for this tool, so you have to build it yourself. If you run into something that you would not expect to be in the image, you can use this tool to find out how it got there.

@jkotas jkotas added the area-NativeAOT-coreclr .NET runtime optimized for ahead of time compilation label Jun 23, 2021
@svengeance
Author

svengeance commented Jun 23, 2021

Hey Jan, thanks for the direction. I'm surprised at how much output it has (well overflowing windows terminal's default # of lines); I'm wondering if that's a misstep on my part somewhere, or if it's expected.

I will continue fiddling with Why to see if I can make more sense out of it. Thanks for the initial lead.

@svengeance
Author

svengeance commented Jun 23, 2021

Actually reading slightly further I might be running into -

// It works best if FirstMarkLogStrategy is used. It might hit cycles if full
// marking strategy is used, so better not try that. That's untested.

Is this avoidable?

@MichalStrehovsky
Member

I often open the map file in Notepad and scroll through it to see if there's any common themes (a generic method instantiated over dozens of types, etc.).

> Is this avoidable?

Double-check the *.ilc.rsp file in your obj directory, but unless you see the --fulllog or --scanfulllog switch, it's using FirstMarkLog.
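That check is easy to automate. The sketch below assumes the response file is plain text with arguments separated by whitespace or newlines; the two switch names come from the comment above, while the `uses_full_log` helper and the sample contents are hypothetical.

```python
# Switch names as mentioned in the thread; either one means full logging is on.
FULL_LOG_SWITCHES = ("--fulllog", "--scanfulllog")


def uses_full_log(rsp_text):
    """Return True if any whitespace-separated argument is a full-log switch."""
    return any(token in FULL_LOG_SWITCHES for token in rsp_text.split())


if __name__ == "__main__":
    # Hypothetical response-file contents, for illustration only.
    sample = "MyApp.dll\n--some-other-switch\n"
    print("full log strategy" if uses_full_log(sample) else "first-mark log strategy")
```

In practice you would pass in the text of the *.ilc.rsp file from your obj directory.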

Make sure to also include <TrimMode>link</TrimMode> and <TrimmerDefaultAction>link</TrimmerDefaultAction> in your csproj.

@svengeance
Author

svengeance commented Jun 23, 2021

> I often open the map file in Notepad and scroll through it to see if there's any common themes (a generic method instantiated over dozens of types, etc.).

Good to know my scientific method of opening the map in vscode, hitting ctrl-f, searching for serilog, and commenting "wow, that's a lot of highlighted text" wasn't off-base 😄

After removing Serilog my application dropped from 11MB to 3.2MB. Similarly, my project.assets.json file dropped from 240kb to 80kb. I haven't done any more analysis as to why in particular, since I'm really only using Console and File logging in Serilog...I'll be experimenting with other loggers to see what I see.

I'm trying to build an installer framework. As far as I know, no one's done this in any modern version of C#; people instead fall back to .NET Framework 4.6.1 for a framework-dependent installation. Hopefully this explains a bit why I'm interested in the size.

As an outgoing question: if any adventurous folk were interested in building some better tooling in this area, what are some good patterns to observe when analyzing the map/dgml/metadata? At the least, there should be a tool that can visualize the dgml outside of Visual Studio.

@MichalStrehovsky
Member

> As an outgoing question: if any adventurous folk were interested in building some better tooling in this area, what are some good patterns to observe when analyzing the map/dgml/metadata? At the least, there should be a tool that can visualize the dgml outside of Visual Studio.

The hard part of visualizing the data is that things are often referenced from multiple places (e.g. a method that is called from two places). I haven't yet figured out how to visualize that (do we count the size of the common method as part of the "weight" of both calling methods?). Then there are things like this: someone calls an interface method. Someone news up a class (one of many classes) that implements the interface. To whom do we attribute the "weight" of the interface implementation method? What if the interface method is also called not through the interface?
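To make the attribution question concrete, here is a small sketch with made-up sizes comparing two possible policies for a method shared by two callers: charging it in full to every caller, or splitting it equally among them. Neither is claimed to be what any existing tool does; `attribute_direct_callees` and the node names are hypothetical.

```python
def attribute_direct_callees(sizes, calls):
    """Return (full, split) weights for each caller.

    full:  every direct callee is charged entirely to each of its callers.
    split: every direct callee is divided equally among its callers.
    """
    # Count how many distinct call sites reference each callee.
    caller_count = {}
    for callees in calls.values():
        for callee in callees:
            caller_count[callee] = caller_count.get(callee, 0) + 1

    full, split = {}, {}
    for caller, callees in calls.items():
        full[caller] = sizes[caller] + sum(sizes[c] for c in callees)
        split[caller] = sizes[caller] + sum(sizes[c] / caller_count[c] for c in callees)
    return full, split


if __name__ == "__main__":
    # Made-up byte sizes: two small callers sharing one large helper.
    sizes = {"A": 100, "B": 50, "Shared": 400}
    calls = {"A": ["Shared"], "B": ["Shared"]}
    full, split = attribute_direct_callees(sizes, calls)
    print("full :", full, "sum =", sum(full.values()))
    print("split:", split, "sum =", sum(split.values()))
    print("actual total:", sum(sizes.values()))
```

With full attribution the per-caller weights add up to more than the real image size; with splitting they add up correctly, but removing one caller would not actually save its "share" of the common method.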

@davidfowl
Member

@jkotas had a good idea to treat it like a memory profile, with inclusive and exclusive sizes.
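One possible reading of that idea, sketched under assumptions: the exclusive size of a node is its own bytes, and its inclusive size is the total of everything reachable from it, with each node counted once. The graph, the sizes, and the `inclusive_size` helper below are hypothetical.

```python
def inclusive_size(node, sizes, deps):
    """Sum the sizes of `node` and everything reachable from it, counting each node once."""
    seen = set()
    stack = [node]
    total = 0
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already counted; also guards against cycles
        seen.add(current)
        total += sizes[current]
        stack.extend(deps.get(current, ()))
    return total


if __name__ == "__main__":
    # Made-up dependency graph: both Logger and Validator pull in Formatting.
    sizes = {"Main": 10, "Logger": 200, "Validator": 150, "Formatting": 300}
    deps = {
        "Main": ["Logger", "Validator"],
        "Logger": ["Formatting"],
        "Validator": ["Formatting"],
    }
    for name in sizes:
        print(f"{name}: exclusive={sizes[name]} inclusive={inclusive_size(name, sizes, deps)}")
```

Inclusive sizes of siblings overlap whenever they share dependencies, so they are useful for ranking suspects rather than for adding up to the image size.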

@jkotas
Member

jkotas commented Jun 23, 2021

If you add a way to import the data to https://github.com/microsoft/perfview, you will automatically get all types of analysis and aggregation options that perfview can do on time series. For example: ignoring a specific module in the analysis, the total cost of all items from a given namespace, the inclusive cost of a module and all its dependencies, and so on. perfview should be able to do all that and more out of the box.
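For a feel of what two of those aggregations mean, here is a standalone sketch (not PerfView code) that totals sizes by namespace prefix and optionally ignores some prefixes. The item names, sizes, and the `total_by_namespace` helper are hypothetical.

```python
from collections import defaultdict


def total_by_namespace(items, depth=1, ignore=()):
    """Sum sizes per namespace prefix of `depth` dot-separated segments.

    Items whose name starts with any prefix in `ignore` are skipped.
    """
    totals = defaultdict(int)
    for name, size in items:
        if any(name.startswith(prefix) for prefix in ignore):
            continue
        key = ".".join(name.split(".")[:depth])
        totals[key] += size
    return dict(totals)


if __name__ == "__main__":
    # Made-up (name, size in bytes) pairs.
    items = [
        ("Serilog.Core.Logger.Write", 120),
        ("Serilog.Events.LogEvent", 80),
        ("System.String.Concat", 40),
    ]
    print(total_by_namespace(items))                      # grouped by top-level namespace
    print(total_by_namespace(items, ignore=("System",)))  # same, ignoring System.*
```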

@kant2002
Contributor

@jkotas
Member

jkotas commented Jul 20, 2021

Yes. We can make ilc generate this format to see how well it works.

@davidfowl
Member

cc @brianrob just in case he has any feedback for this idea

@brianrob
Member

It's a great idea. PerfView has an XML and JSON StackSource format, so if you create one of these and name the file with the right extension, it will just show up in PerfView and open in the stack viewer.

PerfView also has another feature that might be of some help here: https://github.com/microsoft/perfview/blob/main/src/PerfView/UserCommands.cs#L945. You can run this by running PerfView.exe UserCommand ImageSize <exefile>. It will produce a file called <exefile>.imageSize.xml, which can be opened by PerfView, and looks like a GC heap snapshot. You will need the PDB to be next to the exe. As an example, I ran this on dotnet.exe, and got this:
[image: PerfView ImageSize view of dotnet.exe]
This is all stackified data, so you can see the connectivity graph.

@kant2002
Contributor

@brianrob I ran ImageSize <exefile> via the UI and received this in the status bar: Error: The PerfViewExtensions\PdbScope.exe file does not exit. ImageSize report not possible. I use PerfView built from source code. Based on this commit, microsoft/perfview@4513598, it was removed.

@kant2002
Contributor

After downloading PdbScope.exe and msvcdis110.dll from that commit, I could run the image creation. I had some issues during creation, which I fixed in microsoft/perfview#1468. I was able to load a WinForms app into it.
[image: WinForms app loaded in PerfView]

I cannot say that I'm satisfied with the speed of this process.

@filipnavara
Member

There's now a semi-official way to analyze this using sizoscope or its cross-platform version. The tutorial is in the GitHub README.

7 participants