
Unmanaged memory growth with DiagnosticsClient #105132

Closed
vaind opened this issue Jul 19, 2024 · 8 comments

@vaind

vaind commented Jul 19, 2024

Description

I'm running a diagnostic session on the program itself (for in-process profiling) and am seeing memory growth that is noticeable in ASP.NET Core apps, although I think it only shows up there because of the number of events being produced.

Reproduction Steps

I've created a simple app that just opens a diagnostic session on itself and ignores the incoming events:

  1. dotnet new webapi --use-controllers -o webapi
  2. dotnet add package Microsoft.Diagnostics.Tracing.TraceEvent --version 3.1.13
  3. dotnet add package Microsoft.Diagnostics.NETCore.Client --version 0.2.510501
  4. set "System.GC.HeapHardLimit": 5242880 in runtimeconfig to make sure the GC runs early (see the snippet after this list)
  5. update Program.cs, see repo: https://github.com/vaind/webapi-mem
  6. dotnet build -c Release
  7. run the program and observe unmanaged memory growth
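
For step 4, the limit can be set in a runtimeconfig.template.json next to the project file. A minimal sketch of that file (the property name and the 5242880 value come from the step above; the surrounding file shape is an assumption):

{
  "configProperties": {
    "System.GC.HeapHardLimit": 5242880
  }
}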

Expected behavior

I'd expect unmanaged memory to be capped at roughly the configured circularBufferMB: 16, plus some baseline for the runtime itself, of course.
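
For reference, the session setup in the repro looks roughly like this (a minimal sketch of the approach, assuming EventPipeEventSource is consumed directly; the actual Program.cs in the linked repo may differ, and the provider name here is only illustrative):

using System;
using System.Diagnostics.Tracing;
using System.Threading.Tasks;
using Microsoft.Diagnostics.NETCore.Client;
using Microsoft.Diagnostics.Tracing;

// Open a diagnostic session on the current process (in-process profiling).
var client = new DiagnosticsClient(Environment.ProcessId);
var providers = new[]
{
    new EventPipeProvider("Microsoft-Windows-DotNETRuntime", EventLevel.Informational),
};

// circularBufferMB: 16 is the cap referred to above.
var session = client.StartEventPipeSession(providers, requestRundown: false, circularBufferMB: 16);
var source = new EventPipeEventSource(session.EventStream);

// Ignore every incoming event; unmanaged memory should stay roughly flat.
source.Dynamic.All += _ => { };
Task.Run(() => source.Process());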

Actual behavior

Memory grows over time:
[screenshot: memory usage graph showing unmanaged memory growing over time]

And after a few hours:
[screenshot: memory usage graph after a few hours, with unmanaged memory still growing]

Regression?

No response

Known Workarounds

No response

Configuration

.NET SDK:
 Version:           8.0.303
 Commit:            29ab8e3268
 Workload version:  8.0.300-manifests.7ccdde63
 MSBuild version:   17.10.4+10fbfbf2e

Runtime Environment:
 OS Name:     Windows
 OS Version:  10.0.22631
 OS Platform: Windows
 RID:         win-x64
 Base Path:   C:\Program Files\dotnet\sdk\8.0.303\

Other information

No response

@dotnet-policy-service bot added the untriaged label (New issue has not been triaged by the area owner) on Jul 19, 2024

Tagging subscribers to this area: @tommcdon
See info in area-owners.md if you want to be subscribed.

@bruno-garcia
Member

Any help on this will be much appreciated.

@filipnavara
Member

I split the repro into a producer and a consumer process, then compiled the event pipe consumer as NativeAOT to get better stack traces from native tools.

There's one stack trace that stands out:

S_P_CoreLib_Interop_Sys__Malloc	webapi	
S_P_CoreLib_System_Runtime_InteropServices_NativeMemory__Alloc_0	webapi	
S_P_CoreLib_System_Runtime_InteropServices_Marshal__AllocHGlobal_0	webapi	
S_P_CoreLib_System_Runtime_InteropServices_Marshal__AllocHGlobal	webapi	
Microsoft_Diagnostics_Tracing_TraceEvent_Microsoft_Diagnostics_Tracing_EventPipeEventMetaDataHeader__GetEventRecordForEventData	webapi	
Microsoft_Diagnostics_Tracing_TraceEvent_Microsoft_Diagnostics_Tracing_EventPipeEventSource__ConvertEventHeaderToRecord	webapi	
Microsoft_Diagnostics_Tracing_TraceEvent_Microsoft_Diagnostics_Tracing_EventPipeEventSource__EventCache_OnEvent	webapi	
Microsoft_Diagnostics_Tracing_TraceEvent_Microsoft_Diagnostics_Tracing_EventPipe_EventCache__SortAndDispatch	webapi	
Microsoft_Diagnostics_Tracing_TraceEvent_Microsoft_Diagnostics_Tracing_EventPipe_EventCache__ProcessEventBlock	webapi	
Microsoft_Diagnostics_Tracing_TraceEvent_Microsoft_Diagnostics_Tracing_EventPipeEventBlock__ReadBlockContents	webapi	
Microsoft_Diagnostics_Tracing_TraceEvent_Microsoft_Diagnostics_Tracing_EventPipeBlock__FromStream	webapi	
Microsoft_Diagnostics_FastSerialization_FastSerialization_Deserializer__ReadObjectDefinition	webapi	
Microsoft_Diagnostics_FastSerialization_FastSerialization_Deserializer__ReadObject	webapi	
Microsoft_Diagnostics_Tracing_TraceEvent_Microsoft_Diagnostics_Tracing_EventPipeEventSource__Process	webapi	
Microsoft_Diagnostics_Tracing_TraceEvent_Microsoft_Diagnostics_Tracing_Etlx_TraceLogEventSource__Process	webapi	

Notably, this native allocation:
https://github.com/microsoft/perfview/blob/f3590351cc5af3a6cbef74fbd12719b05e4bcc9f/src/TraceEvent/EventPipe/EventPipeEventSource.cs#L1173

...may not be freed because the code that converts TraceEvents sets the field to null without ever deallocating it, and the structure itself gets reused:
https://github.com/microsoft/perfview/blob/f3590351cc5af3a6cbef74fbd12719b05e4bcc9f/src/TraceEvent/TraceLog.cs#L3282

I didn't try to verify the theory; there's a lot of code in between the two places. The only thing I know for sure is that the same allocation stack trace keeps showing up in the allocation logs.
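
The suspected pattern, reduced to a hypothetical illustration (this is not the actual TraceEvent code, just the general shape of the bug being described):

using System;
using System.Runtime.InteropServices;

// A reused object holds a pointer to unmanaged memory. Each event allocates a new
// block, but the consumer only clears the managed pointer, so the old block leaks.
class ReusedEventRecord
{
    public IntPtr ExtendedData;

    public void OnEvent(int size)
    {
        // Fresh native allocation for every incoming event.
        ExtendedData = Marshal.AllocHGlobal(size);
    }

    public void ConvertToTraceEvent()
    {
        // The converter "forgets" the pointer without a matching Marshal.FreeHGlobal,
        // so the unmanaged block allocated in OnEvent is never released.
        ExtendedData = IntPtr.Zero;
    }
}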

@dahlbyk
Contributor

dahlbyk commented Sep 18, 2024

@filipnavara it's meant to be freed here:
https://github.com/microsoft/perfview/blob/f3590351cc5af3a6cbef74fbd12719b05e4bcc9f/src/TraceEvent/EventPipe/EventPipeEventSource.cs#L1116

But the instances appear to be cached in _eventMetadataDictionary and never released?

@filipnavara
Member

filipnavara commented Sep 18, 2024

@filipnavara it's meant to be freed here: https://github.com/microsoft/perfview/blob/f3590351cc5af3a6cbef74fbd12719b05e4bcc9f/src/TraceEvent/EventPipe/EventPipeEventSource.cs#L1116

Meant to be, yes. But the eventRecord is passed around, its ExtendedData field is set to null, and then it gets reallocated for every new event in the event pipe. All of that happens with a single instance of EventPipeEventMetaDataHeader.

But the instances appear to be cached in _eventMetadataDictionary and never released?

There's a single instance cached in _eventMetadataDictionary in the repro.

@filipnavara
Member

If my theory is correct then this should fix the leak: microsoft/perfview@main...filipnavara:perfview:memleak-fix
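
The general shape of such a fix, continuing the hypothetical sketch above (the actual change is in the linked branch): release or reuse the previous block before allocating a new one.

public void OnEvent(int size)
{
    // Free the block left over from the previous event before allocating again
    // (alternatively, keep one buffer and only grow it when a larger one is needed).
    if (ExtendedData != IntPtr.Zero)
        Marshal.FreeHGlobal(ExtendedData);
    ExtendedData = Marshal.AllocHGlobal(size);
}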

@noahfalk
Member

noahfalk commented Oct 1, 2024

Fixed. Thanks @filipnavara for the PR in the perfview repo!

@noahfalk noahfalk closed this as completed Oct 1, 2024
@vaind
Author

vaind commented Oct 3, 2024

Thanks @filipnavara, it seems to be fixed in my manual tests too.

@github-actions bot locked and limited the conversation to collaborators on Nov 3, 2024