TraceEvent: Getting allocation stacks on Linux #2057

ocoanet · 2024-07-03T11:52:35Z

I am trying to create a simple OS-independent allocation detection program. The detection program:

1. runs a target program using "dotnet-trace", and
2. analyses the trace file to get allocations and allocation call stacks.

I can get allocation events (ClrTraceEventParser.GCSampledObjectAllocation) by using the allocations keywords. However, I cannot get stack events (ClrTraceEventParser.ClrStackWalk) by using the ClrTraceEventParser.Keywords.Stack keyword. TraceLog is still able to retrieve stack information from the allocation events using TraceLog.ProcessExtendedData, but the code is internal and seems to be Windows-specific. Also, I would like to avoid converting my event pipe files to ETLX because it consumes too much memory and crashes on low-memory containers.

My questions are:

1. What is the effect of the ClrTraceEventParser.Keywords.Stack keyword on Linux?
2. Is it possible to attach stack traces to allocation events on Linux?
3. Is there a way to get stack walk events (ClrTraceEventParser.ClrStackWalk) on Linux?


ocoanet · 2024-07-05T08:02:03Z

I did additional tests on Linux and it turns out that the trace events contain extended data with stack traces. So, ProcessExtendedData is also useful on Linux even though the data structures used in this method seem to be Windows-specific.

I will probably close this issue because it somehow answers most of my questions. However, it would be nice if I could get the confirmation that:

1. The effect of the ClrTraceEventParser.Keywords.Stack keyword is to capture stack traces and include them in the trace event extended data (on Windows and Linux).
2. There is no clean way to read extended data from event pipe traces without using TraceLog (converting the traces to ETLX).

Also, it would be interesting to know if ClrTraceEventParser.ClrStackWalk can be used to get stacks with dotnet-trace.

ocoanet · 2024-07-12T14:15:47Z

I will investigate alternative ways to trace allocations on Linux without dotnet-trace, either by manually creating a diagnostic-port based EventPipeSession or by using CreateProcessForLaunch.

Anyway, both solutions are related to https://github.com/dotnet/diagnostics so I am closing this issue.

brianrob · 2024-07-12T15:43:32Z

Sorry for the delay @ocoanet, I have been out of the office. The Stack keyword isn't used on Linux, as its implementation is specific to ETW on Windows. EventPipe handles stack capture differently than ETW, and will capture and store the stacks differently as well. ProcessExtendedData gets fully filled in for ETW traces, but not for any of the Linux tracing mechanisms. EventPipe (and dotnet-trace) will capture stacks for allocation events as long as the events are enabled - they will be managed-only stacks, but this is currently the only mechanism on Linux that can get you allocation stacks. I think the solution you've landed on (EventPipeSession) sounds right to me.

ocoanet · 2024-07-13T14:00:08Z

Thank you very much for your response.

To be honest I still do not know how to get stack traces from EventPipeEventSource.

Here are my options:

1. Start the process using dotnet-trace, then convert the nettrace file to ETLX using TraceLog.CreateFromEventPipeDataFile and then parse the file using TraceLog. In this case the stack traces can then be loaded using the TraceEvent.CallStack() extension method, which is only implemented for events created by TraceLog. It works but I would like to do "live" tracing.
2. Start the process using dotnet-trace, then parse the nettrace file using EventPipeEventSource. In this case I cannot find a way to get the stack traces. I suspect that the GCSampledObjectAllocationTraceData events contain the stack traces in the event record "ExtendedData" but there is no public API to read them.
3. Use internal methods with IgnoresAccessChecksTo to start a "live" event pipe session on a new process (i.e.: mimic dotnet-trace implementation), then create a stream-based EventPipeEventSource. Again, I cannot find a way to get the stack traces, but at least I get live tracing with allocation detection.

Note that I cannot start the process normally before creating the EventPipeEventSource because the allocation keywords must be enabled before the application starts.

It seems that my best option is to go deeper in the "evil" path and read the stack traces from the internal event record "ExtendedData".

brianrob · 2024-07-14T21:17:50Z

Gotcha. Let's make sure you have a good path forward. Option 1 above is what you should do. The raw sources (EventPipeEventSource, ETWTraceEventSource, etc.) don't do the heavy lifting to enable stacks. This work is done for you when you convert to a TraceLog.
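
A minimal sketch of Option 1, for readers who want to see the shape of it. It assumes the Microsoft.Diagnostics.Tracing.TraceEvent package is referenced and that a nettrace file was already collected with the allocation keywords enabled; the file name `allocations.nettrace` and the class name `AllocationStacks` are placeholders, not anything from this thread.

```csharp
using System;
using Microsoft.Diagnostics.Tracing;
using Microsoft.Diagnostics.Tracing.Etlx;
using Microsoft.Diagnostics.Tracing.Parsers.Clr;

internal static class AllocationStacks
{
    private static void Main(string[] args)
    {
        // Placeholder input: a .nettrace file produced by dotnet-trace
        // with the allocation keywords enabled at startup.
        string nettracePath = args.Length > 0 ? args[0] : "allocations.nettrace";

        // Converts the EventPipe file to ETLX on disk and returns the ETLX path.
        // This is the memory-hungry step discussed above.
        string etlxPath = TraceLog.CreateFromEventPipeDataFile(nettracePath);

        using var traceLog = new TraceLog(etlxPath);
        TraceLogEventSource source = traceLog.Events.GetSource();

        source.Clr.GCSampledObjectAllocation += (GCSampledObjectAllocationTraceData data) =>
        {
            Console.WriteLine(
                $"{data.TypeName}: {data.ObjectCountForTypeSample} object(s), " +
                $"{data.TotalSizeForTypeSample} bytes");

            // CallStack() only returns stacks for events that come from a TraceLog.
            // It can be null when no stack was recorded for the event.
            for (TraceCallStack frame = data.CallStack(); frame != null; frame = frame.Caller)
            {
                Console.WriteLine($"    at {frame.CodeAddress.FullMethodName}");
            }
        };

        // Replays the events in the ETLX file through the callbacks registered above.
        source.Process();
    }
}
```

Frames are walked from the allocation site outwards through `Caller`. On Linux these are expected to be managed-only stacks, as noted earlier in the thread.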

If you want to do a live session, there is very new support for this, but it's not available in a released TraceEvent yet. It was just merged - #1867. I expect this to be in the next version of TraceEvent.

I see that you're using GCSampledObjectAllocationTraceData. This is a smart sampling mechanism that takes over the fast allocation helpers - meaning that all allocations will be slow allocations. You might consider using the AllocationTick event instead - it's sampled every 100K on each thread, but has less of a performance impact.
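
For comparison, a small sketch of consuming AllocationTick events and totalling the sampled bytes per type. It reads the nettrace file directly with EventPipeEventSource, so no ETLX conversion is needed, but it also yields no stacks (the raw source does not resolve them); for stacks, the same `GCAllocationTick` subscription could be attached to a TraceLog-backed source instead. The file name and class name are placeholders, and the trace is assumed to have been collected with the runtime provider's GC keyword at verbose level.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.Diagnostics.Tracing;
using Microsoft.Diagnostics.Tracing.Parsers.Clr;

internal static class AllocationTickSummary
{
    private static void Main(string[] args)
    {
        // Placeholder input: a .nettrace file that contains AllocationTick events.
        string nettracePath = args.Length > 0 ? args[0] : "allocations.nettrace";

        // Sampled bytes per type name. These are samples, not exact totals.
        var sampledBytesByType = new Dictionary<string, long>();

        using (var source = new EventPipeEventSource(nettracePath))
        {
            source.Clr.GCAllocationTick += (GCAllocationTickTraceData data) =>
            {
                string typeName = string.IsNullOrEmpty(data.TypeName) ? "(unknown)" : data.TypeName;
                sampledBytesByType.TryGetValue(typeName, out long current);
                sampledBytesByType[typeName] = current + data.AllocationAmount64;
            };

            // Reads the whole file and dispatches events to the callback above.
            source.Process();
        }

        // Print the ten types with the most sampled bytes.
        foreach (var entry in sampledBytesByType.OrderByDescending(e => e.Value).Take(10))
        {
            Console.WriteLine($"{entry.Value,12} bytes  {entry.Key}");
        }
    }
}
```

Because the event is sampled, types that allocate rarely may not appear at all, which is why GCSampledObjectAllocation remains the better fit when every allocation must be detected.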

ocoanet · 2024-07-15T13:01:36Z

Thank you for your answer.

> I see that you're using GCSampledObjectAllocationTraceData. This is a smart sampling mechanism that takes over the fast allocation helpers - meaning that all allocations will be slow allocations. You might consider using the AllocationTick event instead - it's sampled every 100K on each thread, but has less of a performance impact.

Yes, I specifically want to detect all allocations and the performance impact is not important in my use case (I am not profiling the production instances).

ocoanet closed this as completed Jul 12, 2024

ocoanet reopened this Jul 13, 2024

ocoanet closed this as completed Jul 15, 2024
