-
Notifications
You must be signed in to change notification settings - Fork 717
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TraceLog streaming/in-memory EventPipe support #1867
TraceLog streaming/in-memory EventPipe support #1867
Conversation
@microsoft-github-policy-service agree |
@vaind, thanks for working on this. I think at a high level this should work. A couple of thoughts:
|
Thanks for getting back to me, it indeed seems to work from my testing in the Sentry profiling SDK, with some quirks so far (I didn't get stack trace method resolution working yet, but from my understanding it should work because all the info has already come from the runtime). Also the ActivityComputer (and others) don't seem to work when streaming at the moment so that would also need some tuning I guess. I was able to get around them for now but I think they're useful so I'll likely get back to them in the future.
Yes, I've made the change by reusing the same code, really, with a minor change of setting up the
e.g the following line is missing from the captured output but is present in the
Also, all the custom events are missing, as present in I'm pretty sure it's a test-code issue, because as I've mentioned earlier, the overall approach seems to work in general, based on my changes to the profiling code in the Sentry SDK which now starts the profiler early in the process and slices profiles when actually needed (drops the other samples if not currently needed). I think this functionality will need better tests here, instead of trying to reuse what was already present in |
So the issue I'm having with stacktrace info not being available is due to missing modules & method info because the rundown on a session only happens when a session is stopped, while I'm trying to access this info while the sample-profiler is running (on each sample). I've raised an inquiry for this issue: dotnet/runtime#86103 I was wondering whether I could start & stop a session, in the beginning, to force the rundown to happen and then use these events to feed TraceLog. Afterwards, the runtime information would rely on the runtime provider to get updates. @brianrob do you think this could work, conceptually or do you see any issues that would need to be overcome (or prevent this completely)? |
3140653
to
031ce2d
Compare
So I've verified this works fine, I'll just need to add test cases to this repo so that it's covered here. |
927865d
to
8a3ac13
Compare
@brianrob Any chance you could have a first look at |
@vaind, apologies for not getting to this sooner. I am reviewing your code. |
@vaind, the concept here looks good. One thing I think we will need to resolve before this is ready for review is the rundown design. Your instinct to trigger a rundown at the beginning of the trace is the right one. I was going to recommend that you use the @davmason, do you know if rundown is configurable at all? If I recall correctly, end rundown is hardcoded at the end of the session. Is that right? I'm trying to figure out the best option for @vaind to trigger a start rundown at the beginning of streaming so that @vaind, once this is resolved, then I think it will make sense to review this more. I saw some diffs that I want to examine further, but I'm not on a great network connection today and so I don't have access to a quality diff tool. |
Thanks for getting back to me @brianrob. And yes, there doesn't seem a way to trigger a rundown on the same session. I've discussed that in the runtime issue, I've created two weeks ago: dotnet/runtime#86103 |
8a3ac13
to
0aeb827
Compare
Rundown is not currently configurable with EventPipe. You are remembering correctly, it is hardcoded to run at the end of the session. The internal code has the option to configure whether to do rundown or not, but we don't expose it over the IPC commands so customers can't turn it off or on. It would be fairly easy to expose the option to have rundown or not, and I don't think it would be that much work to have it at the start of a trace. I think we could reuse all the existing code |
Thanks @davmason. Sounds like we should keep dotnet/runtime#86103 open to track making it possible to configure rundown via IPC, but for now, hacking things a bit here to use @vaind, once you're ready, go ahead and mark this as ready for review and I'll take a more thorough pass on it. |
I've done so. It does need tests but I want to make sure the approach and APIs are OK-ish before trying to devise tests for them. |
P.S. I've tested it as a consumer in Sentry .NET SDK. What I meant by needing tests is that it needs tests in this repo. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your patience. Here's a first cut at the review. I think we may have a few iterations, but overall, this is looking good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made some changes as requested and replied to some questions.
Keeping the testing open until the questions are resolved.
@brianrob Have you been able to finish reviewing this? Any more concerns or do you think it would be acceptable after adding tests for the changes/new APIs? Also, do you have any preferences for the testing approach? |
@brianrob this would really help unblock some work we want to do for .NET here at Sentry. Another round of review much appreciated |
Hi @brianrob, I've updated this PR with the latest changes from the main branch and all the CI jobs pass. Is there anything I should do to so that these changes can eventually land? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vaind thank you for your patience on this and for all of the work that you've done thus far. I think we're getting close. I've added a few comments here, but I don't think any are big.
I see the following as the path to merge:
- @noahfalk, can I get you to give this a review please? This is for livesession support for EventPipe.
- I would like to validate that we don't regress live session support for ETW traces. Have you done any testing in this space? I know we don't have tests here in the repo, but presumably, we can do some adhoc testing, similar to https://github.com/brianrob/examples/tree/main/src/realtime-session-with-stacks.
Assuming we can resolve this, I think we'll be ready to merge (or at least know what we need to do).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looked fine to me, but I warn my knowledge of the TraceLog portion of TraceEvent is limited so I wouldn't read too much into it. If there were anything subtly wrong I don't expect I would catch it without spending a while researching the internals of TraceLog.
Thanks @noahfalk. No worries - I just wanted to get another set of eyes on this, and also to make you aware of the functionality. |
wooo thanks @noahfalk ! Getting there |
418dd93
to
5153769
Compare
@brianrob thanks for the review, somehow I've missed it until now. I've addressed your requests and updated the branch. Let me know if there's anything else needed. |
P. S. besides review changes, there was one more fix |
This PR had its 1 year anniversary 😅 It's got all the review points addressed. Could we please get some help pushing this through 🙏 |
@bruno-garcia too bad that Elon tanked Twitter 😝 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good. One oustanding issue and then I think we're about ready to merge.
Yup! Happy Anniversary? |
FYI I'm investigating an unbound memory growth when streaming EventPipe. At first, I suspected the cause is this On the other hand, if I run the same app with |
I don't expect the It's hard to say where the unmanaged memory growth is coming from. Does it repro in a minimal repro app? It might be worth looking at a GC trace (GCStats in PerfView) to see if the GC heap is increasing in size and this is just fragmentation being held, or if there is truly a native leak. Either way, I am OK taking this change as is, and we can address this in a follow-up. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your patience and for bringing this across the finish line!
I'm trying to make changes to enable
TraceLog
use without creating an ETLX file from an EventPipe (nettrace). I've taken the existing ETW streaming support as a baseline and tried to adapt that. However, I'm having trouble getting the newly added tests to pass because the events coming from TraceLog don't match what I'd get fromEventPipeEventSource
(the previous test case,Streaming
in the same file).@brianrob maybe if you could have a look at my changes, you could spot an issue what I'm doing wrong (or not doing and I should be) to get this working properly? Or is there some bigger problem I'm missing completely that prevents this from happening?
The use-case I'm going after is a continuously running sampling profiler that only samples as needed (because actually starting the profiler takes a lot of time in the CLR so I want to do it only once).
This is a follow up, though not strictly an implementation of, #1829