Make DiagnosticClient start the client process to capture early stages events #1810

Open
raffaeler opened this issue Dec 10, 2020 · 17 comments

Comments

@raffaeler

Background and Motivation

The current constructor for DiagnosticClient only accepts the process id.
However, certain events occur only at the very early stages of startup. I am not sure whether starting the process suspended (nontrivial to do cross-platform) would help, since the diagnostic server would be suspended as well.

Proposed Feature

The simplest solution would be to make DiagnosticClient accept the executable file name and its arguments, so that it can attach to the child process at the earliest stage and capture the very first events.

If this is already possible, please explain how.

@raffaeler raffaeler added the enhancement New feature or request label Dec 10, 2020
@sebastienros
Member

I am facing the same issue, and have to place the events that happen at the very beginning of my application as late as possible.
Instead of providing an executable file, it could wrap a Process instance since it has everything required to start the process.
A related issue, though, is that client.StartEventPipeSession(providerList) will fail because the named pipe takes some time to be created after the app has started. In my case I need a retry loop to detect when it is ready; otherwise I get a ServerNotAvailableException.
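
The retry loop described above could look roughly like the following. This is a minimal sketch, not the code used in Crank: the Retry.Until helper, its parameters, and the delay and timeout values are all hypothetical. It uses only the base class library, so the diagnostics call is passed in as a delegate.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class Retry
{
    // Hypothetical helper: calls `attempt` until it succeeds or `timeout` elapses,
    // sleeping `delay` between tries. Only exceptions accepted by `isTransient`
    // are retried; anything else, or a failure after the timeout, propagates.
    public static T Until<T>(Func<T> attempt, Func<Exception, bool> isTransient,
                             TimeSpan delay, TimeSpan timeout)
    {
        var elapsed = Stopwatch.StartNew();
        while (true)
        {
            try
            {
                return attempt();
            }
            catch (Exception ex) when (isTransient(ex) && elapsed.Elapsed < timeout)
            {
                Thread.Sleep(delay);
            }
        }
    }
}

// Usage with the diagnostics client (values are illustrative):
// var session = Retry.Until(
//     () => client.StartEventPipeSession(providerList),
//     ex => ex is ServerNotAvailableException,
//     TimeSpan.FromMilliseconds(10), TimeSpan.FromSeconds(5));
```

Note that a loop like this only shortens the gap between process start and session start. Events raised before the session is established can still be missed.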

@raffaeler
Author

@sebastienros The .NET 5 version of dotnet-trace can already launch the process itself, using an internal helper to do so.

But I am not sure it is able to capture the very first events. The early events are fundamental.
I tried to run the process with the Process class and start the DiagnosticClient immediately afterwards, but this just hangs for the reason you mentioned.
Hopefully @sywhang can clarify better than me what's happening and whether a workaround exists.

@sebastienros
Member

@raffaeler Here is the code I use to get it working: https://github.com/dotnet/crank/blob/master/src/Microsoft.Crank.Agent/Startup.cs#L3811-L3871

The delay could probably be smaller, from experience.
This code is started right after process.Start() and so far I have been able to catch the early events.

@sebastienros
Member

Unless you are talking about the cli tools only, not the APIs, in which case the name of the exe would make more sense.

@raffaeler
Author

I see that you retry in a loop, but this is not the best option in my case. I really need to capture the very first moments, and I am afraid of losing something.

@sywhang
Contributor

sywhang commented Dec 10, 2020

Thanks @raffaeler! As you mentioned, capturing early runtime events isn't possible with the public set of APIs of the diagnostics client library.

That being said, this is already possible with .NET 5 runtime changes; we just haven't standardized on a public API yet - this is one of the items that we are going to be working on in the next few months (See #1794). I will actually use this issue to track the work on the diagnostics client library side.

@raffaeler
Author

Great to hear that @sywhang
Could you be more specific about the .NET 5 APIs that would allow me to do that (even without DiagnosticClient)?
I understand that I will have to wait for the next wave for the diagnostics to be harmonized and simplified, but I would like to get my hands on the lower-level API just to understand the machinery and solve some current problems.

Thanks!

@sywhang
Contributor

sywhang commented Dec 10, 2020

@raffaeler of course.

In .NET 5 we added a "diagnostic port" feature to the runtime, which allows the runtime to connect to a port. You can think of it as the reverse of the "diagnostics server" mechanism the runtime has had since .NET Core 3.0. When you launch a process with the environment variable DOTNET_DiagnosticPorts=<path-to-port>, it will attempt to connect to the port and block at startup.

To make use of this, you need to do the following:

  1. Create a port and listen for connections.
  2. Launch a child app with the environment variable described above.
  3. Start an EventPipe session.
  4. Tell the child app to continue executing by sending the ResumeRuntime command.
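
Step 2 of the list above can be sketched with the base class library alone. The SuspendedLaunch class, its Create method, and the parameter names are hypothetical; only the DOTNET_DiagnosticPorts variable comes from the description above. Steps 1, 3 and 4 need the diagnostics client library and are not shown.

```csharp
using System.Diagnostics;

static class SuspendedLaunch
{
    // Hypothetical helper: builds the start info for a child .NET app that
    // connects to the diagnostic port at `portPath` during startup.
    public static ProcessStartInfo Create(string exePath, string arguments, string portPath)
    {
        // UseShellExecute must be false for custom environment variables to apply.
        var startInfo = new ProcessStartInfo(exePath, arguments) { UseShellExecute = false };
        startInfo.Environment["DOTNET_DiagnosticPorts"] = portPath;
        return startInfo;
    }
}

// Usage (paths are placeholders); listen on the port before starting the child:
// using var child = Process.Start(SuspendedLaunch.Create("dotnet", "MyApp.dll", portPath));
```

The tracer should already be listening on the port before the child starts, since the child tries to connect during startup.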

We're trying to make a public API that will let you do this with a few lines of code (just like we did for starting an EventPipe session in .NET Core 3.1).

You can read more about this in our docs on diagnostic ports as well.

Hope that helps!

@raffaeler
Author

Thank you @sywhang. I now better understand the code I have been reading over the past few hours.
I'll give it a try and come back here if needed.

@raffaeler
Author

@sywhang I was able to make it work, but I need a small clarification please.

Question: I expected to be able to call ResumeRuntime after creating the EventPipeEventSource, but if I do that, the creation hangs indefinitely. Isn't there a risk of losing events on startup?

What I am doing:

  1. Create and start the DiagnosticServer (cloned from this repo)
  2. Call AcceptAsync (getting just the task)
  3. Run the debuggee with the appropriate diagnostic port in the environment (suspended)
  4. Await the Task above and get the endpoint
  5. Create the DiagnosticClient with the endpoint
  6. On a different thread run the following:
    A. DiagnosticClient.StartEventPipeSession
    B. DiagnosticClient.ResumeRuntime
    C. Create the EventPipeEventSource using the session.EventStream obtained before
    D. Subscribe to the events

Thank you

@sywhang
Contributor

sywhang commented Jan 26, 2021

Hi @raffaeler,

Apologies for the delay! I was away on vacation at the end of the year and this slipped off my radar after I came back...

> I expected to be able to call ResumeRuntime after creating the EventPipeEventSource, but if I do it, it hangs in the creation indefinitely.

The constructor of EventPipeEventSource blocks until the first event is read from the pipe. What you want is the following order:

  1. Start EventPipe session
  2. ResumeRuntime
  3. Create EventPipeEventSource

You won't lose events on startup because they are not written directly to the port. They're written to an internal buffer in the runtime, which holds them until they are flushed. As long as the buffer size is not set ridiculously small, and you started the session using the diagnostic ports, it should be early enough in the runtime startup to capture the interesting runtime events at startup.
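
The ordering above can be sketched as follows. This is an illustration under stated assumptions, not an official sample. The StartupTrace class and its Collect method are hypothetical. The code needs the Microsoft.Diagnostics.NETCore.Client and TraceEvent packages, and it assumes a build of the client library in which ResumeRuntime is callable (as noted in this thread, it was not public at the time). The client must already be connected to a runtime that is suspended on a diagnostic port.

```csharp
using System;
using Microsoft.Diagnostics.NETCore.Client;
using Microsoft.Diagnostics.Tracing;

static class StartupTrace
{
    // Hypothetical helper: `client` is assumed to be connected to a suspended runtime.
    public static void Collect(DiagnosticsClient client, EventPipeProvider[] providers)
    {
        // 1. Start the session while the runtime is still suspended.
        using EventPipeSession session = client.StartEventPipeSession(providers);

        // 2. Resume the runtime so that it starts producing events.
        //    Assumption: ResumeRuntime is accessible in your build of the library.
        client.ResumeRuntime();

        // 3. Only now create the source; its constructor blocks until data arrives.
        using var source = new EventPipeEventSource(session.EventStream);
        source.Clr.All += e => Console.WriteLine($"{e.TimeStamp:O} {e.EventName}");

        // Blocks until the event stream ends, for example when the process exits.
        source.Process();
    }
}
```

Creating the source before step 2 reproduces the hang described above: the constructor waits for data that a suspended runtime never sends.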

Hope that helps!

@raffaeler
Author

Thank you @sywhang
I got it working some time ago, more or less as you describe, but I still need to use the cloned sources because some of the methods are not public.

Two questions:

  1. Is there any plan to make the relevant methods of the reverse protocol (ResumeRuntime) public in the library so that we can avoid cloning/forking it?
  2. Is there any plan to release a tool showing the assemblies being loaded (or failing to load) in a .NET 5 process? (I just wrote one using this library)

@josalem
Contributor

josalem commented Jan 27, 2021

  1. We're hoping to make a public API for that functionality eventually. The tools already have this functionality in them if the data you're looking for can be collected by dotnet-trace, dotnet-counters, etc.
  2. I'm not sure we will release a separate tool for just that purpose. You can collect loader events on startup using the --diagnostic-port option on dotnet-trace with the loader keyword. When you open that trace in PerfView or VS you should see all the loader events. We're looking at doing live and post-mortem reports of trace sessions that might report these things to the console but haven't committed to anything yet.

@raffaeler
Author

Thank you @josalem, but I need to use that functionality from my own tool, so making public the methods required to use the reverse protocol is essential.

With regard to dotnet-trace, I tried it as soon as .NET 5 was released, but I could not make it work. I will try again, but I definitely want to integrate this functionality into my visual tool, as loading issues are all too common.

@sywhang
Contributor

sywhang commented Jan 27, 2021

> Is there any plan to make the relevant methods of the reverse protocol (ResumeRuntime) public in the library so that we can avoid cloning/forking this one?

As John mentioned, this is one of the work items that we've committed to working on. At the moment it's not the highest priority because we have some work items we are trying to complete for .NET 6, but we hope to get started with this some time soon, starting with the API design.

> Is there any plan to release a tool showing the assembly being loaded (or failed) from a .net 5 process? (I just wrote this one using this library)

We are currently in the process of implementing dotnet-trace monitor, which will dump the loader events to the console. You should be able to use it once it ships.

@raffaeler
Author

Thank you for the good news @sywhang
The first point should be relatively quick to fix. I am currently running a fork of the DiagnosticClient successfully, where I just had to make a few methods public. But maybe I am missing other things that could be useful as well.

With regard to dotnet-trace monitor, do you plan to integrate the functionality into the new dotnet-monitor (web server) as well?

@sywhang
Contributor

sywhang commented Mar 21, 2021

Sorry, this issue slipped out of my inbox. dotnet-monitor is in a separate repo now -- https://github.com/dotnet/dotnet-monitor -- so perhaps we can file a feature request there. Currently we're quite busy trying to ship the planned features in a first public release, but this is something they could consider adding in subsequent releases of the tool.

@tommcdon tommcdon modified the milestones: 6.0.0, 7.0.0 Jun 21, 2021
@tommcdon tommcdon modified the milestones: 7.0.0, 8.0.0 Sep 12, 2022