Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

java: async-profiler: use dump command for truly continuous profiling #569

Open
Jongy opened this issue Nov 1, 2022 · 2 comments
Open
Assignees
Labels
defined-and-prioritized Tickets that have fully defined the desired outcome & are prioritized to be developed. enhancement New feature or request runtime/java

Comments

@Jongy
Copy link
Contributor

Jongy commented Nov 1, 2022

We profile with async-profiler by issuing start followed by stop (which stops profiling and dumps the collected traces).

Since async-profiler/async-profiler@868bfec (1~ year) there is a dump command which dumps the collected traces WITHOUT stopping. I was under the impression that it does stop, but realized now during routine reading of the code that it doesn't.

This has several benefits:

  1. More accurate profiling - we are always profiling and not intermittently which may introduce skewing.
  2. Possibly better performance - async-profiler needn't perform repeated init/cleanup work due to repeated start/stop (that I assume, didn't verify though).

It's similar to the change we did with perf here.

What I propose is, we use the dump command, and upon next iterations, the JavaProfiler will keep the list of "enabled" processes (we can keep the AsyncProfiledProcess itself) and instead of re-starting, we can only perform the status command to verify AP is truly running (or just skip it and trust AP to be running), wait the interval then issue another dump.
We will need to change the cleanup code - currently there is no cleanup done outside of the scope of _profile_ap_process. I figure we could do something similar to SystemProfiler.stop, which stops ongoing perf.map generation in running NodeJS processes in attach-maps mode - we can stop all active APs in JavaProfiler.stop (similarly to clean_up_node_maps called from SystemProfiler.stop).

Please add relevant tests that use the continuous feature over a few snapshot cycles of gProfiler, and prove that data is sent produced properly each time.

@Jongy Jongy added enhancement New feature or request runtime/java labels Nov 1, 2022
@Jongy Jongy added the defined-and-prioritized Tickets that have fully defined the desired outcome & are prioritized to be developed. label Jan 31, 2023
@Jongy
Copy link
Contributor Author

Jongy commented Feb 4, 2023

Note: we use the timeout parameter of AP to ensure it won't remain running forever, in case gProfiler itself gets killed / fails to stop AP.
If we move to continuous mode, we'll need to ensure AP will not remain running forever, consuming CPU & memory from the target process, if gProfiler is not alive. One option I can think of, is making timeout refresh itself every time we run the dump command. Then, if gProfiler dies / fails to dump, AP will stop itself after a short while. It might already be the behavior in AP code (regarding timeouts & dump), I didn't check

@marcin-ol
Copy link
Contributor

Timeout timer doesn't refresh itself. Async-profiler sets it in two cases:

  • regular timeout,
  • and loop mode.
    I thought for a while we could reuse loop mode, but that's designed to stop()/ start() at the end of each iteration.

It will probably be necessary to extend handling of timeout also for the dump action. In that case the timer mechanism might need work as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
defined-and-prioritized Tickets that have fully defined the desired outcome & are prioritized to be developed. enhancement New feature or request runtime/java
Projects
None yet
Development

No branches or pull requests

2 participants