Feature request: collect stack dump from containerd #3918

Closed
samuelkarp opened this issue Feb 4, 2025 · 7 comments · Fixed by #3926

Comments

@samuelkarp
Contributor

containerd can produce a stack dump, which is helpful when troubleshooting issues that may turn out to be bugs in containerd. The stack dump can be triggered by sending SIGUSR1 to the main containerd daemon. The dump is then written both to the journal (via containerd's normal logger) and to a file in /tmp named containerd.<pid>.stacks.log (where <pid> is the PID of containerd).

It would be helpful if the containerd plugin could trigger the stack dump and collect it.
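
For illustration, the trigger-and-collect flow described above could be sketched roughly as follows. This is a minimal sketch and not the sos plugin's code: the function names, the `tmp_dir`, `timeout`, and `kill` parameters, and the polling approach are all assumptions made for the example. Only the SIGUSR1 signal and the /tmp/containerd.<pid>.stacks.log naming come from the description above.

```python
import os
import signal
import time
from pathlib import Path


def stack_dump_path(pid, tmp_dir="/tmp"):
    """Build the dump path described above: <tmp_dir>/containerd.<pid>.stacks.log."""
    return Path(tmp_dir) / f"containerd.{pid}.stacks.log"


def _mtime_ns(path):
    """Return the file's modification time in ns, or None if it does not exist."""
    try:
        return path.stat().st_mtime_ns
    except FileNotFoundError:
        return None


def trigger_stack_dump(pid, tmp_dir="/tmp", timeout=5.0, kill=os.kill):
    """Send SIGUSR1 to `pid`, then wait for a new or updated dump file.

    Hypothetical helper: `kill` is injectable so the logic can be exercised
    without signalling a real daemon. Returns the dump path, or None if no
    new or updated file was seen within `timeout` seconds.
    """
    path = stack_dump_path(pid, tmp_dir)
    before = _mtime_ns(path)  # a dump from an earlier run may already exist
    kill(pid, signal.SIGUSR1)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        now = _mtime_ns(path)
        if now is not None and now != before:
            return path
        time.sleep(0.1)
    return None
```

Signalling the daemon generally requires root. A file appearing does not by itself guarantee that the dump has been fully written, so a real collector may want to wait briefly before copying it.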

@TurboTurtle
Member

Certainly sounds reasonable. I can work something up this week most likely, unless you're wanting to do this yourself and opening this issue just for tracking.

@samuelkarp
Contributor Author

I'm not sure I'll have bandwidth soon (and Python is not a language I'm super familiar with) so I'd certainly appreciate it if you made the change. Thanks!

TurboTurtle added a commit to TurboTurtle/sos that referenced this issue Feb 11, 2025
Adds a new `stackdump` option that, if enabled, will send SIGUSR1 to
the root containerd process(es) to trigger writing of stack dump logs by
the daemon, then marks those logs for collection.

Resolves: sosreport#3918

Signed-off-by: Jake Hunsaker <[email protected]>
@TurboTurtle
Member

@samuelkarp I just pushed #3926 for this, I think it covers the request but please take a look when you have a moment and let us know if there is anything missing.

@samuelkarp
Contributor Author

I think that looks like it would work. To invoke it, a user would need to pass --plugin-option containerd.stackdump? The only other bit is whether we also want stack dumps from the shims (containerd-shim-runc-v2), which would just be written to the journal, but that seems like a rarer use case.

@TurboTurtle
Member

> To invoke it, a user would need to pass --plugin-option containerd.stackdump?

Yup.

> The only other bit is if we also wanted to have stack dumps from the shims (containerd-shim-runc-v2)

That was actually what prompted me to make the changes to get_process_pids(), so we weren't signalling those processes 🙃
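
To illustrate the distinction being discussed (signal the root daemon, leave the shims alone), here is a minimal sketch of exact-name PID matching. It is not the sos `get_process_pids()` implementation. `read_proc_names` and `exact_name_pids` are hypothetical helpers, and reading names from /proc/<pid>/comm is an assumption made for the example.

```python
import os


def read_proc_names(proc_root="/proc"):
    """Map PID -> process name, read from <proc_root>/<pid>/comm.

    Note: the kernel truncates comm to 15 characters, so a shim such as
    containerd-shim-runc-v2 shows up as "containerd-shim".
    """
    names = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                names[int(entry)] = f.read().strip()
        except OSError:
            # The process exited (or is unreadable) between listing and reading.
            continue
    return names


def exact_name_pids(names, target="containerd"):
    """Return sorted PIDs whose name equals `target` exactly.

    Exact equality, not a prefix or substring match, is what keeps
    "containerd-shim" processes out of the result.
    """
    return sorted(pid for pid, name in names.items() if name == target)
```

With this approach, `exact_name_pids(read_proc_names())` would return only the daemon's PID(s), while a substring match on "containerd" would also pick up every shim.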

@samuelkarp
Contributor Author

> That was actually what prompted me to make the changes to get_process_pids(), so we weren't signalling those processes 🙃

That's still probably valuable; getting a stack dump from each shim would be fairly noisy and only useful in rare cases. Maybe putting that behind a second option would be better than having everything behind containerd.stackdump?

TurboTurtle added a commit to TurboTurtle/sos that referenced this issue Feb 12, 2025
TurboTurtle added a commit that referenced this issue Feb 14, 2025
Adds a new `stackdump` option that, if enabled, will send SIGUSR1 to
the root containerd process(es) to trigger writing of stack dump logs by
the daemon, then marks those logs for collection.

Resolves: #3918

Signed-off-by: Jake Hunsaker <[email protected]>
@samuelkarp
Contributor Author

@TurboTurtle Thank you!
