Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segfault in finalization with LibGpiodDriver #1849

Closed
rtoijala opened this issue Apr 14, 2022 · 8 comments · Fixed by #2311
Closed

Segfault in finalization with LibGpiodDriver #1849

rtoijala opened this issue Apr 14, 2022 · 8 comments · Fixed by #2311
Labels
bug Something isn't working Priority:1 Work that is critical for the release, but we could probably ship without Status: In PR

Comments

@rtoijala
Copy link

Describe the bug

Leaking a GpioController that uses LibGpiodDriver and has a pin open can segfault due to the pin being released after the chip has been destroyed. This appears to be a race condition and does not occur always.

GpioController creates a SafeChipHandle for the chip and a SafeLineHandle for the pin/line. Destroying the chip invalidates the line, since libgpiod frees the line's memory when destroying the chip. I suspect that the reason for the crash is that during finalization, the runtime does not know that the chip should be destroyed last (or that the line should not be freed if the chip has already been freed).

Steps to reproduce

The code below reproduces the segfault within a few tries.

GpioController controller1 = new GpioController(PinNumberingScheme.Logical, new LibGpiodDriver(4));
controller1.OpenPin(10, PinMode.Output);
GpioController controller2 = new GpioController(PinNumberingScheme.Logical, new LibGpiodDriver(4));
controller2.OpenPin(11, PinMode.Output);
GpioController controller3 = new GpioController(PinNumberingScheme.Logical, new LibGpiodDriver(4));
controller3.OpenPin(12, PinMode.Output);
GpioController controller4 = new GpioController(PinNumberingScheme.Logical, new LibGpiodDriver(4));
controller4.OpenPin(13, PinMode.Output);
GpioController controller5 = new GpioController(PinNumberingScheme.Logical, new LibGpiodDriver(4));
controller5.OpenPin(14, PinMode.Output);

for (int i = 0; i < 10; i++)
{
    GC.Collect();
    GpioController controller6 = new GpioController(PinNumberingScheme.Logical, new LibGpiodDriver(4));
    controller6.OpenPin(15, PinMode.Output);
    controller6.ClosePin(15);
    controller6.Dispose();
    GC.Collect();
    Thread.Sleep(20);
}

Expected behavior

The GPIO chip and line/pin are destroyed/released correctly. No segfault.

Actual behavior

Segfault.

See the attached log, generated with GDB. Unfortunately I do not have debug symbols for dotnet itself, but the stack trace for libgpiod is visible.

To summarize, what happens is:

  1. The pins 10, 11, 12, 13, and 14 are opened (gpiod_chip_get_line)
  2. The leaked chip controller2 is destroyed, which also frees the line struct at 0xaaaaaae0dbb0.
  3. The chip controller6 is destroyed. This is just noise.
  4. The line struct at 0xaaaaaae0dbb0 from controller2 is released again, causing a crash because the memory has already been reused and overwritten.

crash.txt

Versions used

dotnet --info on the build machine (self-contained build for linux-arm64)

.NET SDK (reflecting any global.json):
 Version:   5.0.407
 Commit:    1a1785612e

Runtime Environment:
 OS Name:     ubuntu
 OS Version:  20.04
 OS Platform: Linux
 RID:         ubuntu.20.04-x64
 Base Path:   /home/user/dotnet/sdk/5.0.407/

Host (useful for support):
  Version: 5.0.16
  Commit:  1016676966

.NET SDKs installed:
  5.0.407 [/home/user/dotnet/sdk]

.NET runtimes installed:
  Microsoft.AspNetCore.App 5.0.16 [/home/user/dotnet/shared/Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 5.0.16 [/home/user/dotnet/shared/Microsoft.NETCore.App]

To install additional .NET runtimes or SDKs:
  https://aka.ms/dotnet-download

System.Device.Gpio version 2.1.0

@rtoijala rtoijala added the bug Something isn't working label Apr 14, 2022
@ghost ghost added the untriaged label Apr 14, 2022
@pgrawehr
Copy link
Contributor

Thanks for reporting this. If I understand correctly, the problem does not occur if the instances are not leaked?

@rtoijala
Copy link
Author

That is correct. I have not seen the crash without leaked instances.

@joperezr joperezr self-assigned this Apr 21, 2022
@krwq krwq added Priority:0 Work that we can't release without and removed untriaged labels Apr 28, 2022
@krwq
Copy link
Member

krwq commented Apr 28, 2022

[Triage] @joperezr ping 😄

@joperezr
Copy link
Member

joperezr commented May 2, 2022

I probably won't have time to work on this this week.

@krwq krwq added Priority:1 Work that is critical for the release, but we could probably ship without and removed Priority:0 Work that we can't release without labels Dec 15, 2022
@krwq
Copy link
Member

krwq commented Apr 11, 2024

[Triage] we need to verify if this is an external issue and based on findings decide if we want to add any mitigations on our side while we wait on the fix. We also should verify if this is still an issue at all as they could've fixed it already.

@rtoijala
Copy link
Author

Based on my understanding of the libgpiod v1 documentation, I believe this is not a bug in libgpiod. gpiod_chip_close is documented as Close a GPIO chip handle and release all allocated resources.. "All allocated resources" implies that also the line objects are released. That is also what the implementation does, see https://github.com/brgl/libgpiod/blob/12bd0088e0f4fcba00259e3991eeb7c83d1e7823/lib/core.c#L249.

This means that System.Device.Gpio should not try to release any line after its corresponding chip has been closed, since they will have been already freed.

Please note that I no longer have the hardware for reproducing this issue, and I also have not looked at the System.Device.Gpio code since I created this bug report. I have no idea whether the crash occurs anymore.

@pgrawehr
Copy link
Contributor

@rtoijala Thanks for finding this reference, that was exactly what I was assuming but was unable to find a confirmation for. That likely means my proposed fix was actually right. I need to check that and update it once.

@pgrawehr
Copy link
Contributor

Not sure where the "official" documentation is, but https://www.lane-fu.com/linuxmirror/libgpiod/doc/html/group______chips____.html seems pretty much to be the reference.

@github-actions github-actions bot locked and limited conversation to collaborators Jul 14, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working Priority:1 Work that is critical for the release, but we could probably ship without Status: In PR
Projects
None yet
4 participants