Parallel creation of `Mutex` with `initiallyOwned: true` can cause SIGSEGV on Ubuntu 19.04 #34271
Comments
SIGSEGV is converted into a managed exception only if it happens in managed code or in a couple of thin helpers that are executed on behalf of the managed code. Everywhere else, SIGSEGV represents a bug in the native runtime or in external shared libraries, so we fail fast instead. Converting it to a managed exception would be dangerous because we don't know what state the current thread was in. For example, if the SIGSEGV happened in code running under a lock, handling it would soon result in a deadlock.
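To illustrate the managed-code half of that distinction, here is a minimal sketch. It assumes that a null dereference in JIT-compiled code is the kind of fault the runtime turns into a `NullReferenceException`; whether the JIT relies on a hardware fault or an explicit check for any given access is an implementation detail. The names `ManagedFaultDemo`, `Node`, `ReadValue`, and `Probe` are made up for this example.

```csharp
using System;
using System.Runtime.CompilerServices;

static class ManagedFaultDemo
{
    sealed class Node { public int Value; }

    // Kept out of line so the null dereference happens in its own managed frame.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static int ReadValue(Node node) => node.Value; // faults when node is null

    // A fault raised in managed code surfaces as an ordinary, catchable exception.
    public static string Probe()
    {
        try
        {
            return ReadValue(null).ToString();
        }
        catch (NullReferenceException)
        {
            return "caught NullReferenceException";
        }
    }
}
```

A fault inside native code such as `libpthread.so` has no equivalent managed frame to unwind to, which is why the process in this issue is terminated rather than handed an exception.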
I can confirm it repros on my Ubuntu 16.04 machine too.
Thanks @jburger for the excellent, clear bug report.
Fix Unix named mutex crash during some race conditions

Below when I refer to "mutex" I'm referring to the underlying mutex object, not an instance of the `Mutex` class.

- When the last reference to a mutex is closed while the lock is held by some thread and a pthread mutex is used, an attempt was made to destroy the mutex, but that has undefined behavior
- There doesn't seem to be a way to behave exactly like on Windows for this corner case, where the mutex is destroyed when the last reference to it is released, regardless of which process has the mutex locked and which process releases the last reference to it (they could be two different processes), including in cases of abrupt shutdown
- For this corner case I settled on what seems like a decent solution that is compatible with older runtimes:
  - When a process releases its last reference to the mutex
    - If that mutex is locked by the same thread, the lock is abandoned and the process no longer references the mutex
    - If that mutex is locked by a different thread, the lifetime of the mutex is extended with an implicit ref. The implicit ref prevents this or other processes from attempting to destroy the mutex while it is locked. The implicit ref is removed in either of these cases:
      - The mutex gets another reference from within the same process
      - The thread that owns the lock exits and abandons the mutex, at which point that would be the last reference to the mutex and the process would not reference the mutex anymore
  - The implementation based on file locks is less restricted, but for consistency that implementation also follows the same behavior
- There was also a race between an exiting thread abandoning one of its locked named mutexes and another thread releasing the last reference to it, fixed by using the creation/deletion process lock to synchronize

Fix for #34271 in master
Closes #28449 - probably doesn't fix the issue, but trying to enable it to see if it continues to fail

The 3.1 port of this change carries the same description and ends with: Fixes dotnet/runtime#34271 in 3.1
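The lifetime rule in that description can be sketched as a small state model. This is only a toy, single-process illustration of the rule as worded above, not the runtime's PAL code: every name in it (`MutexLifetimeModel`, `CloseOutcome`, `ReleaseRef`, `OwnerThreadExited`) is hypothetical, and it ignores other processes, file locks, and the creation/deletion process lock.

```csharp
using System;

// Hypothetical outcomes of closing a reference or of the owning thread exiting.
enum CloseOutcome { StillReferenced, Released, AbandonedAndReleased, KeptAliveByImplicitRef }

// Toy model: tracks this process's references to one underlying mutex.
sealed class MutexLifetimeModel
{
    int refs;         // explicit references held by this process
    int? ownerThread; // id of the thread holding the lock, or null when unlocked

    public bool HasImplicitRef { get; private set; }
    public bool IsReferenced => refs > 0 || HasImplicitRef;

    public void AddRef()
    {
        refs++;
        HasImplicitRef = false; // a new reference from the same process removes the implicit one
    }

    public void Lock(int threadId)
    {
        if (refs == 0) throw new InvalidOperationException("no reference to lock through");
        if (ownerThread != null) throw new InvalidOperationException("already locked");
        ownerThread = threadId;
    }

    public CloseOutcome ReleaseRef(int callerThreadId)
    {
        if (refs == 0) throw new InvalidOperationException("no reference to release");
        refs--;
        if (refs > 0) return CloseOutcome.StillReferenced;
        if (ownerThread == null) return CloseOutcome.Released;
        if (ownerThread == callerThreadId)
        {
            ownerThread = null; // locked by the closing thread: the lock is abandoned
            return CloseOutcome.AbandonedAndReleased;
        }
        HasImplicitRef = true;  // locked by another thread: extend the lifetime
        return CloseOutcome.KeptAliveByImplicitRef;
    }

    public CloseOutcome OwnerThreadExited(int threadId)
    {
        if (ownerThread != threadId) throw new InvalidOperationException("thread does not own the lock");
        ownerThread = null;     // the exiting thread abandons the lock
        if (!HasImplicitRef) return CloseOutcome.StillReferenced;
        HasImplicitRef = false; // this was the last thing keeping the mutex referenced
        return CloseOutcome.AbandonedAndReleased;
    }
}
```

In this model, a mutex locked by thread 1 and then closed by thread 2 stays referenced through the implicit ref until thread 1 exits, which is the case that previously led to destroying a locked pthread mutex.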
@jburger and @pawelpabich, could you please share some more information about how this was showing up originally and how the mutex was used? I see from the linked issue OctopusDeploy/Issues#6287 that it was showing up while writing to logs, and the issue mentioned that a bug was fixed. I'm also curious whether anything was changed to work around the problem and whether you are still seeing the issue from time to time.
@kouvel we were using a named mutex for locking concurrent writes to files based on the filename.

    using System;
    using System.Collections.Generic;
    using System.Threading;

    public class NamedLocks
    {
        readonly Dictionary<string, RefCountedLock> refCountedLocks = new Dictionary<string, RefCountedLock>();

        // Blocks until the lock for `name` is held; dispose the result to release it.
        public IDisposable LockFor(string name)
        {
            RefCountedLock refCountedLock;
            lock (refCountedLocks)
            {
                if (!refCountedLocks.TryGetValue(name, out refCountedLock))
                {
                    refCountedLock = new RefCountedLock(name, refCountedLocks);
                    refCountedLocks[name] = refCountedLock;
                }
                // Count the reference while still holding the dictionary lock.
                refCountedLock.Acquire();
            }
            refCountedLock.Enter();
            return refCountedLock;
        }

        public int Count()
        {
            lock (refCountedLocks)
            {
                return refCountedLocks.Count;
            }
        }

        class RefCountedLock : IDisposable
        {
            readonly string name;
            readonly Dictionary<string, RefCountedLock> refCountedLocks;
            readonly ReaderWriterLockSlim @lock;
            int numberOfRefs;

            public RefCountedLock(string name, Dictionary<string, RefCountedLock> refCountedLocks)
            {
                this.name = name;
                this.refCountedLocks = refCountedLocks;
                @lock = new ReaderWriterLockSlim();
                numberOfRefs = 0;
            }

            // Must be called while holding the lock on refCountedLocks.
            public void Acquire()
            {
                numberOfRefs++;
            }

            public void Enter()
            {
                @lock.EnterWriteLock();
            }

            public void Dispose()
            {
                // Decide under the dictionary lock whether this was the last reference,
                // so the check below cannot race with another thread's Dispose.
                bool isLastRef;
                lock (refCountedLocks)
                {
                    numberOfRefs--;
                    isLastRef = numberOfRefs == 0;
                    if (isLastRef)
                    {
                        refCountedLocks.Remove(name);
                    }
                }
                @lock.ExitWriteLock();
                if (isLastRef)
                {
                    @lock.Dispose();
                }
            }
        }
    }
I see, thanks @johnsimons. Do you still need the ability to share the same lock across other processes, or is it just for synchronization within one process?
No, we only need it for the same process. It was just a convenience thing 😀
Fixed in 5.0 by #36268
Parallel creation of `System.Threading.Mutex` can fail when `initiallyOwned` is `true`. This causes a segmentation fault in `libpthread.so` which appears to be handled in `libcoreclr.so`, but no managed exceptions are thrown and the process fails; a `SIGABRT` is listed as the stop reason for thread #1.
Steps to reproduce
- Create a `netcoreapp3.1` console application with the following `Program.cs` code:
- `dotnet build -c Release`
- `export COMPlus_DbgEnableMiniDump=1`
- `./bin/Release/netcoreapp3.1/repro`
- Set `initiallyOwned: false` and observe that the code works OK
Expected behaviour
A managed exception is thrown when invalid `Mutex` access is attempted.
lldb output
When analyzing the resulting coredump in lldb, thread #1 shows something like this:
It seems like the coreclr SIGSEGV handler is being called, so I'm not sure why there is no managed exception.
OS Details
.NET details
Please let me know if there is any more information I can provide.