
Avoid spinlocks #916

Closed
programmerjake opened this issue May 8, 2020 · 16 comments

@programmerjake
Contributor

programmerjake commented May 8, 2020

See https://matklad.github.io/2020/01/02/spinlocks-considered-harmful.html for some reasoning.

TL;DR: spinlocks cause issues because the OS doesn't know when a thread is blocked, which can lead to awful scheduling decisions and big time delays. Spinlocks are basically only appropriate for embedded or OS kernel software, and even then you have to be really careful.

See also: #899

@davidhewitt
Member

Thanks for that, that's a good article.

I think we should probably return to parking_lot::Mutex in this case. @m-ou-se, as you contributed the PR which removed the parking_lot dependency, I just wanted to check that this won't cause you big problems if I put the dependency back?

@m-ou-se
Contributor

m-ou-se commented May 9, 2020

This crate did originally (2017-2020) use a spinlock, via the spin crate. The only reason a 'real' mutex (the one from parking_lot) was used starting four months ago was that the spin crate was no longer maintained. See #734.

@programmerjake Are you actually experiencing any problems with this?

Python isn't really usable with multiple threads anyway, because of the global interpreter lock. So yes, you can send PyObjects to other threads and have them dropped there, but that seems like an unusual use case. Under normal circumstances, this lock is only used from one thread.

You could simply replace spin_loop_hint() inside the compare_and_swap loop by a call to std::thread::yield_now() if you're worried about the OS not being involved. But note that a regular Mutex also spins for a while before it even tries yielding or blocking through the OS, because that's what's usually most efficient.
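
For concreteness, here is a minimal sketch of that kind of acquire loop with yielding; the LOCKED flag and the lock/unlock functions are illustrative only, not the actual pyo3::gil::Lock internals:

    use std::sync::atomic::{AtomicBool, Ordering};

    static LOCKED: AtomicBool = AtomicBool::new(false);

    fn lock() {
        // Try to flip the flag from false to true; on contention, yield to
        // the OS scheduler instead of pure busy-waiting.
        while LOCKED
            .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            std::thread::yield_now();
        }
    }

    fn unlock() {
        LOCKED.store(false, Ordering::Release);
    }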

@kngwyu
Member

kngwyu commented May 12, 2020

I also don't think this is so problematic since spin_loop_hint is rarely called.

@IvanKuzavkov

Hello, today we moved our project to the latest version 0.10, and it seems the spinlock deadlock is not as rare as you might expect: we hit it straight after updating, with the Python process consuming 100% of CPU. I've attached a sample of the process; removing the spinlock from the project and returning to parking_lot::Mutex eliminates this deadlock in our tests.

2185 pyo3::gil::GILPool::new::h000de5dc7faabf11  (in api.cpython-37m-darwin.so) + 37  [0x1068352f5]  gil.rs:268
2185 pyo3::gil::ReferencePool::update_counts::h939079ca1c222ef9  (in api.cpython-37m-darwin.so) + 471  [0x106ac6e57]  gil.rs:236
    2185 pyo3::ffi::object::Py_DECREF::h72d8b7dcba474470  (in api.cpython-37m-darwin.so) + 62  [0x106ab6f2e]  object.rs:931
    2185 subtype_dealloc  (in Python) + 948  [0x1007cadb4]
        2185 tupledealloc  (in Python) + 140  [0x1007c6d7c]
        2185 pyo3::pyclass::initialize_type_object::tp_dealloc_callback::hcdf589e42e67540a  (in api.cpython-37m-darwin.so) + 21  [0x10681b605]  pyclass.rs:116
            2185 pyo3::gil::GILPool::new::h000de5dc7faabf11  (in api.cpython-37m-darwin.so) + 37  [0x1068352f5]  gil.rs:268
            2185 pyo3::gil::ReferencePool::update_counts::h939079ca1c222ef9  (in api.cpython-37m-darwin.so) + 39  [0x106ac6ca7]  gil.rs:217
                1147 pyo3::gil::Lock::new::he4d1580ddcf3440e  (in api.cpython-37m-darwin.so) + 72  [0x106ac6b28]  gil.rs:183
                ! 1131 core::sync::atomic::spin_loop_hint::hb89cd0f38c76969c  (in api.cpython-37m-darwin.so) + 9  [0x106aaf909]  atomic.rs:149
                ! : 1106 core::hint::spin_loop::h569ab2c6e2269057  (in api.cpython-37m-darwin.so) + 9  [0x106ac5479]  hint.rs:76
                ! : | 1104 core::core_arch::x86::sse2::_mm_pause::h16d89359bf9d696b  (in api.cpython-37m-darwin.so) + 6,7  [0x106ac5c96,0x106ac5c97]  sse2.rs:26
                ! : | 2 core::core_arch::x86::sse2::_mm_pause::h16d89359bf9d696b  (in api.cpython-37m-darwin.so) + 4  [0x106ac5c94]  sse2.rs:25
                ! : 11 core::hint::spin_loop::h569ab2c6e2269057  (in api.cpython-37m-darwin.so) + 9,10  [0x106ac5479,0x106ac547a]  hint.rs:94
                ! : 7 core::hint::spin_loop::h569ab2c6e2269057  (in api.cpython-37m-darwin.so) + 0  [0x106ac5470]  hint.rs:64
                ! : 7 core::hint::spin_loop::h569ab2c6e2269057  (in api.cpython-37m-darwin.so) + 4  [0x106ac5474]  hint.rs:76
                ! 12 core::sync::atomic::spin_loop_hint::hb89cd0f38c76969c  (in api.cpython-37m-darwin.so) + 0  [0x106aaf900]  atomic.rs:148
                ! 4 core::sync::atomic::spin_loop_hint::hb89cd0f38c76969c  (in api.cpython-37m-darwin.so) + 9  [0x106aaf909]  atomic.rs:150
                945 pyo3::gil::Lock::new::he4d1580ddcf3440e  (in api.cpython-37m-darwin.so) + 37  [0x106ac6b05]  gil.rs:182
                ! 580 core::sync::atomic::AtomicBool::compare_and_swap::h4b83ea0e0f291cea  (in api.cpython-37m-darwin.so) + 116  [0x106aaf604]  atomic.rs:508
                ! : 383 core::sync::atomic::AtomicBool::compare_exchange::h8b1e175e32d46cc9  (in api.cpython-37m-darwin.so) + 114  [0x106aaf6c2]  atomic.rs:567
                ! : | 203 core::sync::atomic::atomic_compare_exchange::h4c1267f2944208c2  (in api.cpython-37m-darwin.so) + 329,175,...  [0x106aafa59,0x106aaf9bf,...]  atomic.rs:2331
                ! : | 60 core::sync::atomic::atomic_compare_exchange::h4c1267f2944208c2  (in api.cpython-37m-darwin.so) + 79,75,...  [0x106aaf95f,0x106aaf95b,...]  atomic.rs:0
                ! : | 42 core::sync::atomic::atomic_compare_exchange::h4c1267f2944208c2  (in api.cpython-37m-darwin.so) + 666,29,...  [0x106aafbaa,0x106aaf92d,...]  atomic.rs:2330
                ! : | 35 core::sync::atomic::atomic_compare_exchange::h4c1267f2944208c2  (in api.cpython-37m-darwin.so) + 677,688,...  [0x106aafbb5,0x106aafbc0,...]  atomic.rs:2344
                ! : | 17 core::sync::atomic::atomic_compare_exchange::h4c1267f2944208c2  (in api.cpython-37m-darwin.so) + 713,710  [0x106aafbd9,0x106aafbd6]  atomic.rs:2345
                ! : | 16 core::sync::atomic::atomic_compare_exchange::h4c1267f2944208c2  (in api.cpython-37m-darwin.so) + 4,1,...  [0x106aaf914,0x106aaf911,...]  atomic.rs:2323
                ! : | 10 core::sync::atomic::atomic_compare_exchange::h4c1267f2944208c2  (in api.cpython-37m-darwin.so) + 315  [0x106aafa4b]  mod.rs:0
                ! : 62 core::sync::atomic::AtomicBool::compare_exchange::h8b1e175e32d46cc9  (in api.cpython-37m-darwin.so) + 74,106,...  [0x106aaf69a,0x106aaf6ba,...]  atomic.rs:567
                ! : 46 core::sync::atomic::AtomicBool::compare_exchange::h8b1e175e32d46cc9  (in api.cpython-37m-darwin.so) + 27,20,...  [0x106aaf66b,0x106aaf664,...]  atomic.rs:558
                ! : 43 core::sync::atomic::AtomicBool::compare_exchange::h8b1e175e32d46cc9  (in api.cpython-37m-darwin.so) + 149,146,...  [0x106aaf6e5,0x106aaf6e2,...]  atomic.rs:570
                ! : 23 core::sync::atomic::AtomicBool::compare_exchange::h8b1e175e32d46cc9  (in api.cpython-37m-darwin.so) + 53  [0x106aaf685]  atomic.rs:567
                ! : | 13 core::cell::UnsafeCell$LT$T$GT$::get::h13c66364639181ec  (in api.cpython-37m-darwin.so) + 5,0,...  [0x106ab1965,0x106ab1960,...]  cell.rs:1638
                ! : | 10 core::cell::UnsafeCell$LT$T$GT$::get::h13c66364639181ec  (in api.cpython-37m-darwin.so) + 12,16,...  [0x106ab196c,0x106ab1970,...]  cell.rs:1643
                ! : 16 core::sync::atomic::AtomicBool::compare_exchange::h8b1e175e32d46cc9  (in api.cpython-37m-darwin.so) + 183  [0x106aaf707]  atomic.rs:572
                ! : 6 core::sync::atomic::AtomicBool::compare_exchange::h8b1e175e32d46cc9  (in api.cpython-37m-darwin.so) + 131  [0x106aaf6d3]  atomic.rs:569
                ! : 1 core::sync::atomic::AtomicBool::compare_exchange::h8b1e175e32d46cc9  (in api.cpython-37m-darwin.so) + 153  [0x106aaf6e9]  atomic.rs:566
                ! 160 core::sync::atomic::AtomicBool::compare_and_swap::h4b83ea0e0f291cea  (in api.cpython-37m-darwin.so) + 56  [0x106aaf5c8]  atomic.rs:508
                ! : 117 core::sync::atomic::strongest_failure_ordering::h15e15cc96dffecda  (in api.cpython-37m-darwin.so) + 33,37,...  [0x106aafc21,0x106aafc25,...]  atomic.rs:0
                ! : 13 core::sync::atomic::strongest_failure_ordering::h15e15cc96dffecda  (in api.cpython-37m-darwin.so) + 68  [0x106aafc44]  atomic.rs:2256
                ! : 12 core::sync::atomic::strongest_failure_ordering::h15e15cc96dffecda  (in api.cpython-37m-darwin.so) + 8,0,...  [0x106aafc08,0x106aafc00,...]  atomic.rs:2251
                ! : 12 core::sync::atomic::strongest_failure_ordering::h15e15cc96dffecda  (in api.cpython-37m-darwin.so) + 16,12  [0x106aafc10,0x106aafc0c]  atomic.rs:2253
                ! : 6 core::sync::atomic::strongest_failure_ordering::h15e15cc96dffecda  (in api.cpython-37m-darwin.so) + 80  [0x106aafc50]  atomic.rs:2259
                ! 89 core::sync::atomic::AtomicBool::compare_and_swap::h4b83ea0e0f291cea  (in api.cpython-37m-darwin.so) + 65,44,...  [0x106aaf5d1,0x106aaf5bc,...]  atomic.rs:508
                ! 37 core::sync::atomic::AtomicBool::compare_and_swap::h4b83ea0e0f291cea  (in api.cpython-37m-darwin.so) + 140,146,...  [0x106aaf61c,0x106aaf622,...]  atomic.rs:510
                ! 37 core::sync::atomic::AtomicBool::compare_and_swap::h4b83ea0e0f291cea  (in api.cpython-37m-darwin.so) + 174,176,...  [0x106aaf63e,0x106aaf640,...]  atomic.rs:512
                ! 35 core::sync::atomic::AtomicBool::compare_and_swap::h4b83ea0e0f291cea  (in api.cpython-37m-darwin.so) + 20,8,...  [0x106aaf5a4,0x106aaf598,...]  atomic.rs:507
                ! 7 core::sync::atomic::AtomicBool::compare_and_swap::h4b83ea0e0f291cea  (in api.cpython-37m-darwin.so) + 133  [0x106aaf615]  atomic.rs:509
                60 pyo3::gil::Lock::new::he4d1580ddcf3440e  (in api.cpython-37m-darwin.so) + 43,22,...  [0x106ac6b0b,0x106ac6af6,...]  gil.rs:182
                13 pyo3::gil::Lock::new::he4d1580ddcf3440e  (in api.cpython-37m-darwin.so) + 67  [0x106ac6b23]  gil.rs:183
                11 pyo3::gil::Lock::new::he4d1580ddcf3440e  (in api.cpython-37m-darwin.so) + 16  [0x106ac6af0]  gil.rs:181
                9 pyo3::gil::Lock::new::he4d1580ddcf3440e  (in api.cpython-37m-darwin.so) + 40  [0x106ac6b08]  gil.rs:0

@kngwyu
Member

kngwyu commented May 13, 2020

@IvanKuzavkov
Thank you for reporting!
We'll revert the change and release 0.10.1 soon.

@IvanKuzavkov

@kngwyu thanks.

@kngwyu
Member

kngwyu commented May 13, 2020

Hmm, but a deadlock sounds weird... 🤔
@IvanKuzavkov
Could you please let us know what change you made to avoid the deadlock? Did you just replace Lock with parking_lot::Mutex?
Or did you revert #899 entirely?

@IvanKuzavkov

Here is the diff:
1.txt

@davidhewitt
Member

That diff doesn't preserve the same behaviour: the locks are held for a much shorter duration, and notably not even during the Py_DECREF, so it is not a safe solution. To be clear, I don't think that patch is the right fix.

I think the issue looks related to a PyObject drop occurring when pyo3 incorrectly thinks the GIL is not held. This would be the cause of a deadlock.

@IvanKuzavkov are you able to provide a minimal reproduction of the test which causes problems for you on 0.10.0, please?

@davidhewitt
Member

EDIT: Ahh, I think I see why this deadlock is occurring. The problem is this line being called during drop when the current thread is already in an update_counts step. The fix is probably to use TLS to check whether this thread is already holding the lock.
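
A rough, self-contained sketch of that TLS check; the names HOLDS_LOCK and with_pool_lock are hypothetical, not pyo3's actual code:

    use std::cell::Cell;
    use std::sync::atomic::{AtomicBool, Ordering};

    static LOCKED: AtomicBool = AtomicBool::new(false);

    thread_local! {
        // Hypothetical flag: true while this thread already holds the pool lock.
        static HOLDS_LOCK: Cell<bool> = Cell::new(false);
    }

    fn with_pool_lock(f: impl FnOnce()) {
        HOLDS_LOCK.with(|held| {
            if held.get() {
                // Re-entrant call (e.g. a drop triggered while the pool is
                // being drained): run the closure without re-acquiring the
                // lock, avoiding the self-deadlock seen in the sample above.
                f();
                return;
            }
            while LOCKED
                .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
                .is_err()
            {
                std::thread::yield_now();
            }
            held.set(true);
            f();
            held.set(false);
            LOCKED.store(false, Ordering::Release);
        });
    }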

I'm surprised no tests flagged this. I will write a patch fix and tests tonight (~8 hrs from now). If someone else wants this fixed sooner, please take it on.

@kngwyu
Member

kngwyu commented May 13, 2020

The fix is probably to use TLS to check whether this thread is already holding the lock.

How about using swap (= the original solution) or Vec::split_off? (EDITED: I originally suggested Vec::drain, but I'm sorry, drain would not work.)
E.g.,

        {
            let v = self.pointers_to_incref.lock().split_off(0);
            for ptr in v {
                unsafe { ffi::Py_INCREF(ptr.as_ptr()); }
            }
        }

@IvanKuzavkov

@davidhewitt hello, I think I can provide a good example, but I can also implement the right fix if necessary. Also, if using a spinlock requires an additional recursive-lock check, maybe rolling back to the mutex solution isn't a bad idea, to trade some performance for code stability?

@davidhewitt
Member

How about using swap (= original solution) or Vec::drain?

Yeah, a good fix would be to get the lock, swap the two vectors for empty ones (similar to how you use split_off for OWNED_OBJECTS), and then release the lock.
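
A minimal sketch of that swap-under-lock pattern, using mem::take (which does the same swap-for-empty as split_off(0)); the struct and field names are illustrative stand-ins, not the real ReferencePool, and the actual Py_DECREF call is left as a comment:

    use std::mem;
    use std::sync::Mutex;

    // Illustrative stand-in for the reference pool: pointers queued for a
    // later reference-count update once the GIL is held.
    struct ReferencePool {
        pointers_to_decref: Mutex<Vec<usize>>,
    }

    impl ReferencePool {
        fn update_counts(&self) {
            // Take the whole backlog while holding the lock, then release the
            // lock *before* touching any reference counts, so that a drop
            // which re-enters the pool cannot deadlock on it.
            let to_decref = mem::take(&mut *self.pointers_to_decref.lock().unwrap());
            for _ptr in to_decref {
                // unsafe { ffi::Py_DECREF(_ptr as *mut ffi::PyObject) };
            }
        }
    }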

Also if using spinlock requires additional recursive lock check, maybe rollback to mutex solution isn't a bad idea

It's not the spinlock's fault: you are holding the Mutex for much less time in the diff you provided. A Mutex implemented safely (held for the whole operation) would also hit the same deadlock.

@IvanKuzavkov

It's not the spinlock's fault: you are holding the Mutex for much less time in the diff you provided. A Mutex implemented safely (held for the whole operation) would also hit the same deadlock.

Yeah, you're right: holding a parking_lot::Mutex for the entire operation block caused a deadlock as well; using parking_lot::ReentrantMutex works well.
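
For anyone curious, a tiny sketch of the difference (assuming the parking_lot crate as a dependency; this only illustrates the documented lock behaviour, it is not pyo3 code):

    use parking_lot::ReentrantMutex;

    fn main() {
        let lock = ReentrantMutex::new(());

        let _outer = lock.lock();
        // Re-locking from the same thread is fine with ReentrantMutex; a plain
        // parking_lot::Mutex (or the spinlock) would deadlock right here, which
        // is what a Py_DECREF re-entering update_counts runs into.
        let _inner = lock.lock();
        println!("acquired the re-entrant lock twice on one thread");
    }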

@davidhewitt
Member

FWIW, after consideration in #924 we decided to use a parking_lot::Mutex instead of a spinlock, for a simpler solution and less unsafe code. (PyO3 has too much unsafe already 🤣)

@m-ou-se I appreciate that you may not be happy to see the additional dependency reintroduced. I'm willing to work on introducing a pyo3_ffi crate if that would help in your case.

@m-ou-se
Contributor

m-ou-se commented May 15, 2020

@davidhewitt No worries, looks like this is the best solution right now. :) Happy this problem was found and fixed so quickly.
