-
-
Notifications
You must be signed in to change notification settings - Fork 31k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-108724: Add PyMutex and _PyParkingLot APIs #109344
Conversation
`PyMutex` is a one byte lock with fast, inlineable lock and unlock functions for the common uncontended case. The design is based on WebKit's `WTF::Lock`. PyMutex is built using the `_PyParkingLot` APIs, which provides a cross-platform futex-like API (based on WebKit's `WTF::ParkingLot`). This internal API will be used for building other synchronization primitives used to implement PEP 703, such as one-time initialization and events.
🤖 New build scheduled with the buildbot fleet by @colesbury for commit 3161e17 🤖 If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again. |
!buildbot ARM Raspbian |
🤖 New build scheduled with the buildbot fleet by @colesbury for commit f963fd8 🤖 The command will test the builders whose names match following regular expression: The builders matched are:
|
!buildbot wasm |
🤖 New build scheduled with the buildbot fleet by @colesbury for commit f963fd8 🤖 The command will test the builders whose names match following regular expression: The builders matched are:
|
The lock tests were being picked up twice: once in test_lock.py and once in Test_testinternalcapi from test_misc.py. The Test_testinternalcapi was not skipping tests when the platform doesn't have threads. This moves the tests to test_misc.py, doesn't include them in Test_testinternalcapi, and skips them if the platform doesn't support threads.
!buildbot wasm |
🤖 New build scheduled with the buildbot fleet by @colesbury for commit 779a401 🤖 The command will test the builders whose names match following regular expression: The builders matched are:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's the first portion of my review, to get you started. 😄
(FWIW, there should probably be a PEP before making a public API out of this, but that doesn't affect this PR.)
// the mutex is unlocked. If the current thread holds the GIL, then the GIL | ||
// will be released while the thread is parked. | ||
static inline void | ||
PyMutex_Lock(PyMutex *m) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was going to recommend using "acquire"/"release" instead of "lock"/"unlock", as the former is quite prevalent in this context (and, to my brain, "lock" evokes the object rather than the action). However, there are definitely examples of "lock"/"unlock".
Ultimately, it would be nice if the names were consistent everywhere, to avoid confusing readers. (It would also be nice if the naming were consistent in the industry and academia.) The closer we can get, the better. I'd rather this PR help us converge than diverge, but I haven't taken the time to decide which way the PR goes.😄 (My initial gut reaction was that it diverges.) Feedback on this from a couple of other core devs would be helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lock/unlock is much more common in libraries and programming languages:
- pthread_mutex_t (POSIX)
- std::mutex (C++)
- std::sync::mutex (Rust)
- java.util.concurrent.locks.Lock (Java)
- sync.Mutex (Go)
Windows tends to use "wait"/"release" terminology.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I'm fine with lock/unlock (vs. acquire/release) at this point.
Regardless, I don't see this as a blocker for the PR, since it isn't public API (yet--and the name would get more scrutiny during the PEP process).
Release the GIL if `detach` is `1` and there is a current thread state. It's the caller's responsibility to make sure that if there is a valid thread state (and `detach` is 1), then the current thread holds the GIL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your continued review. Here's a summary of the major changes:
- I removed the thread-local data from
parking_lot.c
. It wasn't strictly necessary and didn't actually improve performance. - I switched to statically initializing the
buckets
linked lists inparking_lot.c
, which avoids needing to check if it's initialized inenqueue
/dequeue
. - I moved the
_PySemaphore
struct definition topycore_semaphore.h
because it now includes platform-specific details and I'd like to avoid exposing those inpycore_parking_lot.h
(I expect more files to use the parking lot API than the_PySemaphore
API.) - I switched
_PyParkingLot_Unpark
to use a function pointer as a callback, like you asked about in the previous review. I think this ends up being a bit nicer, but let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's the last of my initial review. 😄
Thanks again for doing the work (and for putting up with my many comments)!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ericsnowcurrently. I've updated the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, thanks for your work on this and for addressing all my concerns. |
I'm going to wait until tomorrow, to accommodate timezones. 😄 |
Thanks again, @colesbury! |
|
|
@colesbury ^^^ |
|
|
The AMD64 Ubuntu Shared 3.x failure is due to an unrelated flaky test. |
|
|
The WASM buildbot failures are fixed by gh-109583. |
PyMutex is a one byte lock with fast, inlineable lock and unlock functions for the common uncontended case. The design is based on WebKit's WTF::Lock. PyMutex is built using the _PyParkingLot APIs, which provides a cross-platform futex-like API (based on WebKit's WTF::ParkingLot). This internal API will be used for building other synchronization primitives used to implement PEP 703, such as one-time initialization and events. This also includes tests and a mini benchmark in Tools/lockbench/lockbench.py to compare with the existing PyThread_type_lock. Uncontended acquisition + release: * Linux (x86-64): PyMutex: 11 ns, PyThread_type_lock: 44 ns * macOS (arm64): PyMutex: 13 ns, PyThread_type_lock: 18 ns * Windows (x86-64): PyMutex: 13 ns, PyThread_type_lock: 38 ns PR Overview: The primary purpose of this PR is to implement PyMutex, but there are a number of support pieces (described below). * PyMutex: A 1-byte lock that doesn't require memory allocation to initialize and is generally faster than the existing PyThread_type_lock. The API is internal only for now. * _PyParking_Lot: A futex-like API based on the API of the same name in WebKit. Used to implement PyMutex. * _PyRawMutex: A word sized lock used to implement _PyParking_Lot. * PyEvent: A one time event. This was used a bunch in the "nogil" fork and is useful for testing the PyMutex implementation, so I've included it as part of the PR. * pycore_llist.h: Defines common operations on doubly-linked list. Not strictly necessary (could do the list operations manually), but they come up frequently in the "nogil" fork. ( Similar to https://man.freebsd.org/cgi/man.cgi?queue) --------- Co-authored-by: Eric Snow <[email protected]>
This change introduced new compiler warnings: see issue gh-110014. |
PyMutex
is a one byte lock with fast, inlineable lock and unlock functions for the common uncontended case. The design is based on WebKit'sWTF::Lock
.PyMutex is built using the
_PyParkingLot
APIs, which provides a cross-platform futex-like API (based on WebKit'sWTF::ParkingLot
). This internal API will be used for building other synchronization primitives used to implement PEP 703, such as one-time initialization and events.This also includes tests and a mini benchmark in
Tools/lockbench/lockbench.py
to compare with the existingPyThread_type_lock
.Uncontended acquisition + release:
Linux (x86-64): PyMutex: 11 ns, PyThread_type_lock: 44 ns
macOS (arm64): PyMutex: 13 ns, PyThread_type_lock: 18 ns
Windows (x86-64): PyMutex: 13 ns, PyThread_type_lock: 38 ns
PR Overview
The primary purpose of this PR is to implement
PyMutex
, but there are a number of support pieces (described below).PyMutex
: A 1-byte lock that doesn't require memory allocation to initialize and is generally faster than the existingPyThread_type_lock
. The API is internal only for now._PyParking_Lot
: A futex-like API based on the API of the same name in WebKit. Used to implementPyMutex
._PyRawMutex
: A word sized lock used to implement_PyParking_Lot
PyEvent
: a one time event. This was used a bunch in the "nogil" fork and is useful for testing thePyMutex
implementation, so I've included it as part of the PR.pycore_llist.h
: Defines common operations on doubly-linked list. Not strictly necessary (could do the list operations manually), but they come up frequently in the "nogil" fork. (Similar to https://man.freebsd.org/cgi/man.cgi?queue)Linked issue