Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-115999: Specialize LOAD_SUPER_ATTR in free-threaded builds #127128

Merged
merged 6 commits into from
Dec 3, 2024

Conversation

nascheme
Copy link
Member

@nascheme nascheme commented Nov 22, 2024

Use existing helpers to atomically modify the bytecode. Fix thread safety issue with cell_set_contents(). Add unit tests to ensure specializing is happening as expected.

Note: this needs gh-127272 to be merged as well in order to avoid thread safety issues.

Use existing helpers to atomically modify the bytecode.  Add unit tests
to ensure specializing is happening as expected.  Add test_specialize.py
that can be used with ThreadSanitizer to detect data races.
@nascheme nascheme marked this pull request as ready for review November 25, 2024 21:35
Copy link
Contributor

@mpage mpage left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty good to me, but is a significant regression on the free-threaded builds. I think the regression is coming from the critical section that is acquired in BEGIN_TYPE_LOCK before calling super_lookup_lock_held. It looks like some of the functions called by super_lookup_lock_held already acquire the type lock when necessary (e.g. _super_lookup_descr); we are forced to release and reacquire the critical section each time these functions are called. Do we need to hold the type lock before calling super_lookup_lock_held?

Edit: For completeness, this looks perf neutral on default builds.

@nascheme
Copy link
Member Author

It looks like some of the functions called by super_lookup_lock_held already acquire the type lock when necessary (e.g. _super_lookup_descr

I see. In that case, it seems do_super_lookup() is also safe. The other function called by _PySuper_Lookup() is supercheck(). I think that might be safe as well (the super_descr_get() slot function uses it without a critical section). If so, then I would conclude _PySuper_Lookup() doesn't need any changes.

It seems `do_super_lookup()` and `supercheck()` are safe to call and so
`_PySuper_Lookup()` doesn't need a lock for the type object.  This
avoids a performance regression in FT builds.
@nascheme
Copy link
Member Author

This is the worst regression on the benchmark suite:

richards_super 103 ms 1.82 sec: 17.69x slower

A new benchmark run is in-progress and it looks like that regression is gone (which would be expected since the code hasn't changed).

@mpage
Copy link
Contributor

mpage commented Nov 27, 2024

New benchmark results look good. Performance is neutral / slightly improved overall; the benchmarks that we expect this to help (e.g. richards_super) are clearly improved.

I think you need to fix the merge conflicts before CI will run tests.

Copy link
Contributor

@mpage mpage left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@nascheme nascheme merged commit 0cb5222 into python:main Dec 3, 2024
58 checks passed
@nascheme nascheme deleted the gh-115999-load-super-attr branch December 3, 2024 17:32
srinivasreddy pushed a commit to srinivasreddy/cpython that referenced this pull request Jan 8, 2025
…pythongh-127128)

Use existing helpers to atomically modify the bytecode.  Add unit tests
to ensure specializing is happening as expected.  Add test_specialize.py
that can be used with ThreadSanitizer to detect data races.  
Fix thread safety issue with cell_set_contents().
ebonnal pushed a commit to ebonnal/cpython that referenced this pull request Jan 12, 2025
…pythongh-127128)

Use existing helpers to atomically modify the bytecode.  Add unit tests
to ensure specializing is happening as expected.  Add test_specialize.py
that can be used with ThreadSanitizer to detect data races.  
Fix thread safety issue with cell_set_contents().
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants