Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-125286: Share the Main Refchain With Legacy Interpreters #125709

Merged

Conversation

ericsnowcurrently
Copy link
Member

@ericsnowcurrently ericsnowcurrently commented Oct 18, 2024

They used to be shared, before 3.12. Returning to sharing them resolves a failure on Py_TRACE_REFS builds.

@nascheme
Copy link
Member

Rather than open-code the Py_RTFLAGS_USE_MAIN_OBMALLOC flag test in multiple places, I think a helper function is better, something like:

#ifdef Py_TRACE_REFS
// Return true if interpreter should have it's own refchain table.  Otherwise
// the refchain from the main interpreter is shared.
static inline bool
has_own_refchain(PyInterpreterState *interp)
{
    return (_Py_IsMainInterpreter(interp) ||
            !(interp->feature_flags & Py_RTFLAGS_USE_MAIN_OBMALLOC));
}
#endif

I was working on a similar patch but my approach was to copy the REFCHAIN pointer, e.g.

PyStatus
_PyObject_InitState(PyInterpreterState *interp)
{
#ifdef Py_TRACE_REFS
    if (has_own_refchain(interp)) {
        _Py_hashtable_allocator_t alloc = {
            // Don't use default PyMem_Malloc() and PyMem_Free() which
            // require the caller to hold the GIL.
            .malloc = PyMem_RawMalloc,
            .free = PyMem_RawFree,
        };
        REFCHAIN(interp) = _Py_hashtable_new_full(
            _Py_hashtable_hash_ptr, _Py_hashtable_compare_direct,
            NULL, NULL, &alloc);
        if (REFCHAIN(interp) == NULL) {
            return _PyStatus_NO_MEMORY();
        }
    }
    else {
        PyInterpreterState *main_interp = _PyInterpreterState_Main();
        REFCHAIN(interp) = main_interp->object_state.refchain;
    }

#endif
    return _PyStatus_OK();
}

void
_PyObject_FiniState(PyInterpreterState *interp)
{
#ifdef Py_TRACE_REFS
    if (has_own_refchain(interp)) {
        _Py_hashtable_destroy(REFCHAIN(interp));
    }
    REFCHAIN(interp) = NULL;
#endif
}

Is _Py_NormalizeImmortalReference still required now? I would think it can be removed. Overall I think this is a good approach.

@ericsnowcurrently
Copy link
Member Author

I think a helper function is better

+1 I'll do that.

my approach was to copy the REFCHAIN pointer

I'm unclear on how that's different from what I've done.

Is _Py_NormalizeImmortalReference still required now? I would think it can be removed.

I left it for the sake of the sanity check. Given the lingering pain in this area, especially related to immortal objects, I figured it might not hurt to preserve such checks.

Overall I think this is a good approach.

Thanks! Sorry if we ended up duplicating effort. I should have checked with you first.

@nascheme
Copy link
Member

my approach was to copy the REFCHAIN pointer

I'm unclear on how that's different from what I've done.

I think it's a bit less code and would be a bit faster, since you can just use REFCHAIN() everywhere aside from the init/fini function. Since TraceRefs is has huge overhead anyhow I think it doesn't matter so it's just a code style decision. Your way works and you could argue that having the condition everytime you do something with the tracerefs is more clear.

Is _Py_NormalizeImmortalReference still required now? I would think it can be removed.

I left it for the sake of the sanity check. Given the lingering pain in this area, especially related to immortal objects, I figured it might not hurt to preserve such checks.

Hmm, I think it's a bit confusing to leave it since it really shouldn't be doing anything useful. Or, is there a reason we think it might do something? The second part of the function shouldn't do anything. If Py_RTFLAGS_USE_MAIN_OBMALLOC is set then both interpreters are using the same refchain hash table and you are adding/removing to the same table.

Overall I think this is a good approach.

Thanks! Sorry if we ended up duplicating effort. I should have checked with you first.

Your code runs without crashing, which is a plus. ;-) I'm glad you made a PR.

@nascheme
Copy link
Member

Oh, I missed this part of your code change:

REFCHAIN(interp) = REFCHAIN(main_interp);

That explains why you're unclear how my code is different. But if you do this in the init, I don't think the check for Py_RTFLAGS_USE_MAIN_OBMALLOC is needed elsewhere, you can just use REFCHAIN(interp) when you need to add or remove objects to the hash table.

@ericsnowcurrently
Copy link
Member Author

I don't think the check for Py_RTFLAGS_USE_MAIN_OBMALLOC is needed elsewhere, you can just use REFCHAIN(interp) when you need to add or remove objects to the hash table.

Good point. I'll remove the superfluous calls.

@ericsnowcurrently
Copy link
Member Author

Also, you're right about _Py_NormalizeImmortalReference(). I'll drop it.

@ericsnowcurrently
Copy link
Member Author

I realized the problem with dropping all the extra has_own_refchain() sites. Here are the key steps of what we do currently:

  1. Py_NewInterpreterFromConfig(): calls _PyInterpreterState_New()
  2. _PyInterpreterState_New(): calls _PyObject_InitState()
  3. _PyObject_InitState(): sets the interpreter's refchain depending on interp->feature_flags
  4. _PyInterpreterState_New(): calls other initializers that create objects, which add to the refchain
  5. Py_NewInterpreterFromConfig(): sets interp->feature_flags

At step (3) we rely on the feature flags being set, but we don't set them until step (5).

Fixing this is a bit messy and may be too much to backport, so I'd like to merge this fix with the various has_own_refchain() calls in place (sort of), and then tackle the rest in a separate PR. It's not that simple, but you'll see what I mean.

@ericsnowcurrently
Copy link
Member Author

It wasn't as bad as I thought, so I believe I've got things in a good place here.

@ericsnowcurrently
Copy link
Member Author

!buildbot AMD64 Arch Linux TraceRefs

@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @ericsnowcurrently for commit 7ab086c 🤖

The command will test the builders whose names match following regular expression: AMD64 Arch Linux TraceRefs

The builders matched are:

  • AMD64 Arch Linux TraceRefs PR

@encukou
Copy link
Member

encukou commented Oct 22, 2024

I think this change should have a What's New entry, which in turn needs documentation for sys.getobjects: ericsnowcurrently#39

@ericsnowcurrently
Copy link
Member Author

I'm fine with that.

@ericsnowcurrently
Copy link
Member Author

!buildbot AMD64 Arch Linux TraceRefs

@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @ericsnowcurrently for commit 1360d86 🤖

The command will test the builders whose names match following regular expression: AMD64 Arch Linux TraceRefs

The builders matched are:

  • AMD64 Arch Linux TraceRefs PR

@encukou
Copy link
Member

encukou commented Oct 23, 2024

One more doc fix in a place where I couldn't add a GitHub suggestion. I hope you don't mind me pushing this directly to the PR.

The AMD64 Arch Linux TraceRefs PR buildbot run succeded: https://buildbot.python.org/#/builders/82/builds/1457

Copy link
Member

@encukou encukou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! Thank you @nascheme & @ericsnowcurrently!

@ericsnowcurrently ericsnowcurrently merged commit 6f26d49 into python:main Oct 23, 2024
40 checks passed
@ericsnowcurrently ericsnowcurrently deleted the legacy-shared-refchain branch October 23, 2024 16:10
@ericsnowcurrently
Copy link
Member Author

@nascheme, this should be backported to 3.13 (and 3.12), right?

@nascheme
Copy link
Member

For 3.12, this fix applies on top of GH-125205, which is not yet merged. For 3.13, GH-125204 is also not yet merged.

encukou pushed a commit to miss-islington/cpython that referenced this pull request Nov 1, 2024
…ers (pythongh-125709)

They used to be shared, before 3.12.  Returning to sharing them resolves a failure on Py_TRACE_REFS builds.
encukou pushed a commit that referenced this pull request Nov 12, 2024
…gh-124865) (gh-125709) (GH-125204)

* gh-116510: Fix a Crash Due to Shared Immortal Interned Strings (gh-124865)

Fix a crash caused by immortal interned strings being shared between
sub-interpreters that use basic single-phase init. In that case, the string
can be used by an interpreter that outlives the interpreter that created and
interned it. For interpreters that share obmalloc state, also share the
interned dict with the main interpreter.

This is an un-revert of gh-124646 that then addresses the Py_TRACE_REFS
failures identified by gh-124785.
(cherry picked from commit f2cb399)

Co-authored-by: Eric Snow <[email protected]>

* [3.13] gh-125286: Share the Main Refchain With Legacy Interpreters (gh-125709)

They used to be shared, before 3.12.  Returning to sharing them resolves a failure on Py_TRACE_REFS builds.

---------

Co-authored-by: Eric Snow <[email protected]>
encukou pushed a commit to miss-islington/cpython that referenced this pull request Nov 13, 2024
…ers (pythongh-125709)

They used to be shared, before 3.12.  Returning to sharing them resolves a failure on Py_TRACE_REFS builds.
ericsnowcurrently added a commit that referenced this pull request Dec 3, 2024
…gh-125205)

Fix a crash caused by immortal interned strings being shared between
sub-interpreters that use basic single-phase init. In that case, the string
can be used by an interpreter that outlives the interpreter that created and
interned it. For interpreters that share obmalloc state, also share the
interned dict with the main interpreter.

This is an un-revert of gh-124646 that then addresses the Py_TRACE_REFS
failures identified by gh-124785 (i.e. backporting gh-125709 too).

(cherry picked from commit f2cb399, AKA gh-124865)

Co-authored-by: Eric Snow <[email protected]>
ebonnal pushed a commit to ebonnal/cpython that referenced this pull request Jan 12, 2025
…thongh-125709)

They used to be shared, before 3.12.  Returning to sharing them resolves a failure on Py_TRACE_REFS builds.

Co-authored-by: Petr Viktorin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants