Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test_audit_subinterpreter crashes with tracerefs enabled #125286

Closed
nascheme opened this issue Oct 10, 2024 · 2 comments
Closed

test_audit_subinterpreter crashes with tracerefs enabled #125286

nascheme opened this issue Oct 10, 2024 · 2 comments
Labels
topic-subinterpreters type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@nascheme
Copy link
Member

nascheme commented Oct 10, 2024

Crash report

What happened?

To run test: Programs/_testembed test_audit_subinterpreter.

The program crashes with an assert failure:

_testembed: ../Objects/object.c:196: _PyRefchain_Remove: Assertion `value == REFCHAIN_VALUE' failed.

Stack trace:

#5  0x00007fbd32045eb2 in __GI___assert_fail (
    assertion=assertion@entry=0x55f8c78ecded "value == REFCHAIN_VALUE", 
    file=file@entry=0x55f8c78ecd93 "../Objects/object.c", line=line@entry=196, 
    function=function@entry=0x55f8c78edba0 <__PRETTY_FUNCTION__.13> "_PyRefchain_Remove")
    at ./assert/assert.c:101
#6  0x000055f8c76c9999 in _PyRefchain_Remove (interp=<optimized out>, obj=obj@entry=0x7fbd3183d710)
    at ../Objects/object.c:196
#7  0x000055f8c76cb6c3 in _Py_ForgetReference (op=op@entry=0x7fbd3183d710) at ../Objects/object.c:2548
#8  0x000055f8c76ca91d in _Py_Dealloc (op=op@entry=0x7fbd3183d710) at ../Objects/object.c:2923
#9  0x000055f8c76b2ad4 in Py_DECREF (filename=filename@entry=0x55f8c78b4f72 "../Include/refcount.h", 
    lineno=lineno@entry=476, op=0x7fbd3183d710) at ../Include/refcount.h:367
#10 0x000055f8c76b2af3 in Py_XDECREF (op=<optimized out>) at ../Include/refcount.h:476
#11 0x000055f8c76b2c0f in dictkeys_decref (interp=interp@entry=0x7fbd31f6e020, 
    dk=dk@entry=0x55f8c9b29030, use_qsbr=use_qsbr@entry=false) at ../Objects/dictobject.c:459
#12 0x000055f8c76bca24 in dict_dealloc (self=0x7fbd31765070) at ../Objects/dictobject.c:3187
#13 0x000055f8c76ca923 in _Py_Dealloc (op=op@entry=0x7fbd31765070) at ../Objects/object.c:2925
#14 0x000055f8c76c67ab in Py_DECREF (
    filename=filename@entry=0x55f8c78ec613 "../Objects/moduleobject.c", lineno=lineno@entry=1118, 
    op=0x7fbd31765070) at ../Include/refcount.h:367
#15 0x000055f8c76c6bdb in module_clear (self=0x7fbd31764110) at ../Objects/moduleobject.c:1118
#16 0x000055f8c77e604a in delete_garbage (tstate=tstate@entry=0x7fbd31fa4d18, 
    gcstate=gcstate@entry=0x7fbd31f6fd08, collectable=collectable@entry=0x7ffd93cac500, 
    old=old@entry=0x7fbd31f6fd50) at ../Python/gc.c:1126
#17 0x000055f8c77e62f7 in gc_collect_region (tstate=tstate@entry=0x7fbd31fa4d18, 
    from=from@entry=0x7fbd31f6fd50, to=to@entry=0x7fbd31f6fd50, untrack=untrack@entry=3, 
    stats=stats@entry=0x7ffd93cac5e0) at ../Python/gc.c:1585
#18 0x000055f8c77e6790 in gc_collect_full (tstate=tstate@entry=0x7fbd31fa4d18, 
    stats=stats@entry=0x7ffd93cac5e0) at ../Python/gc.c:1498
#19 0x000055f8c77e7070 in _PyGC_Collect (tstate=0x7fbd31fa4d18, generation=generation@entry=2, 
    reason=reason@entry=_Py_GC_REASON_SHUTDOWN) at ../Python/gc.c:1862
#20 0x000055f8c77e70ce in _PyGC_CollectNoFail (tstate=tstate@entry=0x7fbd31fa4d18)
    at ../Python/gc.c:1903
#21 0x000055f8c7818408 in finalize_modules (tstate=tstate@entry=0x7fbd31fa4d18)
    at ../Python/pylifecycle.c:1716
#22 0x000055f8c781e9f4 in Py_EndInterpreter (tstate=tstate@entry=0x7fbd31fa4d18)
    at ../Python/pylifecycle.c:2409
#23 0x000055f8c781ec3a in finalize_subinterpreters () at ../Python/pylifecycle.c:2484
#24 0x000055f8c781edc3 in _Py_Finalize (runtime=runtime@entry=0x55f8c7b1e000 <_PyRuntime>)
    at ../Python/pylifecycle.c:2088
#25 0x000055f8c781ef34 in Py_Finalize () at ../Python/pylifecycle.c:2216
#26 0x000055f8c75e90ec in test_audit_subinterpreter () at ../Programs/_testembed.c:1431
#27 0x000055f8c75ece41 in main (argc=<optimized out>, argv=0x7ffd93cac888)
    at ../Programs/_testembed.c:2520

The object is a string that has been interned but not immortalized. Likely it is shared between sub-interpreters, using basic single-phase init and is therefore shared between multiple sub-interpreters.

(rr) p *op
$3 = {{ob_refcnt = 0, ob_refcnt_split = {0, 0}}, ob_type = 0x55f8c7b00c40 <PyUnicode_Type>}
(rr) p op
$4 = 'time_ns'

The value of _PyRefchain_IsTraced() is false since it is ref-traced in a different interpreter.

Quick and dirty fix for the crash:

--- a/Objects/object.c
+++ b/Objects/object.c
@@ -2537,6 +2537,13 @@ _Py_ForgetReference(PyObject *op)
 
     PyInterpreterState *interp = _PyInterpreterState_GET();
 
+    if (!_PyRefchain_IsTraced(interp, op) &&
+        PyUnicode_Check(op) &&
+        PyUnicode_CHECK_INTERNED(op) &&
+        interp != _PyInterpreterState_Main()) {
+        return;
+    }
+
 #ifdef SLOW_UNREF_CHECK
     if (!_PyRefchain_Get(interp, op)) {
         /* Not found */

CPython versions tested on:

CPython main branch

Operating systems tested on:

Linux

Output from running 'python -VV' on the command line:

No response

Linked PRs

@nascheme nascheme added the type-crash A hard crash of the interpreter, possibly with a core dump label Oct 10, 2024
@nascheme
Copy link
Member Author

This bug was introduced by gh-124865.

My analysis of the issue is that the tracerefs debugging logic gets confused when objects are shared between sub-interpreters. E.g. _PyRefchain_Trace() and _Py_ForgetReference() kind of implicitly assume that objects belong to a single interpreter (interp->object_state.refchain is what stores the trace and it's per interpreter). Objects can be shared when the reload_singlephase_extension() m_size == -1 case occurs. That makes the objects used by the extension become shared between all sub-interpreters importing that extension.

I'm not sure why interned strings are the only thing causing trace-refs failures. I would guess that other objects should also cause an issue since _Py_ForgetReference() will fail if they are traced in a different interpreter. It could be worth constructing a test case that triggers that problem (if possible).

@nascheme
Copy link
Member Author

Indeed, it does seem possible to crash with shared objects other than strings. You just have to initialize in the right way and then finalize in the right order.

https://github.com/nascheme/cpython/tree/test_single_phase_shared

nascheme added a commit to nascheme/cpython that referenced this issue Oct 11, 2024
nascheme added a commit to nascheme/cpython that referenced this issue Oct 11, 2024
nascheme added a commit to nascheme/cpython that referenced this issue Oct 11, 2024
ericsnowcurrently added a commit that referenced this issue Oct 23, 2024
They used to be shared, before 3.12.  Returning to sharing them resolves a failure on Py_TRACE_REFS builds.

Co-authored-by: Petr Viktorin <[email protected]>
encukou pushed a commit to miss-islington/cpython that referenced this issue Nov 1, 2024
…ers (pythongh-125709)

They used to be shared, before 3.12.  Returning to sharing them resolves a failure on Py_TRACE_REFS builds.
@github-project-automation github-project-automation bot moved this from Todo to Done in Subinterpreters Nov 11, 2024
encukou pushed a commit that referenced this issue Nov 12, 2024
…gh-124865) (gh-125709) (GH-125204)

* gh-116510: Fix a Crash Due to Shared Immortal Interned Strings (gh-124865)

Fix a crash caused by immortal interned strings being shared between
sub-interpreters that use basic single-phase init. In that case, the string
can be used by an interpreter that outlives the interpreter that created and
interned it. For interpreters that share obmalloc state, also share the
interned dict with the main interpreter.

This is an un-revert of gh-124646 that then addresses the Py_TRACE_REFS
failures identified by gh-124785.
(cherry picked from commit f2cb399)

Co-authored-by: Eric Snow <[email protected]>

* [3.13] gh-125286: Share the Main Refchain With Legacy Interpreters (gh-125709)

They used to be shared, before 3.12.  Returning to sharing them resolves a failure on Py_TRACE_REFS builds.

---------

Co-authored-by: Eric Snow <[email protected]>
encukou pushed a commit to miss-islington/cpython that referenced this issue Nov 13, 2024
…ers (pythongh-125709)

They used to be shared, before 3.12.  Returning to sharing them resolves a failure on Py_TRACE_REFS builds.
ebonnal pushed a commit to ebonnal/cpython that referenced this issue Jan 12, 2025
…thongh-125709)

They used to be shared, before 3.12.  Returning to sharing them resolves a failure on Py_TRACE_REFS builds.

Co-authored-by: Petr Viktorin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic-subinterpreters type-crash A hard crash of the interpreter, possibly with a core dump
Projects
Status: Done
Development

No branches or pull requests

3 participants