-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
py::keep_alive: nurse destructor can run before patient's #856
Comments
Or maybe it would work for the callback function to call the C++ destructor if it hasn't already been called, and deal with that gracefully when the object is cleared - but I don't know nearly enough about the internals to know if that's sane or not. |
I suspect what's happening here is that both A and B are being destroyed at the same time ("same" being within a single garbage collection). Essentially keep_alive is a bit weaker than you're expecting: it really only guarantees that A won't be admitted for garbage collection before B, but doesn't guarantee that they can't be admitted together (and as far as I know Python doesn't provide any guarantees on the order of destruction of objects within a garbage collection pass). If they are admitted together, you can get out-of-order destruction, as you're seeing. I'm not sure that there's a simple workaround that could be done to deal with this on the pybind11 side (especially since neither object actually has to be pybind-managed). You ought to be able to work around it by storing by storing a |
Not exactly. B isn't identified as garbage in the garbage collection pass, because its reference is held by pybind11. It's really about the order of clearing A vs calling the weakref callbacks on A's weakref, which isn't guaranteed. If A is cleared first, everything goes fine. If the weakref callback is called first, then it drops the last reference to B, immediately causing B to be cleared.
I hadn't considered that the objects aren't necessarily pybind-managed, but on the other hand if the nurse isn't a C++ object then I don't think this is an issue. I have some ideas for fixing it if it is pybind-managed, which I can experiment with. Is there a way to determine whether a given py::handle is pybind-managed?
I've already done exactly that, but I'm in the fortunate position of being able to modify the C++ API I'm wrapping. |
from keepalive import A, B
import gc
gc.collect()
gc.set_debug(gc.DEBUG_STATS | gc.DEBUG_COLLECTABLE)
lst = [A(B())]
lst.append(lst)
del lst
gc.collect()
gc.set_debug(0) output:
A fix would be nice, but I still suspect it will end up getting rather complicated.
|
Yes. I think we're in agreement and just thinking about the same thing in different ways.
Thanks, I'll see what I can cook up and how hairy it gets. Failing that, we should at least put a warning in the user manual. |
Agreed. I'll wait for you to see if you can come up with something else, but if not, a note in the docs is definitely worthwhile. |
This is a first attempt to address pybind#856. There is still a bit of work to do (e.g., some tests, and double-checking that I'm not leaking references). At this point I'm looking for feedback on whether the idea is sound and would be accepted. Instead of the weakref trick, every pybind-managed object holds a (Python) list of references, which gets cleared in `clear_instance`. The list is only constructed the first time it is needed. The one downside I can see is that every pybind object will get bigger by `sizeof(PyObject *)`, even if no keep_alive is used. On the other hand, it should significantly reduce the overhead of using keep_alive, since there is no need to construct a weakref object and a callback object. One thing I'm not sure about is whether the call to `Py_CLEAR(instance->patients);` belongs inside or outside the `has_value` test. At the moment it's inside, but that's cargo culting rather than from an understanding of the conditions under which an instance might not have a value.
This is a first attempt to address pybind#856. There is still a bit of work to do (e.g., some tests, and double-checking that I'm not leaking references). At this point I'm looking for feedback on whether the idea is sound and would be accepted. Instead of the weakref trick, every pybind-managed object holds a (Python) list of references, which gets cleared in `clear_instance`. The list is only constructed the first time it is needed. The one downside I can see is that every pybind object will get bigger by `sizeof(PyObject *)`, even if no keep_alive is used. On the other hand, it should significantly reduce the overhead of using keep_alive, since there is no need to construct a weakref object and a callback object. One thing I'm not sure about is whether the call to `Py_CLEAR(instance->patients);` belongs inside or outside the `has_value` test. At the moment it's inside, but that's cargo culting rather than from an understanding of the conditions under which an instance might not have a value.
This fixes pybind#856. Instead of the weakref trick, the internals structure holds an unordered_map from PyObject* to a vector of references. To avoid the cost of the unordered_map lookup for objects that don't have any keep_alive patients, a flag is added to each instance to indicate whether there is anything to do.
This fixes pybind#856. Instead of the weakref trick, the internals structure holds an unordered_map from PyObject* to a vector of references. To avoid the cost of the unordered_map lookup for objects that don't have any keep_alive patients, a flag is added to each instance to indicate whether there is anything to do.
Issue description
The py::keep_alive weakref trick doesn't always work. In some cases (at least in Python 2.7, haven't tried 3), the weakref callback can run before the referenced object is cleared. This happens in the cyclic garbage collector: first all weakref callbacks are handled (
handle_weakrefs
function), then the garbage is cleared. If the nurse is being destroyed on this path, then it can lead to the patient destructor running before the nurse's (leading to undefined behaviour if the destructor of the nurse references the patient). Note that the nurse doesn't itself have to be part of a reference cycle, it just has to be reachable from one. However, it needs to havePy_TPFLAGS_HAVE_GC
. In the example code I've ensured that by specifyingpy::dynamic_attr
, but I think when I originally encountered the problem it was set due to having a declared base.I've had similar issues with Boost.Python before. I suspect the weakref approach is fundamentally untenable; it may be necessary to store a list of handles (with strong references) in the PyObject and decref them when the object is cleared.
Reproducible example code
keepalive.cpp:
test.py:
Output is
Expected output is the other way around.
The text was updated successfully, but these errors were encountered: