Garbage Collector changes #2434
With this PR, the RefC garbage collector behaves more like Chez's. Before, a pointer and a garbage-collected pointer were two different objects, which made the following code produce a segmentation fault:

```
1 do
2   ptr <- getExternalPointer
3   gcptr <- onCollectAny ptr free
4   putStrLn $ show ptr
5   pure ()
```

It crashes because after the third line the references to gcptr are removed, which invokes the freeing function. Line 4 therefore points to freed memory.

Now, each pointer already has a reference to the garbage-collecting function, and a call to onCollect/onCollectAny simply attaches the GC closure to it; no new object is created. Thus, the freeing function will only be called once all references are removed.
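The ownership change described above can be sketched in C. This is only an illustration under stated assumptions, not the RefC runtime's actual code: the names `Ptr`, `ptr_new`, `ptr_on_collect`, `ptr_release` and `counting_free` are hypothetical, and a plain reference count stands in for whatever bookkeeping the backend really uses. The point it shows is that attaching a finalizer to the existing object, rather than creating a second one, keeps the finalizer from running while the original pointer is still referenced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pointer object: one reference count, one finalizer slot. */
typedef struct Ptr {
    void *raw;                  /* the wrapped external pointer */
    int refcount;               /* number of live references */
    void (*on_collect)(void *); /* finalizer, NULL until one is attached */
} Ptr;

/* Counts finalizer runs so the behaviour of the sketch can be checked. */
static int finalizer_calls = 0;

static void counting_free(void *raw) {
    finalizer_calls++;
    free(raw);
}

/* Wrap a raw pointer; the new object starts with one reference. */
Ptr *ptr_new(void *raw) {
    Ptr *p = malloc(sizeof *p);
    assert(p != NULL); /* sketch only: no real allocation-failure handling */
    p->raw = raw;
    p->refcount = 1;
    p->on_collect = NULL;
    return p;
}

/* Attach a finalizer to the SAME object and hand out one more reference. */
Ptr *ptr_on_collect(Ptr *p, void (*finalizer)(void *)) {
    assert(p != NULL);
    p->on_collect = finalizer;
    p->refcount++;
    return p;
}

/* Drop one reference; returns 1 if this call collected the object. */
int ptr_release(Ptr *p) {
    assert(p != NULL && p->refcount > 0);
    if (--p->refcount > 0) {
        return 0;
    }
    if (p->on_collect != NULL) {
        p->on_collect(p->raw);
    }
    free(p);
    return 1;
}

/* Mirrors the snippet: gcptr is dropped first, ptr is used afterwards. */
int demo_single_finalizer(void) {
    finalizer_calls = 0;
    void *raw = malloc(16);
    assert(raw != NULL);
    Ptr *ptr = ptr_new(raw);
    Ptr *gcptr = ptr_on_collect(ptr, counting_free);

    int collected_early = ptr_release(gcptr); /* gcptr goes out of scope */
    int calls_while_live = finalizer_calls;   /* ptr->raw is still valid here */
    int collected_late = ptr_release(ptr);    /* last reference dropped */

    return collected_early == 0 && calls_while_live == 0 &&
           collected_late == 1 && finalizer_calls == 1;
}
```

In this sketch `demo_single_finalizer()` returns 1: releasing `gcptr` does not run the finalizer, and it runs exactly once when the last reference goes away.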
A couple of general comments:
- Could you please document what has changed in the `CHANGELOG.md`? This seems like it is a fairly big change and so should be documented.
- Would it be possible to add a test case so that this doesn't accidentally get reintroduced?
Since @madman-bob wrote most of the RefC code, I'll let him comment on the code changes if he has the time. (If you're reading this bob, it's just the first commit that needs checking, the rest are linter fluff.)
Speaking of: the linter unhappiness is really odd! Seems it was perfectly happy with the pre-pr refc code, so I don't know why it's suddenly upset after you changed it... (the massive number of line changes initially had me scared to look through this 😅)
The linter was probably activated after the RefC contribution and it'll only complain about |
Probably, once the PR is ready, the first commit should be merged as a separate (not squashed) commit, so that the commit history can later be analysed for what was changed (e.g. if something goes wrong on some unusual architecture).
The proposed change would cause an error when we try to create two

I would argue that your snippet is incorrect, or at least undefined behaviour. As

With regards to the linter - I recall that we disabled the C linter last year when I was doing some major RefC changes, for the same reason as here. If we are to keep the C linter, we should have a formatting commit for all
I'm not sure I follow your argument, can you elaborate on that? With the PR, the behavior would not be undefined as the freeing function used by the last

I argue instead that the current design was simply based on bad design decisions I made when I wrote the RefC backend, so this is a PR to account for that. The right approach to prevent malicious usage of the GC would be to change the signature of

I'm happy to add test cases and copy the commit message to the Changelog.
test |
I also took the liberty of reformatting all .c and .h files in the |
In the following example,

    doubleFree = do
      ptr <- getExternalPointer
      gcptr1 <- onCollectAny ptr free1
      gcptr2 <- onCollectAny ptr free2
      putStrLn $ show ptr

I would expect this example to be valid, and both

It is correct that the current implementation causes your example to segfault. That is a flaw in your example, rather than the implementation of

    directFree = do
      ptr <- getExternalPointer
      free ptr
      putStrLn $ show ptr

The issue is the abstraction of "pointer"s, rather than the implementation of them. If you are using pointers, you acknowledge the possibility of segfaults. I would expect a corrected version of your example to look something like this:

    useGCPtr = do
      gcptr <- onCollectAny !getExternalPointer free
      putStrLn $ show gcptr.ptr
Thank you for confirming that the current version would lead to a segmentation fault. Together with the fact that it behaves differently from the other backends, this shows that this PR is a step forward. Taking it two steps forward, and having the garbage collector invoke both functions, free1 and free2, is a different discussion.

I didn't understand your

On the other hand, here is my actual use case where the necessity for this PR became clear: There are C libraries out there that provide data storage in various nested structs, along with their own alloc and free functions (I'm looking at you, protobuf). The user is expected to haggle their way through the structs to CRUD, but alloc and free are managed by the library for the top-level struct only. Thus, an Idris system would have a garbage-collected pointer to the top struct, whose freeing function is the provided protobuf_object_free(..) function. All underlying structs would need a raw (non-GC) pointer only. Given this situation, how does
The purpose of the Further, before this PR, the As for your use case, the Your concern about immediately collected |
I tried many ways using a

Anyways, I won't pursue this PR and the garbage collector change any further. I see that my changes break the behavior of multiple freeing functions, but I need the new GC behavior (as well as further RefC backend changes I implemented), so it seems clear to me that the best way forward is a separate and separately maintained backend.
It should be possible to get such a |
I don't think it is. As I mentioned, there are further changes I implemented that I care about but that might be critically received. As an example, I care about creating 2 more backends for building shared and static libraries using the recently introduced %export tag. A potential problem is the global IORef variable. Assume several threads of an external program all want to invoke their own Idris code (this extra-Idris multithreading doesn't affect the GC). So I put a pthread mutex around that, which restricts the new backends to Linux/macOS systems. The new issue #2455 confirms my assumption that it's better to have something separate, so I can fix the memory problems without breaking current functionality for other people.
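For readers unfamiliar with the pattern mentioned here, a minimal sketch of guarding one piece of global state with a pthread mutex might look as follows. It is an assumption-laden illustration, not code from the proposed backends: `global_state`, `global_lock`, `worker` and `run_workers` are hypothetical names, and a plain counter stands in for the global IORef variable.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define THREADS 4
#define UPDATES_PER_THREAD 10000

/* Hypothetical stand-in for the shared global variable and its guard. */
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;
static long global_state = 0;

/* Each external thread takes the lock around every access to the state. */
static void *worker(void *unused) {
    (void)unused;
    for (int i = 0; i < UPDATES_PER_THREAD; i++) {
        int rc = pthread_mutex_lock(&global_lock);
        assert(rc == 0);
        global_state++;
        rc = pthread_mutex_unlock(&global_lock);
        assert(rc == 0);
    }
    return NULL;
}

/* Start the workers, wait for all of them, and return the final state. */
long run_workers(void) {
    pthread_t threads[THREADS];
    global_state = 0;
    for (int i = 0; i < THREADS; i++) {
        int rc = pthread_create(&threads[i], NULL, worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < THREADS; i++) {
        int rc = pthread_join(threads[i], NULL);
        assert(rc == 0);
    }
    return global_state;
}
```

Because every increment happens under the lock, `run_workers()` returns `THREADS * UPDATES_PER_THREAD`. On most toolchains this needs to be compiled with `-pthread`, which is also where the Linux/macOS restriction mentioned above comes from.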
I don't understand this statement. Isn't the space leak reported in #2455 a bug that we would |
Yes, and I'm happy to contribute. But I don't know what would be the best way to address it, without breaking functionality for some. So I let others be the judge of that. I want to go ahead to get a working backend, even if it limits the functionality for others. |
In this case, maybe the interface to

    Deallocator : Maybe Type -> Type
    Deallocator Nothing = (AnyPtr -> IO ())
    Deallocator (Just t) = (Ptr t -> IO ())
    onCollect : HasIO io => Ptr t -> (List (Deallocator (Just t)) -> List (Deallocator (Just t))) -> io ()

So that we don't have duplicated
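On the runtime side, the interface proposed above implies that a pointer carries a list of deallocators instead of a single one. A rough C sketch of that idea, under clearly stated assumptions, is given here: `MultiPtr`, `multi_ptr_new`, `multi_ptr_on_collect` and `multi_ptr_release` are hypothetical names, the list is a fixed-size array, and the sketch only supports appending, which is a simplification of the list-transforming function in the proposed signature. The deallocators `free1` and `free2` only record that they ran, so they do not both free the same memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_DEALLOCATORS 8

typedef void (*Deallocator)(void *raw);

/* Hypothetical pointer object carrying a list of deallocators. */
typedef struct MultiPtr {
    void *raw;
    int refcount;
    size_t deallocator_count;
    Deallocator deallocators[MAX_DEALLOCATORS];
} MultiPtr;

/* Call log, so the order in which deallocators run can be observed. */
static char call_log[MAX_DEALLOCATORS + 1];
static size_t call_log_len = 0;

static void free1(void *raw) { (void)raw; call_log[call_log_len++] = '1'; }
static void free2(void *raw) { (void)raw; call_log[call_log_len++] = '2'; }

MultiPtr *multi_ptr_new(void *raw) {
    MultiPtr *p = calloc(1, sizeof *p);
    assert(p != NULL); /* sketch only: no real allocation-failure handling */
    p->raw = raw;
    p->refcount = 1;
    return p;
}

/* Append a deallocator; returns 0 on success, -1 if the list is full. */
int multi_ptr_on_collect(MultiPtr *p, Deallocator d) {
    assert(p != NULL && d != NULL);
    if (p->deallocator_count == MAX_DEALLOCATORS) {
        return -1;
    }
    p->deallocators[p->deallocator_count++] = d;
    return 0;
}

/* Drop one reference; the last release runs every deallocator in order. */
void multi_ptr_release(MultiPtr *p) {
    assert(p != NULL && p->refcount > 0);
    if (--p->refcount > 0) {
        return;
    }
    for (size_t i = 0; i < p->deallocator_count; i++) {
        p->deallocators[i](p->raw);
    }
    free(p);
}

/* Registers free1 then free2 on one pointer and returns the call log. */
const char *demo_two_deallocators(void) {
    static int payload = 42; /* static storage, so nothing needs freeing */
    call_log_len = 0;
    MultiPtr *ptr = multi_ptr_new(&payload);
    multi_ptr_on_collect(ptr, free1);
    multi_ptr_on_collect(ptr, free2);
    multi_ptr_release(ptr);
    call_log[call_log_len] = '\0';
    return call_log;
}
```

Here `demo_two_deallocators()` returns the string `"12"`: both deallocators run exactly once, in registration order, when the last reference is released. Running them in registration order is a design choice of this sketch; the discussion does not settle on an order.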
Any progress on that? |
I think the conclusion last time was that the interface to GCPtrs should change slightly to reflect the behavior and invariants better, and more uniformly across the backends. I don't have a good feel as to whether that's best done in this PR or in a new PR that starts from scratch, changes the interface, and implements the new behaviour uniformly.
let's close this for now and reopen once the interface has been clarified. @ohad @vfrinken or @madman-bob can one of you write up an issue to keep track of this? |