Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Break queue segment chain if no enumeration is in progress #48296

Closed
wants to merge 1 commit into from

Conversation

BrzVlad
Copy link
Member

@BrzVlad BrzVlad commented Feb 15, 2021

Mono doesn't precisely scan stack which means that, if a QueueSegment ref gets stuck in a stack slot over a long period of time, no future queue segments will be collected and the queue will leak. We break the segment chain when dequeuing the last element from a segment, if no enumerations are in progress. This and the starting of the enumeration are protected by the same lock, which means we cannot miss the start of an enumeration while we are preparing to null the link.

Mono doesn't precisely scan stack which means that, if a QueueSegment ref gets stuck in a stack slot over a long period of time, no future queue segments will be collected and the queue will leak. We break the segment chain when dequeuing the last element from a segment, if no enumerations are in progress. This and the starting of the enumeration are protected by the same lock, which means we cannot miss the start of an enumeration while we are preparing to null the link.
@ghost
Copy link

ghost commented Feb 15, 2021

Tagging subscribers to this area: @eiriktsarpalis
See info in area-owners.md if you want to be subscribed.

Issue Details

Mono doesn't precisely scan stack which means that, if a QueueSegment ref gets stuck in a stack slot over a long period of time, no future queue segments will be collected and the queue will leak. We break the segment chain when dequeuing the last element from a segment, if no enumerations are in progress. This and the starting of the enumeration are protected by the same lock, which means we cannot miss the start of an enumeration while we are preparing to null the link.

Author: BrzVlad
Assignees: -
Labels:

area-System.Collections

Milestone: -

@BrzVlad
Copy link
Member Author

BrzVlad commented Feb 15, 2021

@stephentoub This is a simple and conservative fix for mono/mono#19665. Even though I didn't reproduce an unbounded memory growth on the testcase from that issue, this change still reduces managed heap size down to a third (with mono). I expect this change to be enough. Without it, ConcurrentQueue (and implicitly the threadpool) are potentially leaky on mono

// in case one segment remains pinned in memory (which can happen with mono). We can't race
// with the start of an enumeration because it needs to first take the cross segement lock.
if (pendingEnumerations == 0)
head._nextSegment = null;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ugh. This looks racy with other dequeuers

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this will break dequeues. A dequeuer could have already grabbed the head and needs to be able to fully traverse all linked segments from there.

@stephentoub stephentoub added the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Feb 15, 2021
Base automatically changed from master to main March 1, 2021 09:07
@stephentoub
Copy link
Member

stephentoub commented Mar 8, 2021

eiriktsarpalis requested a review from stephentoub

The PR as it currently stands is broken:
#48296 (comment)
It's why at least some of the tests failed.

@BrzVlad
Copy link
Member Author

BrzVlad commented Mar 8, 2021

I think we would need some additional synchronization for mono if we want this not to leak.

@BrzVlad BrzVlad closed this Mar 8, 2021
@jkotas
Copy link
Member

jkotas commented Mar 8, 2021

How would synchronization help to make it less likely to leak with conservative stack scanning?

I would expect the usual fix to reduce severity of the leaks with conservative stack scanning is going to be setting the offending references to zero, ideally by the JIT.

@BrzVlad
Copy link
Member Author

BrzVlad commented Mar 9, 2021

@jkotas The potential leak in this case happens if a QueueSegment remains pinned for a long period of time, keeping alive all future segments that will belong to this queue. This PR was an easy attempt to break this queue, with little to no cost, but it races with other lock free dequeuers. I'm just saying that, if we want to safely break this list when the last element is dequeued, we would probably need more synchronisation to detect if another thread is currently dequeuing from the segment we are breaking off. Which I don't know how complicated is to do, but I definitely don't enjoy the idea and we would need to have our own concurrent queue for mono since it would slow down coreclr. You mean setting all stack slots that contain objrefs to null when the var dies, or at method end ? That would be expensive but sounds like a reasonable JIT configuration option for people that run into these types of rare issues.

@ghost ghost locked as resolved and limited conversation to collaborators Apr 8, 2021
@karelz karelz added this to the 6.0.0 milestone May 20, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Collections NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants