-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor handling of non-SOH generations #1688
Conversation
I am no longer able to add commits to the old PR. After the runtime repo went public, my clone lost the upstream relationship (this happens on GitHub when parent fork changes visibility). The PR form the old clone was working for a while, but not any more. It looks like the time to make a new PR. This is continuation of #235 |
Meant to close the other one #WontFix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here are the comments so far from where I stopped on #235. more to go; will pick up tomorrow.
src/coreclr/src/gc/gc.cpp
Outdated
size_t eph_gen_starts = (Align (min_obj_size)) * (max_generation + 1); | ||
|
||
// See comment in can_expand_into_p why we need this size. | ||
size_t eph_gen_starts = gc_heap::eph_gen_starts_size + Align (min_obj_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit - no need for gc_heap:: here #Resolved
src/coreclr/src/gc/gc.cpp
Outdated
@@ -20632,9 +20576,8 @@ size_t gc_heap::get_total_committed_size() | |||
return total_committed; | |||
} | |||
|
|||
size_t gc_heap::committed_size (bool loh_p, size_t* allocated) | |||
size_t gc_heap::committed_size (int gen_number, size_t* allocated) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are changing the meaning of this method which appears to return the committed size for a given generation but this does not work for gen0/1. if you want to make gen0/1 work you'd have to start counting from their gen start. but since we currently just use this for loh (I meant this to be for either the total soh or loh thus it took loh_p as an arg), I would suggest to just rename this to uoh_committed_size. #Resolved
src/coreclr/src/gc/gc.cpp
Outdated
loh_committed, loh_allocated, | ||
dd_fragmentation (dynamic_data_of (max_generation + 1)), | ||
size_t uoh_allocated = 0; | ||
size_t uoh_committed = committed_size (gen_number, &uoh_allocated); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this committed_size only gets the committed size for the given generation though so this should stay loh_committed, not uoh_committed (which would mean for all uoh generations). #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not necessarily loh
generation either. I will just call it allocated
and committed
. They are just numbers reported by the dprintf
In reply to: 367271669 [](ancestors = 367271669)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure they are not necessarily loh gen, but the point is for any uoh gen you will be returning the correct # for committed/allocated and that's not the case for gen0/1. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is now:
else // not a SOH generation
{
uint32_t memory_load = 0;
uint64_t available_physical = 0;
get_memory_info (&memory_load, &available_physical);
#ifdef TRACE_GC
if (heap_hard_limit)
{
size_t allocated = 0;
size_t committed = uoh_committed_size (gen_number, &allocated);
dprintf (1, ("GC#%Id h%d, GMI: UOH budget, UOH commit %Id (obj %Id, frag %Id), total commit: %Id (recorded: %Id)",
(size_t)settings.gc_index, heap_number,
committed, allocated,
dd_fragmentation (dynamic_data_of (gen_number)),
get_total_committed_size(), (current_total_committed - current_total_committed_bookkeeping)));
}
#endif //TRACE_GC
. . .
what should it be?
In reply to: 368124892 [](ancestors = 368124892)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure what you mean? you said "this is not necessarily loh generation either.", and I said that's true but it will return correct info for other uoh gen anyway. the updated code looks fine to me. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just meant the loh
should not be a part of the name either.
In reply to: 368133805 [](ancestors = 368133805)
src/coreclr/src/gc/gc.cpp
Outdated
gen_data->free_list_space_after = generation_free_list_space (large_object_generation); | ||
gen_data->free_obj_space_after = generation_free_obj_space (large_object_generation); | ||
gen_data->npinned_surv = out; | ||
for (int i = gen_number + 1; i < total_generation_count; i++) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit
for (int i = (gen_number + 1); i < total_generation_count; i++)
#Resolved
src/coreclr/src/gc/gc.cpp
Outdated
for (int i = max_generation; i < total_generation_count; i++) | ||
{ | ||
committed_mem += heap_segment_committed (seg) - heap_segment_mem (seg); | ||
seg = heap_segment_next (seg); | ||
} | ||
seg = generation_start_segment (generation_of (max_generation + 1)); | ||
while (seg) | ||
{ | ||
committed_mem += heap_segment_committed (seg) - heap_segment_mem (seg); | ||
seg = heap_segment_next (seg); | ||
heap_segment* seg = generation_start_segment (generation_of (i)); | ||
while (seg) | ||
{ | ||
committed_mem += heap_segment_committed (seg) - heap_segment_mem (seg); | ||
seg = heap_segment_next (seg); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this could actually just be replaced by a simple committed_size() call... my bad that I didn't notice this before #Resolved
src/coreclr/src/gc/gc.cpp
Outdated
@@ -33940,7 +33837,7 @@ void gc_heap::process_n_background_segments (heap_segment* seg, | |||
heap_segment* prev_seg, | |||
generation* gen) | |||
{ | |||
assert (gen != large_object_generation); | |||
assert (gen->gen_num <= max_generation); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm ok with just deleting the process_n_background_segments
method as it's unused #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here are the comments on the rest of the gc.cpp changes... tomorrow I will look at the other files.
src/coreclr/src/gc/gc.cpp
Outdated
next_sweep_obj = o; | ||
|
||
allow_fgc(); | ||
uint8_t* end = background_next_end (seg, i > max_generation); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit
uint8_t* end = background_next_end (seg, (i > max_generation));
#Resolved
allow_fgc(); | ||
current_num_objs = 0; | ||
} | ||
} while (o < end && background_object_marked(o, TRUE)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit
} while ((o < end) && background_object_marked(o, TRUE));
#Resolved
current_num_objs = 0; | ||
} | ||
} | ||
} // while (o < end) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the comment here some accidental leftover from some previous changes? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactoring these loops required some extra care. I think I have added a comment to the closing brace to not lose track what it matches with while refactoring and forgot to remove.
It is no longer needed.
In reply to: 367733008 [](ancestors = 367733008)
current_num_objs = 0; | ||
} | ||
enter_spin_lock (&more_space_lock_uoh); | ||
add_saved_spinlock_info (true, me_acquire, mt_bgc_loh_sweep); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should rename mt_bgc_loh_sweep to mt_bgc_uoh_sweep then. #Resolved
{ | ||
break; | ||
} | ||
concurrent_print_time_delta ("Swe LOH took msl"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit -
concurrent_print_time_delta ("Swe UOH took msl");
#Resolved
src/coreclr/src/gc/gc.cpp
Outdated
@@ -34754,13 +34606,14 @@ void gc_heap::relocate_in_large_objects () | |||
} | |||
} | |||
|
|||
void gc_heap::mark_through_cards_for_large_objects (card_fn fn, | |||
void gc_heap::mark_through_cards_for_uoh_objects (card_fn fn, | |||
int gen_num, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit - spaces need to be adjusted here. the coding convention says that the parameters have the same indentation. #Resolved
src/coreclr/src/gc/gc.cpp
Outdated
@@ -35843,7 +35645,7 @@ gc_heap::verify_free_lists () | |||
(size_t)free_list)); | |||
FATAL_GC_ERROR(); | |||
} | |||
if ((gen_num != max_generation+1)&&(object_gennum (free_list)!= gen_num)) | |||
if ((gen_num <= max_generation)&&(object_gennum (free_list)!= gen_num)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit - would be good to take the opportunity to fix the spacing here -
if ((gen_num <= max_generation) && (object_gennum (free_list)!= gen_num))
#Resolved
src/coreclr/src/gc/gc.cpp
Outdated
@@ -35884,7 +35686,9 @@ gc_heap::verify_heap (BOOL begin_gc_p) | |||
BOOL large_brick_p = TRUE; | |||
size_t curr_brick = 0; | |||
size_t prev_brick = (size_t)-1; | |||
int curr_gen_num = max_generation+1; | |||
|
|||
// go through all generations starting with loh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be better to start with (total_generation_count - 1) here - inevitably we'd want to verify all generations. #Resolved
src/coreclr/src/gc/gc.cpp
Outdated
0); | ||
|
||
if (should_collect_optimized (dd1, low_memory_p)) | ||
if (!should_collect && should_check_uoh) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you still need to call should_collect_optimized on max_generation always; you just also need to check uoh if should_check_uoh is true (and should_collect_optimized on max_generation is false).
also if it's changed to uoh it should really check all uoh generations #Resolved
src/coreclr/src/gc/gc.cpp
Outdated
} | ||
|
||
//discount the fragmentation | ||
generation* gen2 = pGenGCHeap->generation_of (i); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit - uoh_gen seems like a better name than gen2 (I'm aware that it says seg2 above but gen2 means something specific) #Resolved
this now wouldn't be correct since this is executed when soh is not true. strictly speaking this wasn't exactly working before in the dependent handle scenario but at least when there isn't, this works by design (when we added DH it didn't use concurrent_print_time_delta). you could put this line under the if (small_object_segments) check before so at least it isn't worse than before. #Resolved Refers to: src/coreclr/src/gc/gc.cpp:20318 in 2d3c483. [](commit_id = 2d3c483, deletion_comment = False) |
same here - since this is checking for allocations would be a bit better to do Refers to: src/coreclr/src/gc/gc.cpp:12345 in 2d3c483. [](commit_id = 2d3c483, deletion_comment = False) |
seems better to replace places that say (max_generation + 1) with uoh_start_generation in the allocation path since UOH specially says it's user allocated #Resolved Refers to: src/coreclr/src/gc/gc.cpp:12231 in 2d3c483. [](commit_id = 2d3c483, deletion_comment = False) |
you could do Refers to: src/coreclr/src/gc/gc.cpp:8363 in 2d3c483. [](commit_id = 2d3c483, deletion_comment = False) |
this shouldn't be min_uoh_segment_size because POH has much smaller seg size. we can have a method that returns the min seg size for each generation. #Resolved Refers to: src/coreclr/src/gc/gc.cpp:13450 in 2d3c483. [](commit_id = 2d3c483, deletion_comment = False) |
I think it will be easier to address once POH is introduced. We may have to tune not just the segment size. ( https://github.com/dotnet/coreclr/issues/27696 ) In reply to: 575814638 [](ancestors = 575814638) Refers to: src/coreclr/src/gc/gc.cpp:13450 in 2d3c483. [](commit_id = 2d3c483, deletion_comment = False) |
I have looked through logs for the previous PR and moved unaddressed comments here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok I've gone through all changes for this PR
@@ -701,7 +701,7 @@ ClrDataAccess::GetThreadAllocData(CLRDATA_ADDRESS addr, struct DacpAllocData *da | |||
Thread* thread = PTR_Thread(TO_TADDR(addr)); | |||
|
|||
data->allocBytes = TO_CDADDR(thread->m_alloc_context.alloc_bytes); | |||
data->allocBytesLoh = TO_CDADDR(thread->m_alloc_context.alloc_bytes_loh); | |||
data->allocBytesLoh = TO_CDADDR(thread->m_alloc_context.alloc_bytes_uoh); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we want to rename this allocBytesLoh field? this GetThreadAllocData is only used by sos I think (not by clrmd?)? @leculver @mikem8361 could you please confirm? for context we are renaming the alloc_bytes_loh field on the alloc_context to alloc_bytes_uoh because we are introducing a concept of uoh which includes loh and additional generations. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like SOS doesn't use this API. Not sure about CLRMD. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am suggesting changing this only when clients can use it. No benefit from changing this now, for sure, since there is still only loh in this change, but we may break something.
In reply to: 368189692 [](ancestors = 368189692)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have a tracking bug to revisit this - https://github.com/dotnet/coreclr/issues/27692
In reply to: 368208096 [](ancestors = 368208096,368189692)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For ClrMD we reimplement all of this interface in C#. Changing names won't affect us at all, but we require these structures do not change in layout or meaning for backwards compat.
When real changes go in that affect the underlying shape of the GC heap and/or requires changing the shape or usage of DAC data structures it's best to implement a new interface (ISOSDacX or similar) that we can query for to tell us there's a new style of heap involved.
src/coreclr/src/gc/gcee.cpp
Outdated
allocation_3 += | ||
dd_desired_allocation (hp->dynamic_data_of (max_generation+1))- | ||
dd_new_allocation (hp->dynamic_data_of (max_generation+1)); | ||
dd_desired_allocation (hp->dynamic_data_of (loh_generation))- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems odd that you have different code for WKS and SVR.. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
odd. Also it seems that allocation_0
and allocation_3
are computed, but not used. ..
In reply to: 368189884 [](ancestors = 368189884)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also this seems to be the last actionable comment on this PR that I am aware of.
In reply to: 368247232 [](ancestors = 368247232,368208845,368189884)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, this was because the code for perf counters was removed and allocation_0/3 was used by perf counters but was mistakenly not removed.
src/coreclr/src/gc/gcinterface.dac.h
Outdated
@@ -17,6 +17,8 @@ | |||
#define MAX_EXPAND_MECHANISMS_COUNT 6 | |||
#define MAX_GC_MECHANISM_BITS_COUNT 2 | |||
#define MAX_GLOBAL_GC_MECHANISMS_COUNT 6 | |||
|
|||
// for backward compatibility the number of generations for DAC must stay 4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is the case - you can make it work, it's just work that we need to do. it would be very odd if we just forever only process 4 generations. please talk to the diagnostics folks about this. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the SOS DAC api (in dacprivate.h starting at line 712) assume that there are only 4 generations. The DacpGcHeapDetails request and DacpGenerationAllocData assume 4 generations. SOS and probably CLRMD use this API so we need to come up with a backward compatible way to support this. Like a new Request (DacpGcHeapDetails2 ?) that the 5.0 DAC supports. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not that this is impossible, but it is not something to be changed lightly. It is a breaking change.
It is also not clear whether if it is better to introduce a new generation for POH or other generations that we may have after, or just fold everything UOH together.
There is a work item for this - https://github.com/dotnet/coreclr/issues/27693
In reply to: 368190184 [](ancestors = 368190184)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I certainly wasn't suggesting changing this lightly. I was suggesting we should not have an incorrect comment. we could say that it will require work to change this. feel free to open an issue for changing it in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will remove the comment. It does not make a lot of sense in this change, and related work is tracked in a workitem.
if you have not run stress and perf pass on this change I certainly would suggest to do so (I would guess you have not run perf or it should have shown regression due to the changes in BGC sweep). most of the changes are very benign but at this scale we could be making accidental serious mistakes. |
Yes, I will definitely run stress/perf once there are no more changes. There could be bugs. |
Just catching up on this. I read through the changes but I'm not clear on whether this change would require updates to SOS and ClrMD for heap enumeration. Obviously changes will need to happen once this is true:
But does this particular refactor change the layout of the GC heap at all? |
great, thanks @VSadov! |
I did run stress a while ago, but since then there were many changes, so that does not count. |
@leculver - this is just a refactoring change. No new functionality. The questions about SOS/DAC/ClrMD are mostly forward-looking. |
Thanks, please keep me cc'ed on future changes of this nature! |
7994fe0
to
bc07b39
Compare
@@ -34117,7 +34120,7 @@ void gc_heap::background_sweep() | |||
|
|||
if (i == max_generation) | |||
{ | |||
start_seg = ephemeral_heap_segment; | |||
start_seg = eph_seg; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is already saved as saved_sweep_ephemeral_seg
in background_ephemeral_sweep
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not notice that we are already saving that. Will use that one then
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
@@ -3170,7 +3167,7 @@ void gc_heap::fire_per_heap_hist_event (gc_history_per_heap* current_gc_data_per | |||
current_gc_data_per_heap->mechanisms[gc_heap_expand], | |||
current_gc_data_per_heap->heap_index, | |||
(void *)(current_gc_data_per_heap->extra_gen0_committed), | |||
(max_generation + 2), | |||
total_generation_count - 1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Maoni0 - looks like it is this change that interferes with event format. I did -1
to report the same number of heaps when POH was added and did not readjust this back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By looking at how version of this events is calculated/validated, this should be just total_generation_count
. PerfView seems to permit extra generations here - for future versions. Going below 4 is unexpected though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahh, okie. can you please rerun the perf test and make sure you are now seeing the correct data?
it's strange why the 0gb is the only scenario that's exhibiting the behavior. and also it seems like the LOH data should be the one that gets messed up.
PR feedback
more PR feedback
fixed |
also rebased due to merge conflicts |
Thanks!! |
Refactor handling of non-SOH generations (dotnet#1688)
A recent refactoring PR dotnet#1688 has regressed GC processing of mark overflow. If GC heap segments in the segment linked list are not ordered in an ascending order by their addresses, the mark overflow processing misses segments on lower addresses if they follow segments on higher addresses. This leads to some objects that are alive to not to be reported and freed. This change fixes the problem by making sure the segment order doesn't matter.
A recent refactoring PR #1688 has regressed GC processing of mark overflow. If GC heap segments in the segment linked list are not ordered in an ascending order by their addresses, the mark overflow processing misses segments on lower addresses if they follow segments on higher addresses. This leads to some objects that are alive to not to be reported and freed. This change fixes the problem by making sure the segment order doesn't matter.
Soon LOH will be not the only non-SOH generation. That will break a few existing assumptions and common patterns.
This change makes the code to expect that in addition to LOH, there could be other non-SOH generations without introducing one (yet).
In particular:
loh
suffix for members applicable to non-SOH scenario in general. We now use more generic suffixaoh
instead.loh_generation
unless it is something specific to large object heap.For example iterating non-SOH generations is done like
We used to go through segments in the outer loop, and switch to LOH upon reaching the end of the list and breaking out when reaching the end once more. With more non-SOH generations this pattern is not very convenient.
Now we iterate generations in the outer loop and segments in the inner.
#3
index, etc...