Fix infrequent/random crashes on Windows x64 due to use of GC forwarded objects. #34694
!! This PR is a copy of mono/mono#19475, please do not edit or review it in this repo !!
Do not automatically approve this PR:
* Consider how the changes affect configurations in this repo,
* Check effects on files that are not mirrored,
* Identify test cases that may be needed in this repo.
!! Merge the PR only after the original PR is merged !!
This crash is very infrequent and hard to reproduce. I have been analyzing a couple of crash dumps from retail devices that show different crashes related to vtable "corruption" on Windows x64. Deeper analysis shows that the object instance has been forwarded by the GC (the lowest bit of the object's vtable pointer is set to 1), but the object is still being used while it holds the tagged vtable pointer. This causes misaligned reads that return random values and pointers from the vtable on the next object access.
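To illustrate the failure mode, here is a minimal sketch of a low-bit tag on an object header word. All names (`SketchObject`, `SKETCH_FORWARDED_BIT`, the helper functions) are hypothetical and are not taken from the Mono sources; the only thing assumed from the description above is that bit 0 of the vtable word marks a forwarded object.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tag: bit 0 of the header word marks a forwarded object. */
#define SKETCH_FORWARDED_BIT ((uintptr_t)1)

typedef struct { int placeholder; } SketchVTable;   /* stand-in for a real vtable */
typedef struct { uintptr_t vtable_word; } SketchObject;

/* Returns non-zero if the header word carries the forwarded tag. */
static int sketch_is_forwarded(const SketchObject *obj)
{
    return (obj->vtable_word & SKETCH_FORWARDED_BIT) != 0;
}

/* Sets the tag, as the GC would when it forwards the object. */
static void sketch_tag_forwarded(SketchObject *obj)
{
    /* A properly aligned vtable pointer always has bit 0 clear. */
    assert((obj->vtable_word & SKETCH_FORWARDED_BIT) == 0);
    obj->vtable_word |= SKETCH_FORWARDED_BIT;
}

/* Strips the tag to recover the aligned vtable address. */
static SketchVTable *sketch_untagged_vtable(const SketchObject *obj)
{
    return (SketchVTable *)(obj->vtable_word & ~SKETCH_FORWARDED_BIT);
}
```

Code that holds a stale reference does not know the tag is there. It dereferences `vtable_word` as is, which is one byte past the real vtable, so every field it reads is misaligned and looks like random data.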
Further analysis shows that LLVM codegen, and the lowering of some specific generic value type (vt) arrays, can produce optimized memory copies that use XMM registers. I have also identified scenarios where vt copies are lowered into a C runtime memcpy, which in turn uses XMM registers as an optimization. Since the context on Windows x64 currently doesn't include XMM registers, any references held in XMM registers are not visible to the GC and are not pinned. They can therefore point to forwarded objects once the GC completes and threads are restarted, leading to these infrequent random crashes.
The fix includes xmm0-xmm15 in MonoContext on Windows x64, ensuring that the GC sees all references that could be held in those registers, regardless of whether they got there through LLVM optimization or through other native code such as memcpy.
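The effect of the fix can be sketched with a simplified conservative scan over a saved thread context. This is only an illustration under stated assumptions: `SketchContext` is a hypothetical layout, not the real `MonoContext`, and `sketch_count_candidate_refs` is an invented helper, not the GC's actual scanning code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_GREGS 16
#define SKETCH_NUM_XMM   16

/* Hypothetical saved context: 16 general purpose registers plus
   xmm0-xmm15, each XMM register stored as two 64-bit halves. */
typedef struct {
    uint64_t gregs[SKETCH_NUM_GREGS];
    uint64_t xmm[SKETCH_NUM_XMM][2];
} SketchContext;

/* A word is a candidate reference if it falls inside [lo, hi). */
static int sketch_in_heap(uint64_t word, uint64_t lo, uint64_t hi)
{
    return word >= lo && word < hi;
}

/* Counts register words that look like heap pointers. With scan_xmm == 0
   this models the old behavior, where XMM contents are never examined. */
static size_t sketch_count_candidate_refs(const SketchContext *ctx,
                                          uint64_t lo, uint64_t hi,
                                          int scan_xmm)
{
    size_t found = 0;
    assert(lo <= hi);
    for (size_t i = 0; i < SKETCH_NUM_GREGS; i++)
        found += (size_t)sketch_in_heap(ctx->gregs[i], lo, hi);
    if (scan_xmm) {
        for (size_t i = 0; i < SKETCH_NUM_XMM; i++)
            for (size_t half = 0; half < 2; half++)
                found += (size_t)sketch_in_heap(ctx->xmm[i][half], lo, hi);
    }
    return found;
}
```

If a reference lives only in one half of an XMM register, for example in the middle of a vectorized copy, the scan without XMM registers reports nothing to pin, while the scan with them finds the reference.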