Very high CPU usage during NullReferenceException translation phase #61583
Comments
Your premise is misguided, because:
Given that:
Yes. But there's a difference between "when it does happen, CPU cores are killed" and "when it does happen, the system still survives". That difference matters for online services.
Agreed. But there's still a difference in the consequences a bug can have. The current CoreCLR implementation of NullReferenceException handling can lead to huge CPU churn, whereas a conceivable alternative implementation, without changing the essence of the bug, would not.
Extremely rare cases are what backend engineers deal with routinely, unfortunately. So anything that helps the bottom line when extreme cases happen is very meaningful.
... yeah, and normally they deal with it by figuring out what is causing the exceptions in the first place, because that usually indicates a bug (or that the code should be switched to a
Not that improvements aren't made. But due to a number of design decisions exceptions in C# are relatively "expensive" - even if this one area is changed (and it yields a benefit), you may still have performance issues until you stop throwing so many exceptions.
CPU churn is the least of your worries. NPEs (along with several other exceptions) are "halt-and-catch-fire" errors: there's no reasonable way to recover automatically, and depending on where one is thrown from, your application could be in a generally unsafe state. Not in the "the VM thinks memory is corrupt" sense, but in the "this object is missing a required field" sense, which could be worse. So stop throwing them.
This should be fixed by #70428. The g_savedExceptionInfo and the spinlock are gone.
Hi CoreCLR team,
We recently ran into an issue where try-catching NullReferenceException in a tight for-loop, on all CPU cores, led to severe contention on the spinlock taken in g_SavedExceptionInfo.Enter().
The CLR uses Windows vectored exception handling to translate an access violation at ~0x0000000000000000 into a managed exception (NullReferenceException), which then becomes catchable by managed code. During the translation phase, an auxiliary global object, g_SavedExceptionInfo, is used to pass the exception record from the vectored exception handler to FixContextForFaultingExceptionFrame, which eventually releases the lock.
When many threads are frequently triggering NullReferenceException, g_SavedExceptionInfo.Enter() will have severe contention, burning an extraordinary amount of CPU time.
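To make the contention pattern concrete, here is a minimal C++ sketch of a single global slot guarded by a spinlock, where one step acquires the lock and stashes a record and a later step reads it and releases. This is not CoreCLR's code: `SavedInfoSlot`, `run_contention_demo`, and the spin counter are hypothetical names invented for illustration, and the real `g_SavedExceptionInfo` may differ in its details. It only shows why every thread that faults at the same time ends up busy-waiting on the same lock.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for a global handoff slot; NOT the CoreCLR implementation.
class SavedInfoSlot {
public:
    // Spin until the slot is free, then claim it and stash the record.
    void Enter(std::uint64_t record) {
        while (locked_.exchange(true, std::memory_order_acquire)) {
            // Busy-wait: each waiting thread burns CPU here instead of sleeping.
            spins_.fetch_add(1, std::memory_order_relaxed);
        }
        record_ = record;
    }

    // Read the stashed record and free the slot for the next thread.
    std::uint64_t Leave() {
        std::uint64_t saved = record_;
        locked_.store(false, std::memory_order_release);
        return saved;
    }

    // Total number of failed acquisition attempts across all threads.
    std::uint64_t spins() const { return spins_.load(std::memory_order_relaxed); }

private:
    std::atomic<bool> locked_{false};
    std::atomic<std::uint64_t> spins_{0};
    std::uint64_t record_ = 0;  // protected by locked_
};

// Simulates `threads` workers that each "fault" `iterations` times.
// Returns how many handoffs completed with the record intact; optionally
// reports how many spin iterations were wasted waiting for the lock.
std::uint64_t run_contention_demo(unsigned threads, unsigned iterations,
                                  std::uint64_t* wastedSpins = nullptr) {
    SavedInfoSlot slot;
    std::atomic<std::uint64_t> completed{0};
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < threads; ++t) {
        workers.emplace_back([&slot, &completed, t, iterations] {
            for (unsigned i = 0; i < iterations; ++i) {
                std::uint64_t record = (static_cast<std::uint64_t>(t) << 32) | i;
                slot.Enter(record);            // plays the role of the exception handler
                if (slot.Leave() == record) {  // plays the role of the later fix-up step
                    completed.fetch_add(1, std::memory_order_relaxed);
                }
            }
        });
    }
    for (auto& w : workers) w.join();
    if (wastedSpins != nullptr) *wastedSpins = slot.spins();
    return completed.load();
}
```

Calling `run_contention_demo(n, k)` from a `main` with `n` set to the number of cores and passing a `wastedSpins` pointer gives a rough feel for how much work goes into waiting rather than into the handoff itself; the exact spin count varies from run to run and from machine to machine.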
Two questions here: