[Release/8.0] Fix handling exceptions on shutdown #101915
Merged
Another partial backport of #100293 to release/8.0
Customer Impact
A recent partial backport of a change that enables propagating exceptions on shutdown missed two places in the code. Those places were guarded by a different condition than the global disabling of exception handling that the backport removed. I must have made a mistake when testing that change (not setting the Watson registry settings properly, or using a stale build). This change makes it work correctly.
During runtime shutdown, we stop handling exceptions entirely and we also prevent entering the EE in a few places. Because of that, a customer hit an issue where a Watson dump was generated from a .NET app on every machine reboot. The app was a networking app listening on a socket. Shutting the app down resulted in a SocketException being thrown, but exception handling was prevented, so the exception was reported as unhandled, which caused the dump to be taken.
Testing
CI testing, plus local testing of a case equivalent to the customer's: a gRPC app running as a service. Without the fix, the issue reported by the customer can be reproduced; with the fix, it is gone.
Risk
Low - the change just enables exception handling during shutdown. Where the app would previously crash with an unhandled exception, it will now handle that exception properly up to the point the process is torn down by the OS.