
An error occurred while writing to logger(s) - Safe handle has been closed #3269

Closed
p10tyr opened this issue May 11, 2022 · 13 comments · Fixed by #3291
Labels
bug Something isn't working

Comments

@p10tyr

p10tyr commented May 11, 2022

Bug Report

List of all OpenTelemetry NuGet packages and version that you are using (e.g. OpenTelemetry 1.0.2):

-package id="OpenTelemetry" version="1.2.0" targetFramework="net472" />
-package id="OpenTelemetry.Api" version="1.2.0" targetFramework="net472" />
-package id="OpenTelemetry.Contrib.Extensions.AWSXRay" version="1.0.1" targetFramework="net472" />
-package id="OpenTelemetry.Contrib.Instrumentation.AWS" version="1.0.1" targetFramework="net472" />
-package id="OpenTelemetry.Exporter.OpenTelemetryProtocol" version="1.2.0" targetFramework="net472" />
-package id="OpenTelemetry.Exporter.OpenTelemetryProtocol.Logs" version="1.0.0-rc9.2" targetFramework="net472" />
-package id="OpenTelemetry.Instrumentation.AspNet" version="1.0.0-rc9.2" targetFramework="net472" />
-package id="OpenTelemetry.Instrumentation.AspNet.TelemetryHttpModule" version="1.0.0-rc9.2" targetFramework="net472" />
-package id="OpenTelemetry.Instrumentation.Http" version="1.0.0-rc9.2" targetFramework="net472" />
-package id="OpenTelemetry.Instrumentation.SqlClient" version="1.0.0-rc9.2" targetFramework="net472" />

Runtime version (e.g. net462, net48, netcoreapp3.1, net6.0 etc. You can find this information from the *.csproj file):

NET 4.7.2 - WebAPI - IIS

Symptom


Random requests are dying with a 500 error and a lot of errors are logged with the message below.
Funnily enough, the errors are coming from the OTLP Logger itself, but we verified them in our other APM as well.

An error occurred while writing to logger(s) - Safe handle has been closed

System.AggregateException: An error occurred while writing to logger(s).
 ---> System.ObjectDisposedException: Safe handle has been closed
   at System.Runtime.InteropServices.SafeHandle.DangerousAddRef(Boolean& success)
   at System.StubHelpers.StubHelpers.SafeHandleAddRef(SafeHandle pHandle, Boolean& success)
   at Microsoft.Win32.Win32Native.SetEvent(SafeWaitHandle handle)
   at System.Threading.EventWaitHandle.Set()
   at OpenTelemetry.BatchExportProcessor`1.OnExport(T data)
   at OpenTelemetry.BatchLogRecordExportProcessor.OnEnd(LogRecord data)
   at OpenTelemetry.Logs.OpenTelemetryLogger.Log[TState](LogLevel logLevel, EventId eventId, TState state, Exception exception, Func`3 formatter)
   at Microsoft.Extensions.Logging.Logger.<Log>g__LoggerLog|12_0[TState](LogLevel logLevel, EventId eventId, ILogger logger, Exception exception, Func`3 formatter, List`1& exceptions, TState& state)
   --- End of inner exception stack trace ---
   at Microsoft.Extensions.Logging.Logger.ThrowLoggingError(List`1 e

What is the expected behavior?

Nothing.. just works

What is the actual behavior?

ERROR 500 on random requests

Reproduce

Not possible? Very flaky issue.
We have been using the Tracing BatchExporter for a few months and that is OK.
Using the Logs BatchExporter, we can clearly see an issue there.

Not sure if it's the STATE that we are passing in? Some guesses (see the sketch after this list):

  • NULL state?
  • Maybe the state is disposed?
  • Maybe the logger is disposed inside a using?
  • Maybe the state is not in a using itself, but inside another using that got disposed after it reached the logger?
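
To illustrate that last guess, this is the kind of shape I mean (hypothetical names, not our actual code): disposable state that gets captured by a logging scope inside a using block.

```csharp
// Hypothetical shape of the pattern I suspect (not our actual code):
// disposable state captured by a logging scope inside a using block.
using System;
using System.Threading.Tasks;
using Microsoft.Extensions.Logging;

class RequestContext : IDisposable            // hypothetical disposable state
{
    public int OrderId { get; }
    public RequestContext(int orderId) => OrderId = orderId;
    public void Dispose() { /* releases handles etc. */ }
}

class Handler
{
    public async Task HandleAsync(ILogger logger, int orderId)
    {
        using (var context = new RequestContext(orderId))
        using (logger.BeginScope("Processing order {OrderId}", context.OrderId))
        {
            await Task.Delay(10);             // stand-in for real work
            logger.LogInformation("Finished order {OrderId}", context.OrderId);
        } // context (and the scope) are disposed here - is anything still holding them?
    }
}
```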

We will close this issue if:

  • The repro project you share with us is complex. We can't investigate custom projects, so please don't point us to one.
  • We can not reproduce the behavior you're reporting.


@p10tyr p10tyr added the bug Something isn't working label May 11, 2022
@cijothomas
Member

#2905 same issue I think.

@p10tyr
Author

p10tyr commented May 11, 2022

OK, a workaround for me is to change this option to false:
options.IncludeScopes = false;

I do not really need scopes. I understand this package is not stable, so I was very careful not to release this to production and caught it in the early days.

Worth maybe just updating the document on the separate package page to say DO NOT USE SCOPES 😆

--edit
This document page
https://github.com/open-telemetry/opentelemetry-dotnet/blob/main/docs/logs/getting-started/README.md
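
For context, this is roughly where that option sits in our setup (a simplified sketch, not our exact code):

```csharp
// Simplified sketch of the logging setup (not our exact code), showing
// where the IncludeScopes workaround is applied.
using Microsoft.Extensions.Logging;
using OpenTelemetry.Logs;

public static class LoggingSetup
{
    public static ILoggerFactory Create()
    {
        return LoggerFactory.Create(builder =>
            builder.AddOpenTelemetry(options =>
            {
                options.IncludeScopes = false;   // the workaround: scopes off
                options.AddOtlpExporter();       // OTLP log exporter (rc package)
            }));
    }
}
```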

@cijothomas
Member

cijothomas commented May 11, 2022

The linked issue is not caused by scopes. We are not aware of any issues with scopes.
Also, the OTLP LogExporter only recently added scope support, and it is not yet released: https://github.com/open-telemetry/opentelemetry-dotnet/blob/main/src/OpenTelemetry.Exporter.OpenTelemetryProtocol/CHANGELOG.md#unreleased

So I can't see how your issue was solved by disabling scopes. Will investigate more... If you can get a repro, please do share it, as that'd be super helpful.

@p10tyr
Author

p10tyr commented May 11, 2022

I am trying to see if I can reproduce it locally. It usually only happens on our test servers.

Also, the area that the stack trace reports as being disposed does not make sense to me, because there are no usings or state context there... apart from the use of Scope, which I think is somehow related. I have backed it out again on our test servers and hope to not see any more errors for now.

@p10tyr
Author

p10tyr commented May 12, 2022

PS-

I did not have ParseStateValues = true; I had it turned off the whole time (default)

So this safe handle issue does not relate to that? Just my opinion; I don't know if internally it still tried to access the state even though it's set to false.

I have checked with it on and off and can see the difference in Elastic, with the state KVs being saved (true) and not being saved (false).

I have gone over all our logs and NO local machine ever throws this error. Only our remote machines seem to have some kind of issue here. I don't know if it's related to infrastructure/networking or something specific on the hosted machines.

We do get a lot of "a task has been cancelled" errors because of requests ending sooner than we expected due to front-end stuff.

@reyang
Member

reyang commented May 18, 2022

From the callstack, the LogProcessor is already disposed (either by the user code, if the user is controlling the lifecycle of the LoggerProvider, or by the hosting environment - e.g. ASP.NET or ASP.NET Core).

I can imagine there are cases where the user has little or even no control over the lifecycle of the LoggerProvider, and it'd be better if we can avoid punishing the app by throwing exceptions from the LogProcessor/SDK.

System.AggregateException: An error occurred while writing to logger(s).
 ---> System.ObjectDisposedException: Safe handle has been closed
   at System.Runtime.InteropServices.SafeHandle.DangerousAddRef(Boolean& success)
   at System.StubHelpers.StubHelpers.SafeHandleAddRef(SafeHandle pHandle, Boolean& success)
   at Microsoft.Win32.Win32Native.SetEvent(SafeWaitHandle handle)
   at System.Threading.EventWaitHandle.Set()
   at OpenTelemetry.BatchExportProcessor`1.OnExport(T data)
   at OpenTelemetry.BatchLogRecordExportProcessor.OnEnd(LogRecord data)
   at OpenTelemetry.Logs.OpenTelemetryLogger.Log[TState](LogLevel logLevel, EventId eventId, TState state, Exception exception, Func`3 formatter)
   at Microsoft.Extensions.Logging.Logger.<Log>g__LoggerLog|12_0[TState](LogLevel logLevel, EventId eventId, ILogger logger, Exception exception, Func`3 formatter, List`1& exceptions, TState& state)
   --- End of inner exception stack trace ---
   at Microsoft.Extensions.Logging.Logger.ThrowLoggingError(List`1 e
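
For illustration, a minimal sketch of the kind of usage that can lead to this (assuming the standard LoggerFactory wiring; the names are illustrative, not taken from the reporter's app):

```csharp
// Minimal sketch of the disposal race described above (illustrative only):
// an ILogger captured earlier keeps writing after the LoggerProvider,
// and with it the BatchLogRecordExportProcessor, has been disposed.
using Microsoft.Extensions.Logging;
using OpenTelemetry.Logs;

ILogger captured;

using (var loggerFactory = LoggerFactory.Create(builder =>
           builder.AddOpenTelemetry(options => options.AddOtlpExporter())))
{
    captured = loggerFactory.CreateLogger("Demo");
}   // disposing the factory disposes the provider and its processors

// Later - e.g. from a request still in flight - the stale logger is used:
captured.LogInformation("Hello after dispose");
// The processor's wait handle is already closed, which can surface as the
// ObjectDisposedException ("Safe handle has been closed") seen above.
```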

@p10tyr
Author

p10tyr commented May 18, 2022

We are not doing anything explicitly with the LoggerProviders. In this case it is a NET472 Web API project (IIS) with a logger factory created and OpenTelemetry attached to it as per the documentation (roughly as in the sketch after this comment).

As mentioned, we get a lot of cancelled requests, but the problem here is that random requests now just get completely bricked because this error comes up.
I can not find any kind of pattern as to why, and it is quite possible that our Castle DI, or maybe some crazy custom filters, is disposing of something at the wrong time.. but until then we can not use OTLP Logging, and the ASP.NET Core upgrade is already delayed by 2 years 😢

OTLP Tracing works fine for us.
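
For reference, the setup looks roughly like this in a classic ASP.NET app (a simplified sketch, not the reporter's exact Global.asax; names are illustrative):

```csharp
// Simplified sketch of creating a long-lived logger factory in a classic
// ASP.NET (net472) app - illustrative only, not the actual Global.asax.
using System;
using System.Web;
using Microsoft.Extensions.Logging;
using OpenTelemetry.Logs;

public class Global : HttpApplication
{
    public static ILoggerFactory AppLoggerFactory { get; private set; }

    protected void Application_Start(object sender, EventArgs e)
    {
        AppLoggerFactory = LoggerFactory.Create(builder =>
            builder.AddOpenTelemetry(options => options.AddOtlpExporter()));
    }

    protected void Application_End(object sender, EventArgs e)
    {
        // The factory (and the underlying LoggerProvider) lives for the whole
        // app domain; it is only disposed when the application shuts down.
        AppLoggerFactory?.Dispose();
    }
}
```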

@reyang
Member

reyang commented May 19, 2022

I feel this is something that OpenTelemetry .NET should provide a workaround for. I've seen many places where ILogger gets used after the LoggerProvider is disposed.

@p10tyr
Author

p10tyr commented May 19, 2022

I wonder if anybody else is having this issue?
Is it possible that not many people are using this yet because it's not officially in the main package, so there's no user testing?

It would be amazing to get a fix out for this, as I am waiting to turn this on in production. Unfortunately I can't use MyGet feeds, so it would either need me to compile it myself or have a fix in *beta

@cijothomas
Member

#2905 same issue I think.

^^ this was incorrect. This is a separate issue!

@reyang
Member

reyang commented May 19, 2022

As mentioned, we get a lot of cancelled requests, but the problem here is that random requests now just get completely bricked because this error comes up. I can not find any kind of pattern as to why, and it is quite possible that our Castle DI, or maybe some crazy custom filters, is disposing of something at the wrong time..

@ppumkin #3291 should help to mitigate the issue you're seeing. Meanwhile I hope to get to the bottom of this so we understand whether it was due to the misuse of the DI, or a bug in the underlying framework. Would you be willing to collect more information?

Here is my thinking: add a line of code to `protected override void Dispose(bool disposing)` that records the full callstack when the processor is being disposed. When we see the ObjectDisposedException, take a dump and see which code path disposed the processor. In this way we can understand if the race condition was a framework flaw or a user mistake.
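
Something along these lines — illustrative only, the diagnostic field name is made up, and it would sit inside the processor's Dispose override:

```csharp
// Illustrative diagnostic only (the field name is made up): record who
// disposed the processor so a later ObjectDisposedException can be
// correlated with the disposing call stack in a memory dump.
private string disposedStackTrace;

protected override void Dispose(bool disposing)
{
    // Capture the full call stack at the moment of disposal.
    this.disposedStackTrace = Environment.StackTrace;
    System.Diagnostics.Trace.WriteLine(
        "BatchExportProcessor disposed at:" + Environment.NewLine + this.disposedStackTrace);

    base.Dispose(disposing);
}
```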

@p10tyr
Author

p10tyr commented May 20, 2022

@reyang I'm more than happy to help.
But I would need a little guidance on what I need to do.. What makes it a bit more complicated is that it only happens on remote servers, but I have admin access to the test ones, so I can install tools and take a dump.. I just need instructions (I've never really done that before).

@p10tyr
Author

p10tyr commented Jul 6, 2022

Just wanted to come back and say we have been running this in test across several environments for 4 weeks now. No more occurrences of this issue.
Thank you very much!
