Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[otelecho] Add option to skip calling global handler #4419

Open
rattuscz opened this issue Oct 12, 2023 · 16 comments
Open

[otelecho] Add option to skip calling global handler #4419

rattuscz opened this issue Oct 12, 2023 · 16 comments
Labels
area: instrumentation Related to an instrumentation package enhancement New feature or request instrumentation: otelecho

Comments

@rattuscz
Copy link

Problem Statement

I want to be able to control whether otelecho middleware calls the global error handler or not.

Using multiple middlewares that cannot be configured this way the global error handler gets called multiple times for the exact same error. Handling this multiple processing of same error on global error handler seems to me as wrong way to go.

The only way to prevent this is to eat the error before this middleware is run after request is handled, which then does not append the error to span as attribute.

I believe this also goes against best practice of either handling the error or propagating, not both.

Proposed Solution

Add new option to middleware config to control this behavior, with default value resulting in same behavior as it is now to be BC.

Prior Art

In echo request logger middleware there is the exact same setting that controls calling of the global handler:
https://github.com/labstack/echo/blob/98a523756d875bc13475bcb6237f09e771cbe321/middleware/request_logger.go#L115

@erkanzileli
Copy link

I'm having the same issue. You're right.
The middleware calls the error handler and returns the error. When the middleware returns an error, echo calls the error handler again. so in this way, the error handler gets called twice. This results in a duplicated response body like below.

{
    "message": "content"
}
{
    "message": "content"
}

@MrAlias
Copy link
Contributor

MrAlias commented Nov 15, 2023

Instead of modifying the instrumentation, I would recommend you register a rate-limiting error handler. You can customize the behavior of how errors are handle however you want by using your own error handler. Adding to the API surface error of the instrumentation does not seem like a prudent way to solve an issue that can already be solved with a custom error handler.

@rattuscz
Copy link
Author

rattuscz commented Dec 7, 2023

@MrAlias I disagree that it's issue that should be solved in error handler, that is a workaround at best, because the real issue is calling error handler multiple times for the same error. That is definitely not the correct behavior.

Accepting the BC break and simply removing the line calling the error handler altogether as @pellared suggested is also fine as the error is propagated in the middleware chain up to be handled.

@pellared
Copy link
Member

pellared commented Dec 7, 2023

@rattuscz See #4420 (comment) by @MrAlias:

I'm not opposed to this.

Also we agree that it is acceptable (breaking) change as the Go module is not stable. I think that the users would like this change.

@rattuscz
Copy link
Author

rattuscz commented Dec 8, 2023

As long as it's addressed somehow I'm happy :-)

@shonigbaum
Copy link

Hey,
any news on this? We are facing the same issue. Will this be fixed soon? Or is there a workaround recommended?

@dmathieu
Copy link
Member

otelecho is lacking an owner, is deprecated and will be removed in a couple months.
#5550

Resolving this would be a good way for someone to step up before becoming an owner.

@shonigbaum
Copy link

Oh, thank you for mentioning. Hopefully there will be an owner soon. I'll take this topic into my team.

@rattuscz
Copy link
Author

That is unfortunate. Didn't realize there is a risk of that.

This issue is still not resolved mainly because of -> #4420 (comment)

I don't really know how to move forward without breaking it for someone, plus changing the behavior now would mean i would have problem to integrate it even in our own projects spending hours and writing tests for not much of a win.

So personally i would still prefer to remain backwards compatible by providing the config option that allows gradual adoption without suddenly breaking in minor/patch version change

@scorpionknifes
Copy link
Member

Hello, new codeowner here, taking a look into this issue and here is my view on this:

Thanks @rattuscz, for the investigation and attempt. The issue about the status code not being resolved if c.Error() is not called makes it hard to implement it in a way that just works out of the box.
After reading labstack/echo#1948 (comment), it is unclear how this should be done. It seems like it isn't just an issue with otelecho.
I believe the current workaround is to implement the following as described in labstack/echo#1948 (comment) by checking c.Response().Committed in the error handler.

I believe this also goes against best practice of either handling the error or propagating, not both.

I agree, however labstack/echo#1948 (comment) lists logger, bodydump and contrib prometheus that does both.

Does anyone have a use case where the above workaround won't work? I'm happy to look into implementing custom configuration options that could support these use cases. I'm looking to close this issue since skipping global handler call would break span status for otelecho. Thanks

@rytsh
Copy link

rytsh commented Nov 4, 2024

@scorpionknifes , what is the meaning of the calling error function? Can we just delete it for fix?

@rattuscz
Copy link
Author

rattuscz commented Nov 4, 2024

@scorpionknifes , what is the meaning of the calling error function? Can we just delete it for fix?

It's a bit tricky as if the error is not propagated you will not collect the error message.
You would have to integrate the error logging into the global error handler so the instrumentation would not be simple "add middleware".

at the same time if you are the "first" middleware (-> logger -> otel -> app) on the way out of app you have to call the error handler to ensure headers are set

if you are not first ( -> otel -> logger -> app ) you will most probably not have the error app produced as it will be consumed by logger.

In a sense you have to integrate otelecho and logger to know about each other and "connect" them correctly

@rytsh
Copy link

rytsh commented Nov 4, 2024

@rattuscz, I see error is always propagated because we return error.

If we don't call any error handler than it doesn't matter of the middleware order.
Calling error handler to making respond http calls should not be job of the this middleware.

@rattuscz
Copy link
Author

rattuscz commented Nov 6, 2024

If we don't call any error handler than it doesn't matter of the middleware order. Calling error handler to making respond http calls should not be job of the this middleware.

I thought that too, but since otelecho requires the http header to be already set (to properly log it) it means the global error handler must have been called in previous middlewares or by application itself.

Now since the error was handled by the global error handler, is the error propagated upwards to next middleware?
If not than otelecho does not know there was error and does not attach it to the span.
If yes than how does otelecho know it's app error and not error of other middleware? there can be errors that happen after handling the request on the way out of application.
suppose the logger failed. or the global error handler failed.

So the order of chaining the middlewares can matter a lot in these edge cases and you have to think about how the error handling should work.
Right now it seems everyone is happy with just calling global error handler multiple times and using the http response as detection if it is first time and ignoring any other errors that come up afterwards.

@rytsh
Copy link

rytsh commented Nov 10, 2024

@rattuscz I am not happy with that. For me this code, someone write wrong middleware and to cover his workaround added this.

Let's say someone block to return error in next middleware than it is not problem, maybe they want to do that. This not make sense to when we cover it, it looks doing something inside magic.

@rattuscz
Copy link
Author

@rytsh I was also not happy with it so I rewrote the otelecho middleware to not call the global handler and suddenly I realized i lost my error statuses in OTEL due to the otelecho being used after the logger that was calling the global handler.

So i bet any change will break most people code in some way or other.
So currently i do the same - ignoring second call for global handler and i quite hate it, but do not feel confident in "proper way to do it" to move forward with changes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: instrumentation Related to an instrumentation package enhancement New feature or request instrumentation: otelecho
Projects
None yet
Development

No branches or pull requests

8 participants