High performance degradation by instrumentation-type DECORATE_QUEUES #3158
Comments
@jnt0r can you please provide a small isolated application that reflects your environment and demonstrates the mentioned problem? |
Hi @OlegDokuka, I currently don't have time to create a sample application. It's basically a Spring Cloud Gateway application with two explicit thread jumps (as described in my initial description). We switched the instrumentation-type from the default to decorate_queues, which led to reproducible, severe performance impacts (CPU usage, response time, ...). |
We'll keep the issue in the backlog. If you find the time to prepare a reproducer, we'd be happy to have a look and profile it. Without an isolated reproducer it will be difficult to address the issue. |
One code snippet was badly closed in the previous PR.
We figured out that it was more of a documentation mismatch between reactor-core and Spring Cloud Gateway. The defaults for the instrumentation-type differ because Spring Cloud Gateway overrides the default to manual. That explains why we saw such a large performance impact after switching to decorate_queues. |
Thank you for following up. That makes sense. I think we can consider closing the issue then? |
Yes, we can close the issue. I will suggest pointing out the differences in
the Spring Cloud Gateway documentation more clearly.
Dariusz Jędrzejczyk wrote on Thu, 6 Oct 2022, 12:10:
… Thank you for following up. That makes sense. I think we can consider
closing the issue then?
|
@jnt0r please go ahead and file a report in https://github.com/spring-cloud/spring-cloud-gateway/issues then, thanks. I see @OlegDokuka already closed this issue. |
We had an issue with the spring-sleuth context not being propagated to every thread, so we switched the instrumentation type to DECORATE_QUEUES, which solved the issue in our tests. In production we faced severe performance degradation, long response times and failed requests (aborted SocketChannels etc.). The Kubernetes pods requested more than 2 to 3 times the CPU of the previous version with the default instrumentation type.
We see the following exceptions thrown:
Expected Behavior
As described in the documentation, I expected the same functional behaviour with perhaps a slight performance impact (the documentation mentions a LOW performance impact).
Actual Behavior
The CPU consumption of the application drastically increases with this instrumentation type and we see many Exceptions about closed and aborted SocketChannels.
Steps to Reproduce
Switch to instrumentation-type DECORATE_QUEUES and send about 200 requests per second to one instance of the application.
Possible Solution
We took a brief look at the implementation and wondered about the use of a (doubly linked) LinkedHashMap for this feature. Maybe a LinkedHashMap is not the right choice here?
Your Environment
The application routes requests to other services depending on some parameters (normally with very short request/response times). Some parameters are cached so we don't need to check them every time. When we do need to check them (cache entry evicted), we explicitly switch to another thread because the check consists of a synchronous API call:
.onCacheMissResume(() ->
    Mono.fromCallable(() -> kubernetesRepository.callTokenReviewEndpoint(authorizationHeader))
        .subscribeOn(Schedulers.boundedElastic()))
....
In these cases the Sleuth context was missing with the default instrumentation-type. With decorate_queues the context is present.
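To illustrate why a context can go missing on such a thread hop, here is a minimal plain-JDK sketch. It is an analogy only, not Sleuth's or Reactor's actual implementation: `TRACE_ID` is a hypothetical stand-in for the tracing context, and `withTraceContext` is a hypothetical helper that wraps a task so the context captured on the submitting thread is restored on the worker thread. Wrapping every task handed to another thread is, conceptually, what an automatic decoration mode has to do, whereas a manual mode leaves that wrapping to the application.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ContextHopSketch {
    // Hypothetical stand-in for a thread-bound tracing context.
    public static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

    // Hypothetical helper: captures the caller's context and restores it
    // around the task when the task runs on another thread.
    public static <T> Callable<T> withTraceContext(Callable<T> task) {
        String captured = TRACE_ID.get(); // read on the submitting thread
        return () -> {
            String previous = TRACE_ID.get();
            TRACE_ID.set(captured);
            try {
                return task.call();
            } finally {
                // Put the worker thread back into its previous state.
                if (previous == null) {
                    TRACE_ID.remove();
                } else {
                    TRACE_ID.set(previous);
                }
            }
        };
    }

    // Runs a task on a separate worker thread and reports which context it saw.
    public static String readOnWorker(boolean decorate) throws Exception {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Callable<String> task = () -> String.valueOf(TRACE_ID.get());
            Callable<String> toRun = decorate ? withTraceContext(task) : task;
            return worker.submit(toRun).get();
        } finally {
            worker.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        TRACE_ID.set("trace-123");
        System.out.println("undecorated: " + readOnWorker(false)); // prints "undecorated: null"
        System.out.println("decorated: " + readOnWorker(true));    // prints "decorated: trace-123"
    }
}
```

The sketch only shows the propagation mechanism. It says nothing about the cost of decorating every queued task under load, which is what this issue is about.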
Other relevant library versions (e.g. netty, ...): Spring Boot 2.6.4, Spring Cloud 2021.0.1
JVM version (java -version): 11.0.16
OS and version (uname -a): Docker image ubi8/openjdk-11