Slow response times #178
We also noticed that the response time is quite slow (~1 second for the 95th percentile ModelLatency), even on a very simple model. We were using the
Same here, any advice?
@eldonaldo If you don't need multi-model capabilities, I would build the docker image from this commit. |
@samueleresca many thanks, that already helped! It reduced the latency by 1s. If anyone has further hints on how to reduce the latency, please let me know.
@eldonaldo If your endpoint is under load and receives a lot of requests, you can also try to tweak the number of Gunicorn workers and threads, see this
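For reference, a minimal sketch of how those knobs could be passed at deploy time with the SageMaker Python SDK. The environment variable names `SAGEMAKER_GUNICORN_WORKERS` / `SAGEMAKER_GUNICORN_THREADS`, the S3 path, and the IAM role below are assumptions, not confirmed settings; check the container's README for the variables it actually honours.

```python
# Hedged sketch: tuning the serving container's Gunicorn workers/threads
# via environment variables passed at deploy time. The variable names
# below are assumptions -- verify them against the container's README.
from sagemaker.tensorflow import TensorFlowModel

model = TensorFlowModel(
    model_data="s3://my-bucket/model.tar.gz",               # hypothetical model artifact
    role="arn:aws:iam::123456789012:role/MySageMakerRole",  # hypothetical execution role
    framework_version="2.1",
    env={
        "SAGEMAKER_GUNICORN_WORKERS": "4",  # assumed variable name
        "SAGEMAKER_GUNICORN_THREADS": "4",  # assumed variable name
    },
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.c5.xlarge",
)
```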
@samueleresca Thanks a lot for your help! The issue was fixed with this commit (8ddbbdc). Additionally, tuning the workers and threads worked perfectly when multiple requests come in.
Two fixes have been released
The two PRs have been merged:
Has something changed recently that may affect the response time of an endpoint?
I was using the docker image for TF 2.1 built around 5 months ago (this commit). The response time was around 100 ms (ModelLatency average on CloudWatch). I rebuilt the serving docker image yesterday and am running it with the same endpoint config and the same .tar.gz model, but now the response time is around 1000 ms (ModelLatency average on CloudWatch). I'm also noticing a new record in the logs:
Has something changed that could have impacted the performance at this level?
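To compare the two images on the same metric, the p95 ModelLatency can be pulled directly from CloudWatch. A minimal sketch with boto3, assuming a placeholder endpoint name `my-tf-endpoint` and the default `AllTraffic` variant; ModelLatency is reported in microseconds.

```python
# Sketch: fetch p95 ModelLatency for the endpoint over the last hour.
# Endpoint and variant names are placeholders.
import datetime

import boto3

cloudwatch = boto3.client("cloudwatch")
now = datetime.datetime.utcnow()

response = cloudwatch.get_metric_statistics(
    Namespace="AWS/SageMaker",
    MetricName="ModelLatency",
    Dimensions=[
        {"Name": "EndpointName", "Value": "my-tf-endpoint"},  # placeholder
        {"Name": "VariantName", "Value": "AllTraffic"},       # default variant name
    ],
    StartTime=now - datetime.timedelta(hours=1),
    EndTime=now,
    Period=300,
    ExtendedStatistics=["p95"],
)

for point in sorted(response["Datapoints"], key=lambda p: p["Timestamp"]):
    p95_ms = point["ExtendedStatistics"]["p95"] / 1000.0  # microseconds -> milliseconds
    print(point["Timestamp"], f"{p95_ms:.1f} ms")
```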