This repository has been archived by the owner on May 23, 2024. It is now read-only.

Slow response times #178

Open
samueleresca opened this issue Nov 12, 2020 · 7 comments

Comments

@samueleresca
Contributor

samueleresca commented Nov 12, 2020

Has something changed recently that may affect the response time of an endpoint?

I was using the Docker image for TF 2.1, built around 5 months ago (this commit). The response time was around 100 ms (ModelLatency average on CloudWatch). I rebuilt the serving Docker image yesterday and am running it with the same endpoint config and the same tar.gz model, but now the response time is around 1000 ms (ModelLatency average on CloudWatch). I'm also noticing a new record in the logs:

| Timestamp | Message |
| -- | -- |
| 2020-11-12T11:48:57.327+00:00 | INFO:tfs_utils:sagemaker tfs attributes: |
| 2020-11-12T11:48:57.327+00:00 | {} |

Has something changed that could have impacted performance at this level?

@henryhu666

We also noticed that the response time is quite slow (~1 second at the 95th percentile of ModelLatency), even on a very simple model. We were using the 763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:2.3.0-cpu image in multi-model mode.

@eldonaldo

Same here, any advice?

@samueleresca
Contributor Author

samueleresca commented Feb 22, 2021

@eldonaldo If you don't need multi-model capabilities, I would build the docker image from this commit.

@eldonaldo

@samueleresca Many thanks, that already helped! It reduced the latency by 1s.

If anyone has further hints on how to reduce the latency, please let me know.

@samueleresca
Contributor Author

@eldonaldo If your endpoint is under load and receives a lot of requests, you can also try tweaking the number of gunicorn workers and threads; see this.
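
For anyone who wants a starting point for those two values, here is a minimal sketch. It assumes the container reads its gunicorn settings from environment variables named `SAGEMAKER_GUNICORN_WORKERS` and `SAGEMAKER_GUNICORN_THREADS` (please verify the names against the container README for your image version). The helper name `suggested_gunicorn_settings` and the default of 2 threads per worker are hypothetical choices for illustration. The worker count uses the `(2 x cores) + 1` rule of thumb from the gunicorn documentation, which is only a first guess to refine under real load.

```python
import os


def suggested_gunicorn_settings(cpu_count=None, threads_per_worker=2):
    """Return a starting-point environment dict for gunicorn tuning.

    cpu_count: number of cores on the instance; detected locally if omitted.
    threads_per_worker: illustrative default, not a recommendation.
    """
    cores = cpu_count or os.cpu_count() or 1
    # Gunicorn's docs suggest (2 x cores) + 1 workers as a starting point.
    workers = 2 * cores + 1
    return {
        # Assumed variable names; check the container README before relying on them.
        "SAGEMAKER_GUNICORN_WORKERS": str(workers),
        "SAGEMAKER_GUNICORN_THREADS": str(threads_per_worker),
    }


if __name__ == "__main__":
    # Example for a 4-core instance type.
    print(suggested_gunicorn_settings(cpu_count=4))
```

The resulting dict would typically be passed as the container environment when the model is created (for example the `env` argument of a model in the SageMaker Python SDK). It is worth comparing ModelLatency before and after each change, since more workers also means more memory in use.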

@davesean

davesean commented Feb 23, 2021

@samueleresca Thanks a lot for your help! The issue was fixed with this commit (8ddbbdc). Additionally, the worker and thread settings worked perfectly when multiple requests come in.
