
Limit the number of report decodings that we do in parallel, in multitenant code #3256

Closed
bboreham opened this issue Jul 6, 2018 · 3 comments
Labels
performance Excessive resource usage and latency; usually a bug or chore

Comments


bboreham commented Jul 6, 2018

Currently we have this call sequence:

awsCollector.Report()
  awsCollector.getReports()
    S3Store/MemcacheClient.FetchReports()
      // fetch and decode all reports in parallel
  Merger.Merge()

When there are many reports, this is maximally bad for memory consumption: every report is decoded and held in memory before merging begins.

Since Merge() is now serial, we could return compressed blobs from FetchReports() and decode each one just before it is merged. It's probably more effective to keep some parallelism: fetch, decode, and merge subsets in parallel, then merge the outputs of those subsets.
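A minimal sketch of that subset idea, with hypothetical `Report`, `merge`, and `decode` stand-ins for Scope's actual types (not the real implementation):

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Report stands in for Scope's report type (hypothetical simplification).
type Report struct{ Keys []string }

// merge combines two reports; here it just unions and sorts keys.
func merge(a, b Report) Report {
	keys := append(append([]string{}, a.Keys...), b.Keys...)
	sort.Strings(keys)
	return Report{Keys: keys}
}

// decodeAndMergeSubsets splits the blobs across nWorkers goroutines.
// Each goroutine decodes its subset and merges as it goes, so at most
// nWorkers decoded-but-unmerged reports exist at any moment; the
// per-subset results are then merged serially.
func decodeAndMergeSubsets(blobs [][]byte, decode func([]byte) Report, nWorkers int) Report {
	partials := make([]Report, nWorkers)
	var wg sync.WaitGroup
	for w := 0; w < nWorkers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			for i := w; i < len(blobs); i += nWorkers {
				partials[w] = merge(partials[w], decode(blobs[i]))
			}
		}(w)
	}
	wg.Wait()
	out := Report{}
	for _, p := range partials {
		out = merge(out, p)
	}
	return out
}

func main() {
	blobs := [][]byte{[]byte("a"), []byte("b"), []byte("c"), []byte("d")}
	decode := func(b []byte) Report { return Report{Keys: []string{string(b)}} }
	fmt.Println(decodeAndMergeSubsets(blobs, decode, 2).Keys) // prints [a b c d]
}
```

The trade-off is that peak memory scales with the number of workers rather than the number of reports, at the cost of a final serial merge over the partial results.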

@bboreham bboreham added the performance Excessive resource usage and latency; usually a bug or chore label Jul 6, 2018
@bboreham
Collaborator Author

I took a look at this, and it's not simple to get the benefit: we cache all the individual reports, so we keep them all in memory anyway.

Maybe decode and merge reports in 5-second groups then cache the merge of those groups, instead of caching the individual reports?


rade commented Dec 24, 2018

Maybe [...] merge reports in 5-second groups then cache the merge of those groups, instead of caching the individual reports?

That would be similar to what the ordinary (non-AWS) Collector is doing.

@bboreham
Collaborator Author

This has been fixed by #3671. Decoding is still unlimited in the sense that if the app receives a large number of requests in parallel it will start to handle them all, but that is a different feature.
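The bounded-parallelism idea in the issue title is commonly implemented in Go as a counting semaphore built from a buffered channel. A minimal sketch, with `decode` as a hypothetical stand-in for report decoding:

```go
package main

import (
	"fmt"
	"sync"
)

// sem is a counting semaphore: a buffered channel whose capacity is
// the maximum number of decodes allowed to run concurrently.
type sem chan struct{}

func (s sem) acquire() { s <- struct{}{} }
func (s sem) release() { <-s }

// decodeAll decodes every blob, but never more than cap(s) at a time.
func decodeAll(blobs []string, decode func(string) string, s sem) []string {
	out := make([]string, len(blobs))
	var wg sync.WaitGroup
	for i, b := range blobs {
		wg.Add(1)
		go func(i int, b string) {
			defer wg.Done()
			s.acquire()
			defer s.release()
			out[i] = decode(b)
		}(i, b)
	}
	wg.Wait()
	return out
}

func main() {
	s := make(sem, 2) // at most 2 decodes in flight
	got := decodeAll([]string{"a", "b", "c"}, func(b string) string { return b + "!" }, s)
	fmt.Println(got) // prints [a! b! c!]
}
```

Limiting at the request level (the "different feature" above) would apply the same pattern one layer up, around request handling rather than around decoding.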
