
Caching improvements? #2

Closed
m-mohr opened this issue Jul 26, 2021 · 8 comments

Comments

@m-mohr
Member

m-mohr commented Jul 26, 2021

Right now the cache TTL seems to be 5 minutes, which means that (due to the low request frequency) almost every request has to refresh the cache, which leads to long loading times on the website and in the clients. For example, the Web Editor takes roughly 5 seconds to connect without a (server-side) cache hit and under a second with one.

A 5-minute cache TTL seems pretty low; do we really expect metadata (collections, processes, file formats, ...) to change that frequently? 60 minutes or even a day could be reasonable, too. I'd then suggest refreshing the data via a cron job and always returning cached data to the user, so that loading times are consistent.

Origin: https://github.com/openEOPlatform/architecture-docs/issues/22#issuecomment-886254097
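
For illustration, a minimal sketch of that cron-refresh idea (the function names and cache keys below are hypothetical, not part of the aggregator code): a background job periodically repopulates the cache, and request handlers only ever read from it, so response times stay flat even when a refresh is slow or fails.

```python
import threading
import time

# Hypothetical fetch functions standing in for real requests to the back-ends.
def fetch_collections():
    return ["collection-a", "collection-b"]

def fetch_processes():
    return ["load_collection", "reduce_dimension"]

FETCHERS = {"collections": fetch_collections, "processes": fetch_processes}
CACHE = {}  # key -> last successfully fetched payload

def refresh_metadata():
    """Refresh job: fetch fresh metadata, only replacing cache entries on success."""
    for key, fetch in FETCHERS.items():
        try:
            CACHE[key] = fetch()
        except Exception:
            pass  # keep serving the previous entry if a back-end is down

def start_refresh_loop(interval_seconds=3600):
    """Poor man's cron: refresh in a background thread, e.g. hourly."""
    def loop():
        while True:
            refresh_metadata()
            time.sleep(interval_seconds)
    threading.Thread(target=loop, daemon=True).start()

def get_collections():
    # Request handlers never refresh anything themselves; they always return
    # whatever is currently cached, so response latency stays consistent.
    return CACHE.get("collections", [])
```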

@soxofaan
Member

That minute-level cache TTL was just for initial development purposes.

A TTL on the order of hours indeed makes more sense now that things are getting more concrete.

@m-mohr
Member Author

m-mohr commented Sep 1, 2021

Recently some back-ends have had issues, and those were fed through to the aggregator, which sometimes responded with errors or timeouts.

So in addition to the cron job proposal above, I think it's also a good idea to only clear the cache once a successful request has been made (or the cached data is very outdated), so that people can still retrieve metadata even while a back-end is temporarily offline. Metadata requests should always be delivered in a second or less, instead of occasionally taking up to a minute as they do right now.

soxofaan self-assigned this Sep 1, 2021
@soxofaan
Member

A couple of ideas to improve the caching in the aggregator:

  • Find a way to flush all or a subset of the caches.
    • The easy way would be through some custom HTTP endpoint. However: at the moment caching is done in memory and there are already multiple (gunicorn) workers in play (and we might scale up to a load-balanced setup on multiple machines), so a single "flush cache" HTTP request will not work.
    • Centralize the cache in memcached, redis, ... to allow easy cache flushing across multiple instances/workers.
  • Smarter caching, to handle temporary glitches better (see the sketch below):
    • If a cache entry is outdated: keep it anyway if it cannot be updated yet due to a hard failure.
    • If there is a minor failure in one of the back-ends: cache the result with a shorter TTL than the default.
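
A minimal sketch of that "smarter caching" idea (illustrative names like `SOFT_TTL` and `get_cached`, not the aggregator's actual API): entries carry a soft TTL, and when a refresh fails the stale value is kept and served instead of propagating the back-end error.

```python
import time

SOFT_TTL = 60 * 60   # normal refresh interval, e.g. 1 hour
ERROR_TTL = 5 * 60   # retry sooner after a failed refresh

_cache = {}  # key -> (value, expires_at)

def get_cached(key, fetch):
    """Return a cached value; refresh when the soft TTL has expired, but fall
    back to the stale value if the refresh fails (back-end glitch, timeout, ...)."""
    now = time.time()
    entry = _cache.get(key)
    if entry and now < entry[1]:
        return entry[0]
    try:
        value = fetch()
        _cache[key] = (value, now + SOFT_TTL)
        return value
    except Exception:
        if entry:
            # Refresh failed: keep serving the outdated entry and retry soon.
            _cache[key] = (entry[0], now + ERROR_TTL)
            return entry[0]
        raise  # nothing cached yet, so the error has to surface
```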

@m-mohr
Member Author

m-mohr commented Sep 27, 2021

This is basically what we already do in the openEO Hub, although that is all JS with a MongoDB in the background and daily crawling via a cron job. There we have logic for exactly these cases: the data in the DB is not cleared until new data successfully comes in, or until there has been a certain number of consecutive failures. So if you need some insights into that, feel free to contact @christophfriedrich.

@soxofaan
Member

Note: under #15 (83cfb7b), the number of gunicorn workers was increased to 10, which means that there are currently 10 separate processes, each with its own in-memory cache (containing the same things). Sharing the cache would be better for performance (right now it can take long to warm up all the caches) and consistency (different workers might serve different cached data).
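
For illustration (not what the aggregator ended up doing; ZooKeeper-based caching was merged later in this thread), sharing the cache across gunicorn workers could look roughly like this with redis-py, where all worker processes read and write the same keys:

```python
import json

import redis

# One Redis instance shared by all gunicorn workers (and machines behind a
# load balancer), so an entry written by one worker is visible to all of them.
client = redis.Redis(host="localhost", port=6379)

def get_metadata(key, fetch, ttl=3600):
    cached = client.get(key)
    if cached is not None:
        return json.loads(cached)
    value = fetch()
    client.setex(key, ttl, json.dumps(value))
    return value
```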

@soxofaan
Member

soxofaan commented Jun 3, 2022

Another idea to take into account: do the caching at the proxy/load-balancer level instead of in the Flask app itself.
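
One possible way to enable that (a sketch, not how the aggregator actually does it): have the Flask app emit `Cache-Control` headers on metadata endpoints, so a caching reverse proxy (nginx, Varnish, ...) can store the responses without the app caching anything itself. The endpoint and TTL below are illustrative.

```python
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/collections")
def collections():
    # Illustrative payload; in reality this would be aggregated from the back-ends.
    resp = jsonify({"collections": []})
    # Let a caching reverse proxy store the response for an hour instead of
    # caching it inside the Flask app.
    resp.cache_control.public = True
    resp.cache_control.max_age = 3600
    return resp
```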

soxofaan added a commit that referenced this issue Aug 5, 2022
soxofaan added a commit that referenced this issue Aug 8, 2022
soxofaan added a commit that referenced this issue Aug 8, 2022
For most unit tests we want no caching or simple dict based caching
soxofaan added a commit that referenced this issue Aug 10, 2022
start using memoizers in MultiBackendConnection and AggregatorBackendImplementation
refactor memoizers some more to support this properly
improve test coverage
soxofaan added a commit that referenced this issue Aug 10, 2022
soxofaan added several commits that referenced this issue Sep 19, 2022
@soxofaan
Member

merged initial usage of zookeeper based caching in 99332f7

soxofaan added a commit that referenced this issue Sep 20, 2022
soxofaan added a commit that referenced this issue Sep 21, 2022
caching JsonSerde needed support for serialization of custom classes (`_InternalCollectionMetadata` in this case)
soxofaan added a commit that referenced this issue Sep 21, 2022
had to add additional gzip'ing of json because process registry payload of process metadata is too large for default zookeeper limits
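
For context, a minimal sketch of that serde step (illustrative names, not the aggregator's actual `JsonSerde`): JSON-encode the payload and gzip it so that large metadata, such as the process registry, stays under ZooKeeper's default znode size limit (around 1 MB), then reverse the steps on read.

```python
import gzip
import json

def serialize(payload: dict) -> bytes:
    """JSON-encode and gzip a payload so that large metadata fits in a znode."""
    return gzip.compress(json.dumps(payload).encode("utf8"))

def deserialize(data: bytes) -> dict:
    return json.loads(gzip.decompress(data).decode("utf8"))

# Hypothetical round trip with a kazoo ZooKeeper client:
#   zk.create("/cache/processes", serialize({"processes": []}), makepath=True)
#   payload = deserialize(zk.get("/cache/processes")[0])
```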
@soxofaan
Member

I think the most important caching issue listed here was the lack of a shared cache between all the workers, which made cache misses very frequent in practice.
I have now added a shared cache (through ZooKeeper at the moment), which should bring a serious caching performance improvement.

There are some more ideas left here, but I'd prefer to close this general issue and move the remaining ideas to separate tickets for further discussion:
