Option to use stale cache entry over erroring out when schema cache load fails #3807

gr8routdoors · 2023-10-12T13:42:58Z

Feature or Problem Description

When running Apicurio in a production environment, we would prefer to use a stale schema rather than error out entirely when a schema refresh fails (e.g. because of intermittent network errors).

Proposed Solution

Add a faultTolerantRefresh option to ERCache that, instead of erroring out when a load error occurs, logs the exception and returns the old value. Note that this should work alongside retries. Update AbstractSchemaResolver so it can be configured to use this type of error handling.

This will be configured in DefaultSchemaResolver with the following boolean property, which defaults to false to preserve the existing behavior: apicurio.registry.fault-tolerant-refresh
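To make the proposed behavior concrete, here is a minimal sketch of a cache that serves the stale value when a refresh fails. It is not the actual ERCache code: the class StaleTolerantCache, its get method, and the readFlag helper are hypothetical names made up for illustration, and retries are left out for brevity. Only the property key apicurio.registry.fault-tolerant-refresh comes from the proposal above.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Illustrative stand-in for a refreshing cache; NOT the real ERCache. */
class StaleTolerantCache<K, V> {

    /** A cached value together with the instant at which it expires. */
    private static final class Entry<T> {
        final T value;
        final Instant expiresAt;

        Entry(T value, Instant expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final Duration lifetime;
    private final boolean faultTolerantRefresh;

    StaleTolerantCache(Duration lifetime, boolean faultTolerantRefresh) {
        this.lifetime = lifetime;
        this.faultTolerantRefresh = faultTolerantRefresh;
    }

    /** Hypothetical helper: reads the flag from a config map, defaulting to false. */
    static boolean readFlag(Map<String, ?> config) {
        Object raw = config.get("apicurio.registry.fault-tolerant-refresh");
        return raw != null && Boolean.parseBoolean(raw.toString());
    }

    /** Returns the cached value, reloading it through the loader once it has expired. */
    V get(K key, Supplier<V> loader) {
        Entry<V> current = entries.get(key);
        Instant now = Instant.now();
        if (current != null && now.isBefore(current.expiresAt)) {
            return current.value; // entry is still fresh
        }
        try {
            V loaded = loader.get();
            entries.put(key, new Entry<>(loaded, now.plus(lifetime)));
            return loaded;
        } catch (RuntimeException e) {
            if (faultTolerantRefresh && current != null) {
                // Fault-tolerant mode: log the failure and keep serving the old value.
                System.err.println("Refresh failed for " + key + ", serving stale value: " + e);
                return current.value;
            }
            throw e; // default behavior: propagate the load error
        }
    }
}
```

In this sketch the stale entry keeps its old expiry, so the next call attempts the refresh again. With the flag off, or when no previous value exists, the load error still reaches the caller, which matches the intent of defaulting to false.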

Additional Context

I considered a design where this information gets propagated to the caller and the caller decides, but that (1) seems to go against the design taken in the retry functionality, and (2) would require a contract change to ERCache, significantly increasing scope.

I'm happy to implement this functionality ASAP. I just wanted to align on a design first, @EricWittmann.

apicurio-bot · 2023-10-12T13:43:01Z

Thank you for reporting an issue!

Pinging @jsenko to respond or triage.

Adds the notion of a fault tolerance in load for production environments where it's better to use a stale cache value than die when a cache entry refresh fails. See issue Apicurio#3807 for details.

Adds the notion of a fault tolerance in refresh for production environments where it's better to use a stale cache value than die when a cache entry refresh fails. See issue Apicurio#3807 for details.

gr8routdoors · 2023-10-12T16:55:34Z

I'd like to go a little further with this design and add an option for skipping retries when we already have a cache entry, in order to optimize performance for our production use case. I'll pile that on as a second commit so that it can be rolled back if there isn't alignment on this functionality.

gr8routdoors · 2023-10-12T17:24:05Z

Never mind, I just read that ERCache only retries on RateLimitedClientException, so adding skipRetryOnExistingValue will be a bit more than a simple change. I don't want to risk breaking other things that depend on its current behavior, so I'll leave that for another PR.

Adds the notion of a fault tolerance in refresh for production environments where it's better to use a stale cache value than die when a cache entry refresh fails. See issue #3807 for details. Co-authored-by: Devon Berry <[email protected]>

… (Apicurio#3823) Adds the notion of a fault tolerance in refresh for production environments where it's better to use a stale cache value than die when a cache entry refresh fails. See issue Apicurio#3807 for details. Co-authored-by: Devon Berry <[email protected]>

…a updates (#3839)

* feat(schema-cache): ERCache.configureFaultTolerantRefresh #3807 (#3823)

Adds the notion of a fault tolerance in refresh for production environments where it's better to use a stale cache value than die when a cache entry refresh fails. See issue #3807 for details.

Co-authored-by: Devon Berry <[email protected]>

* fix(schema-resolver): caching of latest artifacts #3834

Caching of artifacts with no version (e.g. latest) did not work because reindex would use the new key that has the artifact version that was found in the lookup. This code changes the default behavior to index both the artifact with its version and the latest/null version. It also exposes a configuration property (`apicurio.registry.cache-latest`) where this behavior can be disabled for use cases where caching of latest is not desired. See #3824 for details.

---------

Co-authored-by: Devon Berry <[email protected]>

carlesarnal · 2023-11-03T15:17:31Z

Assuming the scope of this issue is done on the linked PR.

gr8routdoors added the type/enhancement New feature or request label Oct 12, 2023

apicurio-bot bot added the triage/needs-triage label Oct 12, 2023

gr8routdoors mentioned this issue Oct 12, 2023

feat(schema-cache): ERCache.configureFaultTolerantRefresh #3807 #3809

Closed

gr8routdoors mentioned this issue Oct 16, 2023

feat(schema-cache): ERCache.configureFaultTolerantRefresh #3807 #3823

Merged

carlesarnal closed this as completed Nov 3, 2023

apicurio-bot bot removed the triage/needs-triage label Nov 3, 2023

carlesarnal added this to Registry 2.5 Nov 3, 2023

carlesarnal moved this to Done in Registry 2.5 Nov 3, 2023

carlesarnal self-assigned this Nov 7, 2023

carlesarnal moved this from Done to Released in Registry 2.5 Apr 25, 2024
