Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Option to use stale cache entry over erroring out when schema cache load fails #3807

Closed
gr8routdoors opened this issue Oct 12, 2023 · 4 comments
Assignees
Labels
type/enhancement New feature or request

Comments

@gr8routdoors
Copy link
Contributor

gr8routdoors commented Oct 12, 2023

Feature or Problem Description

When running apicurio in a production envioronment, we would prefer to use a stale schema than error out entirely when trying to refresh the schema (i.e. intermittent network errors).

Proposed Solution

Add a to ERCache.faultTolerantRefresh option that instead of erroring out when a load error occurs, will log the exception and return the old value. Note that this should work alongside retries. Update AbstractSchemaResolver and so it can be configured to use this type of error handling.

This will be configured in DefaultSchemaResolver with the following boolean that defaults to false to preserve the existing behavior: apicurio.registry.fault-tolerant-refresh

Additional Context

I considered a design where this information gets propagated to the caller and the caller decides, but that (1) seems to be against the design taken in the retry functionality, and (2) would require a contract change to ERCache significantly increasing scope.

I'm happy to implement this functionality ASAP. Just wanted to align on a design @EricWittmann .

@gr8routdoors gr8routdoors added the type/enhancement New feature or request label Oct 12, 2023
@apicurio-bot
Copy link

apicurio-bot bot commented Oct 12, 2023

Thank you for reporting an issue!

Pinging @jsenko to respond or triage.

gr8routdoors pushed a commit to gr8routdoors/apicurio-registry that referenced this issue Oct 12, 2023
Adds the notion of a fault tolerance in load for production environments where it's better to use a stale cache value than die when a cache entry refresh fails.  See issue Apicurio#3807 for details.
gr8routdoors pushed a commit to gr8routdoors/apicurio-registry that referenced this issue Oct 12, 2023
Adds the notion of a fault tolerance in refresh for production environments where it's better to use a stale cache value than die when a cache entry refresh fails.  See issue Apicurio#3807 for details.
@gr8routdoors
Copy link
Contributor Author

I'd like to go a little further with this design and add an option for skipping retries when we already have a cache entry , in order to optimize performance for our production use case. I'll pile that on as a second commit so that it can be rolled back if there isn't alignment on this functionality.

@gr8routdoors
Copy link
Contributor Author

Nevermind, I just read that ERCache only retries on RateLimitedClientException, so adding skipRetryOnExistingValue will be a bit more than a simple change. I don't want to risk breaking other things that depend on its current behavior so I'll leave that for another PR

gr8routdoors pushed a commit to gr8routdoors/apicurio-registry that referenced this issue Oct 16, 2023
Adds the notion of a fault tolerance in refresh for production environments where it's better to use a stale cache value than die when a cache entry refresh fails.  See issue Apicurio#3807 for details.
carlesarnal pushed a commit that referenced this issue Oct 17, 2023
Adds the notion of a fault tolerance in refresh for production environments where it's better to use a stale cache value than die when a cache entry refresh fails.  See issue #3807 for details.

Co-authored-by: Devon Berry <[email protected]>
gr8routdoors added a commit to gr8routdoors/apicurio-registry that referenced this issue Oct 17, 2023
… (Apicurio#3823)

Adds the notion of a fault tolerance in refresh for production environments where it's better to use a stale cache value than die when a cache entry refresh fails.  See issue Apicurio#3807 for details.

Co-authored-by: Devon Berry <[email protected]>
carlesarnal pushed a commit that referenced this issue Oct 20, 2023
…a updates (#3839)

* feat(schema-cache): ERCache.configureFaultTolerantRefresh #3807 (#3823)

Adds the notion of a fault tolerance in refresh for production environments where it's better to use a stale cache value than die when a cache entry refresh fails.  See issue #3807 for details.

Co-authored-by: Devon Berry <[email protected]>

* fix(schema-resolver): caching of latest artifacts #3834

Caching of artifacts with no version (e.g. latest) did not work because reindex would use the new key that has the artifact version that was found in the lookup.  This code changes the default behavior to index both the artifact with its version and the latest/null version.  It also exposes a configuration property (`apicurio.registry.cache-latest`) where this behavior can be disabled for use cases where caching of latest is not desired.  See #3824 for details.

---------

Co-authored-by: Devon Berry <[email protected]>
@carlesarnal
Copy link
Member

Assuming the scope of this issue is done on the linked PR.

@carlesarnal carlesarnal moved this to Done in Registry 2.5 Nov 3, 2023
@carlesarnal carlesarnal self-assigned this Nov 7, 2023
@carlesarnal carlesarnal moved this from Done to Released in Registry 2.5 Apr 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/enhancement New feature or request
Projects
No open projects
Status: Released
Development

No branches or pull requests

2 participants