Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

missing oracles cause promise growth #9923

Closed
warner opened this issue Aug 19, 2024 · 2 comments
Closed

missing oracles cause promise growth #9923

warner opened this issue Aug 19, 2024 · 2 comments
Assignees
Labels
bug Something isn't working Inter-protocol Overarching Inter Protocol Vaults VaultFactor (née Treasury)

Comments

@warner
Copy link
Member

warner commented Aug 19, 2024

In https://github.com/Agoric/agoric-private/issues/169 we observed constant growth in the number of unresolved promises, in a testnet that was mainfork-cloned from our mainnet, but which lacked its own oracles, and thus was not supplied with price quotes. My analysis notes from that ticket:

My hunch here is that a price-feed oracle has not been re-attached after the upgrade yet, and each time the lockOraclePrices wakeup fires and sees this state, it repeats a bunch of work that should have only been done once. Like, it's attempting to create new QuoteNotifiers instead of re-using the ones created during the previous wakeup. I wonder if all these QuoteNotifiers are waiting for the same thing, and v67 and v69 will resolve a huge backlog of getUpdateSince promises the moment an oracle connects and sends in the first price.

I'd like to know how v67/v69 are reacting to the makeQuoteNotifier() call: are they looking up the supplied brand pair in a table? I'd think the calls should be idempotent, but the high kref number suggests that we're making up brand new ones for each request. And I see over 50 calls to makeQuoteNotifier() with the same arguments in this slogfile, spaced almost exactly 3600 seconds apart.

I think this is where we summon @Chris-Hibbert.

We don't expect to be in this situation on a real network, at least not for very long. Oracles will do their job, prices will flow, auctions will happen. But:

  • 1: we should investigate to see if we might be in this state for a (hopefully short) interval after an upgrade which detaches the pricefeeds, until the oracles get reattached, and if we need to be worried about a big burst of activity when they finally reattach or submit their first quote or whatever
  • 2: if a denom ever fades away out of neglect, such that nobody is using it any more and doesn't care, it would be nice if chain sustainability isn't impacted by oracles failing to continue to submit prices. We should not default to constant resource growth, only avoided by constant submission of quotes; the system should be idle and stable by default.

The task is to understand exactly where these promises are coming from, what action whose lack is causing them to be generated, and then fix the code to prevent the long-term growth.

@warner warner added the bug Something isn't working label Aug 19, 2024
@warner warner added Inter-protocol Overarching Inter Protocol Vaults VaultFactor (née Treasury) labels Aug 19, 2024
@warner
Copy link
Member Author

warner commented Aug 30, 2024

Ok so vaultManager.js has this function named observeQuoteNotifier:

        observeQuoteNotifier() {
          const { state } = this;

          const { collateralBrand, collateralUnit, debtBrand, storageNode } =
            state;
          const ephemera = collateralEphemera(collateralBrand);

          const quoteNotifier = E(priceAuthority).makeQuoteNotifier(
            collateralUnit,
            debtBrand,
          );
          // @ts-expect-error XXX quotes
          ephemera.storedQuotesNotifier = makeStoredNotifier(
            // @ts-expect-error XXX quotes
            quoteNotifier,
            E(storageNode).makeChildNode('quotes'),
            marshaller,
          );
          trace(
            'helper.start() awaiting observe storedQuotesNotifier',
            collateralBrand,
          );
          // NB: upon restart, there may not be a price for a while. If manager
          // operations are permitted, ones that depend on price information
          // will throw. See https://github.com/Agoric/agoric-sdk/issues/4317
          const quoteWatcher = harden({
            onFulfilled(value) {
              ephemera.storedCollateralQuote = value;
            },
            onRejected() {
              // NOTE: drastic action, if the quoteNotifier fails, we don't know
              // the value of the asset, nor do we know how long we'll be in
              // ignorance. Best choice is to disable actions that require
              // prices and restart when we have a new price. If we restart the
              // notifier immediately, we'll trigger an infinite loop, so try
              // to restart each time we get a request.

              ephemera.storedCollateralQuote = null;
            },
          });
          void watchQuoteNotifier(quoteNotifier, quoteWatcher);
        },

The goal is to populate ephemera.storedCollateralQuote. Every time this function is called, it will always send a new makeQuoteNotifier() to the priceAuthority, and will always build a new makeStoredNotifier() around it, and will alway uses watchQuoteNotifier() to glue an ephemeral quoteWatcher to that, and that watcher is the one place where storedCollateralQuote can be filled.

This function is called from several places in vaultManager that need a stored quote, with a clause like:

          if (!storedCollateralQuote) {
            facets.helper.observeQuoteNotifier();
            throw Fail`maxDebtFor called before a collateral quote was available for ${collateralBrand}`;
          }

and at least one of those places can be triggered by an hourly timer wakeup.

Which means every timer wakeup will provoke this new set of objects and promises, until some QuoteNotifier finally reports either a price or an error. And something deeper in the process is unable to do either without a PushPrice coming in from off-chain, which isn't happening on the testnet.

shape of a fix

I think we need that const quoteNotifier = E(priceAuthority).makeQuoteNotifier() to have a wider scope.

We want observeQuoteNotifier() to not make a new request if there's already a request pending. And we need it to be reliable in the face of upgrade, and errors.

Maybe something that starts with:

if (!ephemera.quoteNotifier) {
  ephemera.quoteNotifier = E(priceAuthority).makeQuoteNotifier();
  ephemera.quoteNotifier.catch(_err => ephemera.quoteNotifier = undefined);
}

It should probably defer building the rest of the wrapping objects until we get a quote notifier back. The goal is to not ever build redundant objects, or make redundant requests.

relation to promise growth

Note that I still don't know exactly which promise is the one being accumulated. I know it's one of the getUpdateSince that gets sent to aQuoteNotifier, and I know that we're making new QuoteNotifiers each cycle, but I don't know if the getUpdateSinces are all being sent to some pre-existing QuoteNotifier, or if each one is going to a new QuoteNotifier.

@Chris-Hibbert
Copy link
Contributor

Resolved by #10668

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Inter-protocol Overarching Inter Protocol Vaults VaultFactor (née Treasury)
Projects
None yet
Development

No branches or pull requests

2 participants