Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Infinite loop when using a cache-and-network fetchPolicy #7436

Closed
alinpetrusca opened this issue Dec 9, 2020 · 24 comments
Closed

Infinite loop when using a cache-and-network fetchPolicy #7436

alinpetrusca opened this issue Dec 9, 2020 · 24 comments

Comments

@alinpetrusca
Copy link

alinpetrusca commented Dec 9, 2020

Intended outcome:
I have a component that retrieves some data using useQuery and renders a chart. Until now I used these options:

fetchPolicy: 'cache-and-network',
nextFetchPolicy: 'cache-first'

But now I realized that I have some data that may change and I need it to be up-to-date. I still want to use the cache, so I think that cache-and-network fetchPolicy would be the best in this case.

Actual outcome:
When I change the fetchPolicy to cache-and-network or network-only I end up having an infinite loop. This doesn't happen when I use no-cache.

This is how the component looks like:

export const GET_CHART_DATA = gql`
  query getChartData(
    $year: Int!
    $previousYear: Int!
    $currentYearStartDate: Date
    $currentYearEndDate: Date
    $previousYearStartDate: Date
    $previousYearEndDate: Date
    $product: String
    $channel: String
  ) {
    currentYearData: reportings(
      year: $year
      service: "OTHERS"
      date_Gte: $currentYearStartDate
      date_Lte: $currentYearEndDate
      productType_In: $product
      channelType_In: $channel
      dateType: "VISITS"
    ) {
      edges {
        node {
          date
          value
        }
      }
    }

    previousYearData: reportings(
      year: $previousYear
      service: "OTHERS"
      date_Gte: $previousYearStartDate
      date_Lte: $previousYearEndDate
      productType_In: $product
      channelType_In: $channel
      dateType: "VISITS"
    ) {
      edges {
        node {
          date
          value
        }
      }
    }

    budgetData(
      year: $year
      type: "OTHERS"
      date_Gte: $currentYearStartDate
      date_Lte: $currentYearEndDate
    ) {
      edges {
        node {
          date
          value
        }
      }
    }
  }
`;

const ChartContainer = ({ dateRangeVariables, filters }) => {
  const { loading, data, error } = useQuery(GET_CHART_DATA, {
    variables: {
      ...dateRangeVariables,
      product: filters.product.map(({ value }) => value).join(','),
      channel: filters.channel.map(({ value }) => value).join(',')
    },
    fetchPolicy: 'cache-and-network'
  });

  if (loading) {
    return <div>Loading</div>;
  }

  if (error) {
    return <div>Error</div>;
  }

  // structure data for chart...

  return <div>Chart</div>;
};

Versions
@apollo/client 3.2.0

I assume this is a cache issue, but I'm not sure if it's my fault or not and I don't know how to fix it.
Any ideas?

Thank you for your time!

Edit: I also think that defaultOptions is not working correctly. I have these options set when I create the apollo client instance:

defaultOptions: {
    query: {
      fetchPolicy: 'cache-and-network',
      nextFetchPolicy: 'cache-first'
    }
  }

First time when I render a component that queries some data I see the network request. If I switch to another view and then come back, I see no request (it's retrieved from the cache - cache-first?). Anyway, if I use fetchPolicy: 'cache-and-network' option inside useQuery I always see the request at component mount. Is this the expected behavior ?

@benjamn
Copy link
Member

benjamn commented Dec 9, 2020

@Alin13 It looks like these data are paginated using a Relay-style connection/edges API. Since you didn't mention anything about your typePolicies or field policies, I'm going to recommend you read our new documentation about pagination before proceeding, because the two subqueries are almost certainly clobbering each other's data in the cache. You can either keep their data totally separate with keyArgs, or you can use a field policy to make sure their data ends up merged together in a logical way. I know that probably doesn't make sense immediately, but you'll find a full explanation in the docs.

A few side notes:

  • The defaultOptions.query configuration does not apply to useQuery, which uses client.watchQuery behind the scenes. You probably want defaultOptions.watchQuery instead.
  • It's risky to put nextFetchPolicy in defaultOptions, since it applies to all queries. If you want to do this safely, I strongly recommend using a function for nextFetchPolicy: Improvements to options.nextFetchPolicy. #6893.

@alinpetrusca
Copy link
Author

alinpetrusca commented Dec 10, 2020

Well, I don't really need the pagination here, so I assume that if I remove the Relay structure it will work as expected.

I never used any type or field policies and it's a little hard to understand exactly what needs to be done there, but as I understand, the cache keys are generated using the name of the field plus the serialized arguments. In my case I have different year and dates variables, so I expected that there will be no conflict between them.

I'll try to remove the relay structure and I'll read the docs again.

Thank you!

Edit: @benjamn I changed the query so that I only have previousYearData subquery and I also removed some variables / filters:

query getChartData($previousYear: Int!, $previousYearEndDate: Date) {
  previousYearData: nausicaaReportings(year: $previousYear, service: "OTHERS", date_Lte: $previousYearEndDate, dateType: "VISITS") {
    edges {
      node {
        date
        turnover
      }
    }
  }
}

and these are the variables of the query:

{
    previousYear: 2019,
    previousYearEndDate: '2019-12-11'
  }

and I still have an infinite loop. So, I think is not related to the relay structure, but I'll try to remove it also.

Can it be related to the size of the response (because it's pretty big)?

Edit2: yes, I think this is related to the actual size of the response. If I limit the results to 100 entries everything works normally. Any ideas how to fix this?

@martinjlowm
Copy link

Can it be related to the size of the response (because it's pretty big)?

Edit2: yes, I think this is related to the actual size of the response. If I limit the results to 100 entries everything works normally. Any ideas how to fix this?

That is an interesting find! The places We see infinite loops are with big responses as well, i.e. metric data. For such types/resolvers we've set keyFields to false and disabled merging in the cache configuration. I don't think it has any influence though.

@alinpetrusca
Copy link
Author

alinpetrusca commented Dec 11, 2020

I managed to fix this by reducing the size of the response (we had over 30000 results initially, but we made some aggregations on the backend side and currently have ~500 results). But, I think this is still an issue with big responses. From my point of view, it doesn't make sense to have keyArgs or other configurations if by default all the arguments + the name of the field are used to cache the values.

For example, I have those two queries:

export const GET_DATA = gql`
  query getData($year: Int!, $currentYearStartDate: Date, $currentYearEndDate: Date) {
    currYearWebData: reportingsAggregations(
      year: $year
      date_Gte: $currentYearStartDate
      date_Lte: $currentYearEndDate
      channelType: "WEB"
      dateType: "DATE_TYPE_1"
    ) {
      value
    }
  }
`;

and this

export const GET_DATA = gql`
  query getData($year: Int!, $currentYearStartDate: Date, $currentYearEndDate: Date) {
    currYearWebData: reportingsAggregations(
      year: $year
      date_Gte: $currentYearStartDate
      date_Lte: $currentYearEndDate
      channelType: "WEB"
      dateType: "DATE_TYPE_2"
    ) {
      value
    }
  }
`;

The first query uses cache-and-network fetchPolicy, the second uses the default (cache-first). We can clearly see that dateType filter is different in each query. These two queries are triggered in the same time: I get the result from the first one, I get the result from the second one and after this the first one gets retriggered. Why? How to avoid this extra call?

@benjamn
Copy link
Member

benjamn commented Dec 11, 2020

@Alin13 Do you have any field policies (like keyArgs) configured for the Query.reportingsAggregations field, or are you using the default keyArgs behavior (include all arguments, sorted by key)?

@alinpetrusca
Copy link
Author

@benjamn No extra configuration, only the default.

@brainkim brainkim self-assigned this Apr 5, 2021
@nkahnfr
Copy link

nkahnfr commented Apr 12, 2021

Hi,

I am encountering the same issue with an infinite loop when a query response contains a lot of data.
I think that I reproduced the bug using your error template: nkahnfr/react-apollo-error-template@6da14ed.
You can change nbItems to a lower value (30k) and then everything is ok.

Let me know if you need more details.
Thanks for the great job and for your help.

@brainkim brainkim added 🚧 in-triage Issue currently being triaged ✔ confirmed and removed 🚧 in-triage Issue currently being triaged labels Apr 12, 2021
@brainkim
Copy link
Contributor

brainkim commented Apr 12, 2021

So the exact value for which this infinite loop starts happening is 32767. Suspiciously, this is equal to 2^15 - 1. Interesting, huh? Thanks for the reproduction @nkahnfr! I am investigating.

@brainkim
Copy link
Contributor

Related: #6888

@brainkim
Copy link
Contributor

One note for people currently struggling with this issue. It is likely fixed in 3.4 thanks to certain optimizations with regard to canonicalization of InMemoryCache result objects (#7439). I will have a larger post-mortem on this issue by end of day today, but in the meantime, please try the 3.4 beta and see if that helps.

@brainkim
Copy link
Contributor

brainkim commented Apr 16, 2021

Right so the TL;DR is that this issue is “fixed” in 3.4 and you should jump on the beta/rc when you get the chance.

So what’s happening:
As it turns out, when we read from the cache, we cache individual reads by selectionSet, ref and variables, as a layer of caching on top of the actual cache. We use cache hits to determine whether or not we should send the query to the server a second time, causing the infinite loop. This is done in QueryInfo.setDiff().

Why is this happening for large-ish values?

The optimism library, on which this library depends, has a default cache limit of 2 ^ 16. Exceeding this limit is what kicks off the loop.

Why is this happening for nested queries only?

The nested query stuff is a red herring. The reality is that the optimism caching is based on query, not just normalized entity, so the reason the array size limit was reduced to 2 ^ 15 - 1 is that we cached each item in the array for the parent query, cached each item in the array for the child query, and then there’s an extra cache value for the parent query itself, which gets us back to 2 ^ 16. You can negatively confirm this just by having the child query alone but with 2^16 values in the array. https://codesandbox.io/s/cache-and-network-reproduction-216-y0r17

Why does cache-and-network rely on referential equality?

This is the actual tough question, because it gets into the philosophy of fetch policies. My guess of what I think everyone is struggling with, is that we expect network request to only happen once per call, but really the “call” is a useQuery() hook and we’re in the world of React where ceaseless re-execution is the norm. So it’s tough to say on the apollo client side if we’re actually making another call, so for Apollo we fall back on this tenuous referential equality check here. How it actually triggers another request from the link is a more complicated question involving things like “reobservers” that I spent a bit of yesterday looking into but still don’t have the exact picture on.

In any case, this is unactionable because it’s likely fixed in 3.4, but I’m keeping an eye on it.

@chillyistkult
Copy link

I tried with 3.4.0-rc.2 and still see this issue. Is it confirmed that it should be fixed there @brainkim?

@brainkim
Copy link
Contributor

brainkim commented Jun 4, 2021

@chillyistkult Just checked against 3.4.0-rc.3 and it seems to still be fixed. If you have a reproduction for the behavior you’re seeing I am always happy to take a look!

@martinjlowm
Copy link

We upgraded from rc.0 to rc.6 today and noticed this pattern - I have yet to investigate if it was indeed this upgrade that caused it. I reverted our deployment and it started to settle back to normal after some time (once our users refreshed their page).

Screenshot 2021-06-11 at 20 11 56

@hwillson
Copy link
Member

@martinjlowm are there any additional details you can provide to help us determine if this is an Apollo Client issue? What does invocations mean in this context? Were you able to investigate further to determine if AC is the cause here?

@martinjlowm
Copy link

martinjlowm commented Jun 14, 2021

Hi @hwillson ,

I did have a closer look - I believe this was an effect of a thrown invariant error for some, but not something that applied for all clients. At least, it was easy for me to replicate if an API response didn't include any data (4XX errors). I saw:

  9: {
    file: "@apollo/client/cache/inmemory/writeToStore.js",
    node: new InvariantError("Missing field '" + resultFieldKey + "' in " + JSON.stringify(result, null, 2).substring(0, 100))
  },

and immediately after, an unrelated query started to go crazy.

Oh, and the invocations are API requests over a period of 5 minutes per data point.

I'll try to see if I can gather more information and perhaps even isolate it to a particular RC bump.

@martinjlowm
Copy link

martinjlowm commented Jun 14, 2021

  • rc.0: ✅
  • rc.1: ✅
  • rc.2: ❌
  • ...

I think perhaps it's this line: https://github.com/apollographql/apollo-client/compare/v3.4.0-rc.1...v3.4.0-rc.2#diff-5fb8ad16cfbcb51bc035d32b5963734561c440b48417a1c2dbf80a16098be67bR343 - data may be null here.

GitHub eats the pound in the link :<

Right off the bat, I cannot say why the client would attempt to keep refetching a triggered query from this though.

@hwillson
Copy link
Member

This is super helpful @martinjlowm - thanks for this!

@martinjlowm
Copy link

martinjlowm commented Jun 14, 2021

I managed to replicate some infinite loop on rc.6. I tried setting a couple of breakpoints just to see how the logic behaved. The variables to the query of interest does not change (same reference). Hope it's helpful - this is without any invariant errors.

You can see the red bars switching between two states as if there were two queries that raced one another.

https://www.icloud.com/iclouddrive/0H_M5_zserYxCERARXXiiIMRg#Screen_Recording_2021-06-14_at_22.56

I can confirm downgrading to rc.0 resolved it for us. ~1800 I deployed rc.6 again and a couple hours later, around midnight, I reverted just @apollo/client to rc.0 and that is where the drop of requests is. This one is CloudFront metrics of all resource requests in general.

Screenshot 2021-06-15 at 06 51 48

@hwillson hwillson added this to the June 2021 milestone Jun 15, 2021
benjamn added a commit that referenced this issue Jun 22, 2021
@sgentile
Copy link

sgentile commented Jan 6, 2022

We get infinite looping as well and the global overrides are still broken

@sgentile
Copy link

sgentile commented Jan 7, 2022

@Aliendreamer
Copy link

I can confirm this still happens 3.3.7 we had network-only and still it triggered.
Changing it to cache-first fix the problem, I implemented all the workarounds I know making errors null, using observables to refresh on auth error, returning explicitly nothing and still.
We haven't tried it with 3.5.7 we have to fix our typescript before switching to it as it throws us an error for void function.

@lohenyumnam
Copy link

Any solution to this problem guys ??

@UjinT34
Copy link

UjinT34 commented Sep 2, 2022

Try disabling babel-plugin-graphql-tag

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 15, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests