Registration pages can be 404 if query occurs during publish #7134

Open
joelverhagen opened this issue May 4, 2019 · 4 comments

joelverhagen (Member) commented May 4, 2019

This has been in the back of my mind for ages, and I've seen it occur a couple of times with a program I wrote that hits NuGet.org APIs.

If a package ID does not have its versions inlined in the registration index, the version metadata is in pages. For nuget.org, this occurs when a package has more than 128 versions. Clients then need to make an additional HTTP request to get each page. If a new version is being published, one or more pages may have their bounds change, which means the page URL changes. A client still holding the old URL can then get a 404.
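To make the extra request concrete, here is a minimal sketch of how a client might walk the index. It assumes the index lists pages under `items`, each with an `@id` URL, and that a page carries its own `items` only when it is inlined. The `load_pages` and `fetch_json` names and the `example.test` URL are made up for illustration; this is not the NuGet client code.

```python
from typing import Callable


def load_pages(index: dict, fetch_json: Callable[[str], dict]) -> list:
    """Return every registration page, fetching the ones not inlined in the index."""
    pages = []
    for page in index["items"]:
        if "items" in page:
            # Leaves are inlined: no extra request needed.
            pages.append(page)
        else:
            # Only the bounds are present: the page lives at its own URL.
            # This is the request that can 404 during a publish.
            pages.append(fetch_json(page["@id"]))
    return pages


# Usage with an in-memory stand-in for HTTP (hypothetical URL).
page_url = "https://example.test/registration/example.package/page/1.0.0/2.0.0.json"
blobs = {page_url: {"@id": page_url, "lower": "1.0.0", "upper": "2.0.0", "items": []}}
index = {"items": [{"@id": page_url, "lower": "1.0.0", "upper": "2.0.0"}]}
pages = load_pages(index, blobs.__getitem__)
```

The point is only that the page URL comes from the index, so anything that changes the URL between the two requests breaks the second one.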

In short, this is what can happen:

  1. User queries for registration index and sees page [N, M].
  2. Server side adds a new version V such that M < V
  3. Server changes page bounds to [N, V]
  4. Server deletes page [N, M].
  5. User queries for page [N, M] and encounters a 404.
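The five steps can be replayed against a toy in-memory server. This is a sketch under loud assumptions: a single page, integer versions instead of real NuGet versions, and a made-up `FakeRegistration` class and `example.test` URLs. It only shows the ordering problem, not how the real service writes blobs.

```python
class NotFound(Exception):
    """Stands in for an HTTP 404."""


class FakeRegistration:
    """In-memory registration with a single page; versions are ints for brevity."""

    def __init__(self, package_id, versions):
        self.index_url = f"https://example.test/registration/{package_id}/index.json"
        self._base = f"https://example.test/registration/{package_id}/page"
        self.blobs = {}
        self._write(sorted(versions))

    def _write(self, versions):
        # The page URL is derived from the page bounds, as described above.
        lower, upper = versions[0], versions[-1]
        url = f"{self._base}/{lower}/{upper}.json"
        page = {"@id": url, "lower": lower, "upper": upper}
        self.blobs[url] = dict(page, versions=versions)
        self.blobs[self.index_url] = {"items": [page]}
        return url

    def publish(self, version):
        old_url = self.blobs[self.index_url]["items"][0]["@id"]
        versions = sorted(self.blobs[old_url]["versions"] + [version])
        new_url = self._write(versions)  # steps 2-3: new bounds, new URL
        if new_url != old_url:
            del self.blobs[old_url]      # step 4: old page is deleted

    def get(self, url):
        if url not in self.blobs:
            raise NotFound(url)
        return self.blobs[url]


server = FakeRegistration("example.package", [1, 2, 3])
stale_url = server.get(server.index_url)["items"][0]["@id"]  # step 1
server.publish(4)                                            # steps 2-4
try:
    server.get(stale_url)                                    # step 5
    outcome = "found"
except NotFound:
    outcome = "404"
```

Here `outcome` ends up as `"404"` because the client is still holding the `[1, 3]` page URL after the server moved to `[1, 4]`.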

Looking at the client behavior, it looks like this might cause an InvalidDataException, but I would want to test that.

https://github.com/NuGet/NuGet.Client/blob/ba56e6913c73457a48906bca58a4e3e33dae1b15/src/NuGet.Core/NuGet.Protocol/DependencyInfo/RegistrationUtility.cs#L90
https://github.com/NuGet/NuGet.Client/blob/5d1af18e560b5f381626b5a7e9ce2acacb6211db/src/NuGet.Core/NuGet.ProjectModel/JsonUtility.cs#L32

I think this may be a poor user experience, but I need to investigate more. We could mitigate this by:

  1. Making the client retry when a page fetch returns a 404.
  2. Making the old pages last a while... maybe forever.
  3. Making page URLs not contain the versions and just be indexes.
    1. This can lead to bait-and-switch of what versions are in a page.
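Option 1 could look roughly like the sketch below: when a page fetch 404s, throw away the index and start over from a fresh one. The `load_pages_with_retry` function, the `NotFound` exception, and the `fetch_json` callable are hypothetical names, not the NuGet client API, and the sketch assumes the second index read is not served from a stale HTTP cache.

```python
class NotFound(Exception):
    """Stands in for an HTTP 404 raised by the fetch function."""


def load_pages_with_retry(index_url, fetch_json, max_attempts=3):
    """Load all pages; on a page 404, re-read the index and start over."""
    for _ in range(max_attempts):
        index = fetch_json(index_url)
        try:
            return [
                page if "items" in page else fetch_json(page["@id"])
                for page in index["items"]
            ]
        except NotFound:
            # The index we hold is stale. Assumption: the next fetch_json call
            # for the index bypasses any HTTP cache, otherwise we would just
            # loop on the same stale page URLs.
            continue
    raise NotFound(f"pages for {index_url} kept changing after {max_attempts} attempts")
```

Restarting from the index avoids having to guess the new page URL, at the cost of one more index request per retry.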

We can determine how frequently this is occurring by looking at CDN logs, I think. An interesting time to look would be the point in time when the most popular package with more than 128 versions had its latest version processed by Catalog2Registration.
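As a rough sketch of that log analysis, assuming the CDN logs have already been parsed into `(status, url)` pairs and that page URLs end in `/{id}/page/{lower}/{upper}.json`. Both are assumptions for illustration, and `count_page_404s` is a made-up helper, not an existing tool.

```python
import re
from collections import Counter

# Assumed page URL shape: .../{package id}/page/{lower}/{upper}.json
PAGE_URL = re.compile(r"/([^/]+)/page/[^/]+/[^/]+\.json$")


def count_page_404s(records):
    """Count 404 responses on registration page URLs, grouped by package ID."""
    counts = Counter()
    for status, url in records:
        match = PAGE_URL.search(url)
        if status == 404 and match:
            counts[match.group(1)] += 1
    return counts


# Usage with hand-written records (hypothetical URLs).
records = [
    (200, "https://example.test/registration/example.package/index.json"),
    (404, "https://example.test/registration/example.package/page/1.0.0/2.0.0.json"),
    (200, "https://example.test/registration/example.package/page/1.0.0/2.1.0.json"),
    (404, "https://example.test/registration/other.package/index.json"),
]
page_404s = count_page_404s(records)
```

Filtering the records to a window around the Catalog2Registration processing time would narrow this to the case described above.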

joelverhagen changed the title from "Registration pages can pages be not found if query occurs during publish" to "Registration pages can pages be 404 if query occurs during publish" May 4, 2019

loic-sharma (Contributor) commented May 5, 2019

This issue may be exacerbated by the client’s HTTP caching policy. Would option 3 also require a client update?

joelverhagen (Member, Author) commented

Not sure if it's exacerbated. The client downloads the index and the pages at the same time. It could be that caching makes clients less likely to see the page change. But yeah, it's painful if they cache the old index.

No, number three does not require a client update. Power of linked data 😀

loic-sharma (Contributor) commented May 5, 2019

Ah so NuGet/Home#8058 is intentional to prevent cache inconsistencies then?

If the client loads a page, but that page doesn’t have the expected version, will the client go to the next page automatically? If so, that’s pretty nifty :)

joelverhagen (Member, Author) commented

> Ah so NuGet/Home#8058 is intentional to prevent cache inconsistencies then?

I think I was overly broad when I said:

> Index and pages are all downloaded at the same time for client.

There are cases where all of the pages are downloaded and cases where only some are. Basically, follow the references up from RegistrationUtility.LoadRanges to see all the variety. Hint: there's a LOT of variety.

> If the client loads a page, but that page doesn’t have the expected version, will the client go to the next page automatically? If so, that’s pretty nifty :)

Not sure what you mean here.
