NuGet efficiency #8679

ryanbrandenburg · 2023-12-29T23:52:31Z

I saw some problems related to timeouts when running Dependabot for NuGet packages. It seems that we were using the query http API in places where we already knew exactly which package it was we wanted. The query API can be quite slow on some nuget registry implementations which can exceed the 20 second timeout for that call. Rather than extend the timeout and tax NuGet registries I'm trying to use the Registration API which returns faster because it already knows which package it's working against. This has the downside that it returns a lot more JSON that we need to parse, but given the slowness of the query API I've found it to be a significant savings.

While working through this I found some other performance problems which might affect particularly large or nested repositories, such as the index call for a NuGet repo being made repeatedly even though it doesn't vary, and instances where caching results can keep us from walking transitive dependencies multiple times.

…into dev/rybrande/NugetJsonParse

nuget/lib/dependabot/nuget/update_checker/repository_finder.rb

…into dev/rybrande/NugetJsonParseRebased

bdragon

This looks good to me!

The only thing I would recommend double-checking is the use of Hash#[] vs. Hash#fetch() without a default, probably mainly in the response-parsing code in Dependabot::Nuget::NugetClient. fetch() without a default will raise if the field is omitted, so we just want to make we always expect those fields to exist.

ryanbrandenburg · 2024-01-08T20:06:10Z

@bdragon I double-checked the NugetClient usage of .fetch and I believe that all those usages are required fields (in their given contexts). I don't have much of a Ruby background, is it considered better form to use hash[key] or .fetch(key, nil)? I had assumed hash[key] was cleaner when you weren't sure if you have the KVP (and I also am under the impression that the two are interchangeable).

bdragon · 2024-01-08T21:05:06Z

@ryanbrandenburg

In general they will do the same thing, but there is a subtle difference. When you create a hash, you have the option to provide a block that will be used to specify the return value when a caller tries to access a key that doesn't exist with hash[key] (this is the Hash#default_proc). For example, you could do this:

irb(main):001:0> a = {}
=> {}
irb(main):002:0> a[:missing]
=> nil
irb(main):003:0> b = Hash.new { |h, k| "unknown key '#{k}'" }
=> {}
irb(main):004:0> b[:missing]
=> "unknown key 'missing'"

The default behavior for #default_proc is to just return nil, but as you can see the behavior of hash[key] kind of depends on how it was created.

When a key might be missing hash[key] feels cleaner to me, too, but I guess technically it's not guaranteed to return nil when the key is missing if the hash has a custom default_proc for some reason. In contrast, hash.fetch(key, nil) is always guaranteed to return nil if the key is missing.

The main thing I look out for is hash.fetch(key) without a default value, since that will raise a KeyError if the key is missing and this has bitten us before.

ryanbrandenburg · 2024-01-08T21:14:19Z

Ah, I'm glad I asked! So it sounds like if we were taking Hash's from outside callers (as a library for usage by other projects) we might use hash.fetch(key, nil) to guarantee behavior but for an internally contained project like this hash[key] should be fine when nil-return is desired and hash.fetch(key) should be fine when a result is "known" to exist. That being the case I think the PR adheres to those guidelines so we're good there.

JoeRobich · 2024-01-08T21:34:35Z

nuget/lib/dependabot/nuget/update_checker/repository_finder.rb

@@ -62,7 +64,9 @@ def build_url_for_details(repo_details)
          body = remove_wrapping_zero_width_chars(response.body)
          base_url = base_url_from_v3_metadata(JSON.parse(body))
          resolved_base_url = base_url || repo_details.fetch(:url).gsub("/index.json", "-flatcontainer")
-          search_url = search_url_from_v3_metadata(JSON.parse(body))
+          parsed_json = JSON.parse(body)


We should move this up so we can use it when getting the base_url from the response body.

JoeRobich · 2024-01-08T21:41:54Z

nuget/lib/dependabot/nuget/update_checker/repository_finder.rb

@@ -54,6 +55,7 @@ def find_dependency_urls
            end.compact.uniq
        end

+        # rubocop:disable Metrics/AbcSize
        def build_url_for_details(repo_details)


Not for this PR, but I think this method would is a good candidate for some cleanup. Doing a bit more than building a url. Would we want to encapsulate parsing the repo metadata response somewhere else?

…into dev/rybrande/NugetJsonParseRebased

bdragon · 2024-01-09T18:50:35Z

Is this ready to deploy (once up-to-date with the base branch)?

ryanbrandenburg · 2024-01-09T19:18:10Z

Is this ready to deploy (once up-to-date with the base branch)?

I believe so.

ryanbrandenburg · 2024-01-09T22:20:53Z

Looks like there was some unreliability in the smoke tests? It passed on re-run and doesn't seem related to my changes.

bdragon · 2024-01-09T23:03:56Z

I've deployed the updater with this change so I'm going to merge this PR. I'll cut a release of the dependabot gems on Thursday and it will include this change.

Resilience

a0276bf

github-actions bot added the L: dotnet:nuget NuGet packages via nuget or dotnet label Dec 29, 2023

ryanbrandenburg added 21 commits December 29, 2023 15:56

PR cleanup

1302377

Lint rules

8b9a614

Lint rules

00ccb43

Lint rules

b470486

Lint cleanup

fe651b8

Lint cleanup

ed8a1f9

Fix nil assignment

b7fd729

Lint fix

e3576b3

Merge branch 'main' of https://github.com/dependabot/dependabot-core …

946df05

…into dev/rybrande/NugetJsonParse

adjust spec

df81998

Repo details

c130032

Lint

c10bc75

Cleanup

3c54a4b

Cleanup

7e6bf5e

Cleanup

bac5898

Cleanup extra cache items

8db360b

Fix tests

665119f

Merge branch 'main' of https://github.com/dependabot/dependabot-core …

d4440eb

…into dev/rybrande/NugetJsonParse

Lint

87c7bc9

Lint fixes

bd37d3f

Fix tests

44e95fc

brettfo requested changes Jan 3, 2024

View reviewed changes

nuget/lib/dependabot/nuget/update_checker/repository_finder.rb Show resolved Hide resolved

ryanbrandenburg added 6 commits January 3, 2024 10:38

registration test

bb8b2d1

PR feedback

4b00f86

registration test

cb4e47d

Testing

5ef3f21

Reigstrations

afaf2d2

Expand registration url

e6061cc

ryanbrandenburg added 12 commits January 5, 2024 15:19

PR feedback

8be7546

Lint cleanup

28319fc

Lint cleanup

a7608ee

Fix tests

f19e73c

Guard clause

cb456f2

Fix tests

4e35f4a

Lint length

b63e8ce

Non-standard API

10f251b

No results

e65660e

Adjust with-results index.json

2c4351c

Stub next request

ccb3dcb

Merge branch 'main' of https://github.com/dependabot/dependabot-core …

b50a764

…into dev/rybrande/NugetJsonParseRebased

bdragon approved these changes Jan 8, 2024

View reviewed changes

JoeRobich reviewed Jan 8, 2024

View reviewed changes

JoeRobich approved these changes Jan 8, 2024

View reviewed changes

ryanbrandenburg added 3 commits January 9, 2024 09:21

PR feedback

80e9898

Lint cleanup

d48cf85

Merge branch 'main' of https://github.com/dependabot/dependabot-core …

f2dc11a

…into dev/rybrande/NugetJsonParseRebased

bdragon added 2 commits January 9, 2024 13:22

Merge branch 'main' into dev/rybrande/NugetJsonParse

379eedf

Merge branch 'main' into dev/rybrande/NugetJsonParse

d28427b

bdragon merged commit 763f444 into main Jan 9, 2024
63 checks passed

bdragon deleted the dev/rybrande/NugetJsonParse branch January 9, 2024 23:04

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

NuGet efficiency #8679

NuGet efficiency #8679

ryanbrandenburg commented Dec 29, 2023

bdragon left a comment

ryanbrandenburg commented Jan 8, 2024

bdragon commented Jan 8, 2024 •

edited

Loading

ryanbrandenburg commented Jan 8, 2024

JoeRobich Jan 8, 2024

JoeRobich Jan 8, 2024

bdragon commented Jan 9, 2024

ryanbrandenburg commented Jan 9, 2024

ryanbrandenburg commented Jan 9, 2024

bdragon commented Jan 9, 2024

NuGet efficiency #8679

NuGet efficiency #8679

Conversation

ryanbrandenburg commented Dec 29, 2023

bdragon left a comment

Choose a reason for hiding this comment

ryanbrandenburg commented Jan 8, 2024

bdragon commented Jan 8, 2024 • edited Loading

ryanbrandenburg commented Jan 8, 2024

JoeRobich Jan 8, 2024

Choose a reason for hiding this comment

JoeRobich Jan 8, 2024

Choose a reason for hiding this comment

bdragon commented Jan 9, 2024

ryanbrandenburg commented Jan 9, 2024

ryanbrandenburg commented Jan 9, 2024

bdragon commented Jan 9, 2024

bdragon commented Jan 8, 2024 •

edited

Loading