Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Performance of Picking Best Candidate from Indexes #9748

Merged
merged 12 commits into from
Apr 3, 2021
Merged

Improve Performance of Picking Best Candidate from Indexes #9748

merged 12 commits into from
Apr 3, 2021

Conversation

jbylund
Copy link
Contributor

@jbylund jbylund commented Mar 31, 2021

  • wheel.supported and wheel.support_index_min both currently depend on a linear scan of what can possibly be a large number of tags. Combining these into one step and pre-computing the {tag: index_in_list} dictionary makes it a lot less expensive to evaluate.

  • wheel.supported - if we happen to have the tags in a constant time lookup data structure doing the intersection might be nice, but if we're only using it once it may not make sense to do?

  • I was unable to get all tests to pass on master, it's likely I'm doing something wrong and there can maybe just be an edit to the readme

  • Is there a preferred method of profiling/a performance regression suite?

@jbylund jbylund marked this pull request as draft March 31, 2021 21:22
@ichard26
Copy link
Member

ichard26 commented Mar 31, 2021

FYI perf isn't a valid news type. Your choices are process|removal|feature|bugfix|doc|vendor|trivial. No idea which one would be ideal. Is there an issue you could point to? If so, bugfix might work. ex. PR #7962

@jbylund
Copy link
Contributor Author

jbylund commented Apr 1, 2021

Here's a very coarse benchmark:

On main:

	Command being timed: "bash -c for i in $(seq 10); do python -m pip wheel boto3; done"
	User time (seconds): 23.97
	System time (seconds): 0.59
	Percent of CPU this job got: 83%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:29.31

On this branch:

	Command being timed: "bash -c for i in $(seq 10); do python -m pip wheel boto3; done"
	User time (seconds): 19.61
	System time (seconds): 0.53
	Percent of CPU this job got: 87%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:23.02

@jbylund
Copy link
Contributor Author

jbylund commented Apr 1, 2021

Profile of 30 calls of find_best_candidate of boto3 from main (triggered by doing python -m pip wheel boto3, instrumentation on just package_finder.find_best_candidate):
image

The same from the branch:
image

@jbylund jbylund marked this pull request as ready for review April 1, 2021 16:10
@jbylund jbylund changed the title WIP: Performance Related Changes for Candidate Finding Performance Related Changes for Candidate Finding Apr 1, 2021
@uranusjr
Copy link
Member

uranusjr commented Apr 2, 2021

The logic looks good to me, awesome work! I feel the attribute and function for this can be named a bit better though, something like _wheel_tag_preferences and find_most_preferred_tag. A comment explaining lower means more preferred would be nice as well.

@uranusjr
Copy link
Member

uranusjr commented Apr 2, 2021

Re: news type, I think feature is more appropriate since this is arguably not a bug. The text can be slightly more specific as well, something like Improve performance when picking the best file from indexes during pip install.

@jbylund jbylund marked this pull request as draft April 2, 2021 18:48
@jbylund
Copy link
Contributor Author

jbylund commented Apr 2, 2021

Should I add a comment in the __init__ of CandidateEvaluator explaining why the thing exists or is it clear enough?

@jbylund jbylund marked this pull request as ready for review April 2, 2021 19:35
news/9748.feature.rst Outdated Show resolved Hide resolved
@uranusjr
Copy link
Member

uranusjr commented Apr 2, 2021

Should I add a comment in the __init__ of CandidateEvaluator

A comment near _wheel_tag_preferences definition telling the user to read docstring of find_most_preferred_tag would be useful.

Co-authored-by: Tzu-ping Chung <[email protected]>
@jbylund jbylund changed the title Performance Related Changes for Candidate Finding Improve Performance of Picking Best Candidate from Indexes Apr 2, 2021
Copy link
Member

@pradyunsg pradyunsg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is super neat! ^>^

@pradyunsg
Copy link
Member

FYI: I've cancelled the CI run for one of the older commits.

@uranusjr
Copy link
Member

uranusjr commented Apr 3, 2021

🚢

@uranusjr uranusjr merged commit d2c280b into pypa:main Apr 3, 2021
@pradyunsg
Copy link
Member

Thanks @jbylund! ^.^

inmantaci added a commit to inmanta/inmanta-core that referenced this pull request Apr 27, 2021
Bumps [pip](https://github.com/pypa/pip) from 21.0.1 to 21.1.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a href="https://github.com/pypa/pip/blob/main/NEWS.rst">pip's changelog</a>.</em></p>
<blockquote>
<h1>21.1 (2021-04-24)</h1>
<h2>Process</h2>
<ul>
<li>Start installation scheme migration from <code>distutils</code> to <code>sysconfig</code>. A
warning is implemented to detect differences between the two implementations to
encourage user reports, so we can avoid breakages before they happen.</li>
</ul>
<h2>Features</h2>
<ul>
<li>Add the ability for the new resolver to process URL constraints. (<code>[#8253](pypa/pip#8253) &lt;https://github.com/pypa/pip/issues/8253&gt;</code>_)</li>
<li>Add a feature <code>--use-feature=in-tree-build</code> to build local projects in-place
when installing. This is expected to become the default behavior in pip 21.3;
see <code>Installing from local packages &lt;https://pip.pypa.io/en/stable/user_guide/#installing-from-local-packages&gt;</code>_
for more information. (<code>[#9091](pypa/pip#9091) &lt;https://github.com/pypa/pip/issues/9091&gt;</code>_)</li>
<li>Bring back the &quot;(from versions: ...)&quot; message, that was shown on resolution failures. (<code>[#9139](pypa/pip#9139) &lt;https://github.com/pypa/pip/issues/9139&gt;</code>_)</li>
<li>Add support for editable installs for project with only setup.cfg files. (<code>[#9547](pypa/pip#9547) &lt;https://github.com/pypa/pip/issues/9547&gt;</code>_)</li>
<li>Improve performance when picking the best file from indexes during <code>pip install</code>. (<code>[#9748](pypa/pip#9748) &lt;https://github.com/pypa/pip/issues/9748&gt;</code>_)</li>
<li>Warn instead of erroring out when doing a PEP 517 build in presence of
<code>--build-option</code>. Warn when doing a PEP 517 build in presence of
<code>--global-option</code>. (<code>[#9774](pypa/pip#9774) &lt;https://github.com/pypa/pip/issues/9774&gt;</code>_)</li>
</ul>
<h2>Bug Fixes</h2>
<ul>
<li>Fixed <code>--target</code> to work with <code>--editable</code> installs. (<code>[#4390](pypa/pip#4390) &lt;https://github.com/pypa/pip/issues/4390&gt;</code>_)</li>
<li>Add a warning, discouraging the usage of pip as root, outside a virtual environment. (<code>[#6409](pypa/pip#6409) &lt;https://github.com/pypa/pip/issues/6409&gt;</code>_)</li>
<li>Ignore <code>.dist-info</code> directories if the stem is not a valid Python distribution
name, so they don't show up in e.g. <code>pip freeze</code>. (<code>[#7269](pypa/pip#7269) &lt;https://github.com/pypa/pip/issues/7269&gt;</code>_)</li>
<li>Only query the keyring for URLs that actually trigger error 401.
This prevents an unnecessary keyring unlock prompt on every pip install
invocation (even with default index URL which is not password protected). (<code>[#8090](pypa/pip#8090) &lt;https://github.com/pypa/pip/issues/8090&gt;</code>_)</li>
<li>Prevent packages already-installed alongside with pip to be injected into an
isolated build environment during build-time dependency population. (<code>[#8214](pypa/pip#8214) &lt;https://github.com/pypa/pip/issues/8214&gt;</code>_)</li>
<li>Fix <code>pip freeze</code> permission denied error in order to display an understandable error message and offer solutions. (<code>[#8418](pypa/pip#8418) &lt;https://github.com/pypa/pip/issues/8418&gt;</code>_)</li>
<li>Correctly uninstall script files (from setuptools' <code>scripts</code> argument), when installed with <code>--user</code>. (<code>[#8733](pypa/pip#8733) &lt;https://github.com/pypa/pip/issues/8733&gt;</code>_)</li>
<li>New resolver: When a requirement is requested both via a direct URL
(<code>req @ URL</code>) and via version specifier with extras (<code>req[extra]</code>), the
resolver will now be able to use the URL to correctly resolve the requirement
with extras. (<code>[#8785](pypa/pip#8785) &lt;https://github.com/pypa/pip/issues/8785&gt;</code>_)</li>
<li>New resolver: Show relevant entries from user-supplied constraint files in the
error message to improve debuggability. (<code>[#9300](pypa/pip#9300) &lt;https://github.com/pypa/pip/issues/9300&gt;</code>_)</li>
<li>Avoid parsing version to make the version check more robust against lousily
debundled downstream distributions. (<code>[#9348](pypa/pip#9348) &lt;https://github.com/pypa/pip/issues/9348&gt;</code>_)</li>
<li><code>--user</code> is no longer suggested incorrectly when pip fails with a permission
error in a virtual environment. (<code>[#9409](pypa/pip#9409) &lt;https://github.com/pypa/pip/issues/9409&gt;</code>_)</li>
<li>Fix incorrect reporting on <code>Requires-Python</code> conflicts. (<code>[#9541](pypa/pip#9541) &lt;https://github.com/pypa/pip/issues/9541&gt;</code>_)</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="https://github.com/pypa/pip/commit/2b2a268d25963727c2a1c805de8f0246b9cd63f6"><code>2b2a268</code></a> Bump for release</li>
<li><a href="https://github.com/pypa/pip/commit/ea761a6575f37b90cf89035ee8be3808cf872184"><code>ea761a6</code></a> Update AUTHORS.txt</li>
<li><a href="https://github.com/pypa/pip/commit/2edd3fdf2af2f09dce5085ef0eb54684b4f9bc04"><code>2edd3fd</code></a> Postpone a deprecation to 21.2</li>
<li><a href="https://github.com/pypa/pip/commit/3cccfbf169bd35133ee25d2543659b9c1e262f8c"><code>3cccfbf</code></a> Rename mislabeled news fragment</li>
<li><a href="https://github.com/pypa/pip/commit/21cd124b5d40b510295c201b9152a65ac3337a37"><code>21cd124</code></a> Fix NEWS.rst placeholder position</li>
<li><a href="https://github.com/pypa/pip/commit/e46bdda9711392fec0c45c1175bae6db847cb30b"><code>e46bdda</code></a> Merge pull request <a href="https://github-redirect.dependabot.com/pypa/pip/issues/9827">#9827</a> from pradyunsg/fix-git-improper-tag-handling</li>
<li><a href="https://github.com/pypa/pip/commit/0e4938d269815a5bf1dd8c16e851cb1199fc5317"><code>0e4938d</code></a> 📰</li>
<li><a href="https://github.com/pypa/pip/commit/ca832b2836e0bffa7cf95589acdcd71230f5834e"><code>ca832b2</code></a> Don't split git references on unicode separators</li>
<li><a href="https://github.com/pypa/pip/commit/1320bac4ff80d76b8fba2c8b4b4614a40fb9c6c3"><code>1320bac</code></a> Merge pull request <a href="https://github-redirect.dependabot.com/pypa/pip/issues/9814">#9814</a> from pradyunsg/revamp-ci-apr-2021-v2</li>
<li><a href="https://github.com/pypa/pip/commit/e9cc23ffd97cb6d66d32dc3ec27cf832524bb33d"><code>e9cc23f</code></a> Skip checks on PRs only</li>
<li>Additional commits viewable in <a href="https://github.com/pypa/pip/compare/21.0.1...21.1">compare view</a></li>
</ul>
</details>
<br />

[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=pip&package-manager=pip&previous-version=21.0.1&new-version=21.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually

</details>
inmantaci added a commit to inmanta/inmanta-core that referenced this pull request May 18, 2021
Bumps [pip](https://github.com/pypa/pip) from 21.0.1 to 21.1.1.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a href="https://github.com/pypa/pip/blob/main/NEWS.rst">pip's changelog</a>.</em></p>
<blockquote>
<h1>21.1.1 (2021-04-30)</h1>
<h2>Deprecations and Removals</h2>
<ul>
<li>Temporarily set the new &quot;Value for ... does not match&quot; location warnings level
to <em>DEBUG</em>, to hide them from casual users. This prepares pip 21.1 for CPython
inclusion, while pip maintainers digest the first intake of location mismatch
issues for the <code>distutils</code>-<code>sysconfig</code> transition. (<code>[#9912](pypa/pip#9912) &lt;https://github.com/pypa/pip/issues/9912&gt;</code>_)</li>
</ul>
<h2>Bug Fixes</h2>
<ul>
<li>This change fixes a bug on Python <!-- raw HTML omitted -->`_)</li>
<li>Fix compatibility between distutils and sysconfig when the project name is unknown outside of a virtual environment. (<code>[#9838](pypa/pip#9838) &lt;https://github.com/pypa/pip/issues/9838&gt;</code>_)</li>
<li>Fix Python 3.6 compatibility when a PEP 517 build requirement itself needs to be
built in an isolated environment. (<code>[#9878](pypa/pip#9878) &lt;https://github.com/pypa/pip/issues/9878&gt;</code>_)</li>
</ul>
<h1>21.1 (2021-04-24)</h1>
<h2>Process</h2>
<ul>
<li>Start installation scheme migration from <code>distutils</code> to <code>sysconfig</code>. A
warning is implemented to detect differences between the two implementations to
encourage user reports, so we can avoid breakages before they happen.</li>
</ul>
<h2>Features</h2>
<ul>
<li>Add the ability for the new resolver to process URL constraints. (<code>[#8253](pypa/pip#8253) &lt;https://github.com/pypa/pip/issues/8253&gt;</code>_)</li>
<li>Add a feature <code>--use-feature=in-tree-build</code> to build local projects in-place
when installing. This is expected to become the default behavior in pip 21.3;
see <code>Installing from local packages &lt;https://pip.pypa.io/en/stable/user_guide/#installing-from-local-packages&gt;</code>_
for more information. (<code>[#9091](pypa/pip#9091) &lt;https://github.com/pypa/pip/issues/9091&gt;</code>_)</li>
<li>Bring back the &quot;(from versions: ...)&quot; message, that was shown on resolution failures. (<code>[#9139](pypa/pip#9139) &lt;https://github.com/pypa/pip/issues/9139&gt;</code>_)</li>
<li>Add support for editable installs for project with only setup.cfg files. (<code>[#9547](pypa/pip#9547) &lt;https://github.com/pypa/pip/issues/9547&gt;</code>_)</li>
<li>Improve performance when picking the best file from indexes during <code>pip install</code>. (<code>[#9748](pypa/pip#9748) &lt;https://github.com/pypa/pip/issues/9748&gt;</code>_)</li>
<li>Warn instead of erroring out when doing a PEP 517 build in presence of
<code>--build-option</code>. Warn when doing a PEP 517 build in presence of
<code>--global-option</code>. (<code>[#9774](pypa/pip#9774) &lt;https://github.com/pypa/pip/issues/9774&gt;</code>_)</li>
</ul>
<h2>Bug Fixes</h2>
<ul>
<li>Fixed <code>--target</code> to work with <code>--editable</code> installs. (<code>[#4390](pypa/pip#4390) &lt;https://github.com/pypa/pip/issues/4390&gt;</code>_)</li>
<li>Add a warning, discouraging the usage of pip as root, outside a virtual environment. (<code>[#6409](pypa/pip#6409) &lt;https://github.com/pypa/pip/issues/6409&gt;</code>_)</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="https://github.com/pypa/pip/commit/c53d88c4c374523425f4db6bef949090764465c0"><code>c53d88c</code></a> Bump for release</li>
<li><a href="https://github.com/pypa/pip/commit/4417e7f4bef2b2c70767c1dbfe72f82cc6b7b83f"><code>4417e7f</code></a> Update AUTHORS.txt</li>
<li><a href="https://github.com/pypa/pip/commit/0c29bfe48e650c5a428b77eba4af7f067c019cc8"><code>0c29bfe</code></a> Merge pull request <a href="https://github-redirect.dependabot.com/pypa/pip/issues/9912">#9912</a> from uranusjr/sysconfig-remove-warning-for-python-re...</li>
<li><a href="https://github.com/pypa/pip/commit/f56ec327b92ebe836e63e07cb2449a20e09dec38"><code>f56ec32</code></a> Make location mismatch messages DEBUG level</li>
<li><a href="https://github.com/pypa/pip/commit/999b121402302a00b235a0d443f5661b69d6fd60"><code>999b121</code></a> Merge pull request <a href="https://github-redirect.dependabot.com/pypa/pip/issues/9883">#9883</a> from uranusjr/isolated-pip-py36-compat</li>
<li><a href="https://github.com/pypa/pip/commit/f88420319db450aefbed1500f04e31be46874aaf"><code>f884203</code></a> Fallback to self-invoke via directory on 3.6</li>
<li><a href="https://github.com/pypa/pip/commit/7a77484a492c8f1e1f5ef24eaf71a43df9ea47eb"><code>7a77484</code></a> Merge pull request <a href="https://github-redirect.dependabot.com/pypa/pip/issues/9835">#9835</a> from jamescurtin/9831-bugfix</li>
<li><a href="https://github.com/pypa/pip/commit/914bcc3dba88179f4e88ce90b63660474ba795cd"><code>914bcc3</code></a> Merge pull request <a href="https://github-redirect.dependabot.com/pypa/pip/issues/9838">#9838</a> from uranusjr/sysconfig-header-with-none-project</li>
<li><a href="https://github.com/pypa/pip/commit/2a009a0b8a5d8d03117897f0f11f9c1dcf2a4b5a"><code>2a009a0</code></a> Better explanatory comment</li>
<li><a href="https://github.com/pypa/pip/commit/e7b1722efeaf4ff403142847ce1b52caedd5efcd"><code>e7b1722</code></a> Set dist_name to UNKNOWN when empty outside venv</li>
<li>Additional commits viewable in <a href="https://github.com/pypa/pip/compare/21.0.1...21.1.1">compare view</a></li>
</ul>
</details>
<br />

[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=pip&package-manager=pip&previous-version=21.0.1&new-version=21.1.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually

</details>
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 30, 2021
@ichard26 ichard26 added the type: performance Commands take too long to run label Dec 26, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants