Include backtracking in user guide #9040

Merged
merged 14 commits into from
Nov 14, 2020
223 changes: 223 additions & 0 deletions docs/html/user_guide.rst
@@ -1276,6 +1276,8 @@ In this situation, you could consider:
- Refactoring your project to reduce the number of dependencies (for
  example, by breaking up a monolithic code base into smaller pieces)

.. _`Getting help`:

Getting help
------------

@@ -1304,6 +1306,227 @@ issue tracker`_ if you believe that your problem has exposed a bug in pip.
.. _"How do I ask a good question?": https://stackoverflow.com/help/how-to-ask
.. _pip issue tracker: https://github.com/pypa/pip/issues

.. _`Dependency resolution backtracking`:

Dependency resolution backtracking
==================================

Or, as it is more commonly known, *"Why does pip download multiple versions of
the same package over and over again during an install?"*.

The purpose of this section is to explain why backtracking happens, and
to offer practical suggestions to pip users who encounter it during a
``pip install``.

What is backtracking?
---------------------

Backtracking is not a bug or unexpected behaviour. It is part of the
way pip's dependency resolution process works.

During a pip install (e.g. ``pip install tea``), pip needs to work out
the package's dependencies (e.g. ``spoon``, ``hot-water``, ``cup`` etc.) and
the version of each of these packages it needs to install. For each package,
pip needs to decide which version is a good candidate to install.

A "good candidate" means a version of each package that is compatible with all
the other package versions being installed at the same time.

In the case where a package has a lot of versions, arriving at a good
candidate can take a lot of time. (The amount of time depends on the
package size, the number of versions pip must try, and other concerns.)

How does backtracking work?
^^^^^^^^^^^^^^^^^^^^^^^^^^^

When doing a pip install, pip starts by making assumptions about the
packages it needs to install. During the install process it needs to check these
assumptions as it goes along.

When pip finds that an assumption is incorrect, it has to try another approach
(backtrack), which means discarding some of the work that has already been done,
and going back to choose another path.

For example, suppose the user runs ``pip install tea``, and ``tea`` depends on
``cup``, ``hot-water`` and ``spoon``, amongst others.

pip starts by trying a version of ``cup``. If it finds that this version isn't
compatible with the other package versions, it needs to "go back"
(backtrack) and download an older version.

It then tries that version. If it is compatible, pip continues on to the
next package. If not, it continues to backtrack until it finds a
compatible version.

Backtracking can end in one of two ways: either pip finds a set of
packages it can install (good news!), or it eventually displays a
`resolution impossible <https://pip.pypa.io/en/latest/user_guide/#id35>`__ error
message (not so good).

Once pip starts backtracking during dependency resolution, it cannot
know in advance how long it will backtrack or how much computation will be
needed. For the user, this means an install can take a long time to complete.
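
The process described above can be sketched in a few lines of Python. This is
a toy model for illustration only, not pip's actual implementation: the
``INDEX`` contents, the ``resolve`` function and the rule "try the newest
version first" are all assumptions made for this example. Dependencies are
only looked up after a version has been "downloaded" (recorded in ``tried``),
which is what forces the resolver to guess and then backtrack.

```python
from typing import Dict, List, Optional, Set, Tuple

Requirement = Tuple[str, Optional[Set[str]]]  # (name, allowed versions or None for "any")

# Hypothetical toy index: package -> version -> dependencies, newest version first.
INDEX: Dict[str, Dict[str, Dict[str, Optional[Set[str]]]]] = {
    "tea": {"1.9.8": {"spoon": {"2.27.0"}, "cup": None}},
    "cup": {
        "3.22.0": {"spoon": {"3.0.0"}},
        "3.21.0": {"spoon": {"3.0.0"}},
        "3.20.0": {"spoon": {"2.27.0"}},
    },
    "spoon": {"3.0.0": {}, "2.27.0": {}},
}


def resolve(
    todo: List[Requirement], pinned: Dict[str, str], tried: List[str]
) -> Optional[Dict[str, str]]:
    """Return a compatible {name: version} mapping, or None if none exists."""
    if not todo:
        return pinned
    (name, allowed), rest = todo[0], todo[1:]
    if name in pinned:
        # An earlier choice must also satisfy this requirement.
        if allowed is not None and pinned[name] not in allowed:
            return None
        return resolve(rest, pinned, tried)
    for version, deps in INDEX[name].items():
        if allowed is not None and version not in allowed:
            continue  # version numbers are known up front, so no download needed
        tried.append(f"{name}-{version}")  # stands in for a download
        result = resolve(rest + list(deps.items()), {**pinned, name: version}, tried)
        if result is not None:
            return result
        # The assumption was wrong: discard this choice and try an older version.
    return None


tried: List[str] = []
solution = resolve([("tea", None)], {}, tried)
# solution == {"tea": "1.9.8", "spoon": "2.27.0", "cup": "3.20.0"}
# tried lists cup-3.22.0 and cup-3.21.0 before the compatible cup-3.20.0.

impossible = resolve([("tea", None), ("cup", {"3.22.0"})], {}, [])
# impossible is None: the toy equivalent of a "resolution impossible" error.
```

In this sketch, each extra entry in ``tried`` corresponds to one of the
repeated ``Downloading`` lines a user would see while pip backtracks.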

Why does backtracking occur?
----------------------------

With the release of the new resolver (:ref:`Resolver changes 2020`), pip is now
more strict in the package versions it installs when a user runs a
``pip install`` command.

pip needs to backtrack because, initially, it doesn't have all the information it
needs to work out the correct set of packages. This is because package indexes
don't provide full package dependency information before you have downloaded
the package.

This new resolver behaviour means that pip works harder to find out which
version of a package is a good candidate to install. It reduces the risk that
installing a new package will accidentally break an existing installed package,
and so reduces the risk that your environment gets messed up.

What does this behaviour look like?
-----------------------------------

Right now backtracking behaviour looks like this:

::

    $ pip install tea==1.9.8
    Collecting tea==1.9.8
      Downloading tea-1.9.8-py2.py3-none-any.whl (346 kB)
         |████████████████████████████████| 346 kB 10.4 MB/s
    Collecting spoon==2.27.0
      Downloading spoon-2.27.0-py2.py3-none-any.whl (312 kB)
         |████████████████████████████████| 312 kB 19.2 MB/s
    Collecting hot-water>=0.1.9
      Downloading hot-water-0.1.13-py3-none-any.whl (9.3 kB)
    Collecting cup>=1.6.0
      Downloading cup-3.22.0-py2.py3-none-any.whl (397 kB)
         |████████████████████████████████| 397 kB 28.2 MB/s
    INFO: pip is looking at multiple versions of this package to determine
    which version is compatible with other requirements.
    This could take a while.
      Downloading cup-3.21.0-py2.py3-none-any.whl (395 kB)
         |████████████████████████████████| 395 kB 27.0 MB/s
      Downloading cup-3.20.0-py2.py3-none-any.whl (394 kB)
         |████████████████████████████████| 394 kB 24.4 MB/s
      Downloading cup-3.19.1-py2.py3-none-any.whl (394 kB)
         |████████████████████████████████| 394 kB 21.3 MB/s
      Downloading cup-3.19.0-py2.py3-none-any.whl (394 kB)
         |████████████████████████████████| 394 kB 26.2 MB/s
      Downloading cup-3.18.0-py2.py3-none-any.whl (393 kB)
         |████████████████████████████████| 393 kB 22.1 MB/s
      Downloading cup-3.17.0-py2.py3-none-any.whl (382 kB)
         |████████████████████████████████| 382 kB 23.8 MB/s
      Downloading cup-3.16.0-py2.py3-none-any.whl (376 kB)
         |████████████████████████████████| 376 kB 27.5 MB/s
      Downloading cup-3.15.1-py2.py3-none-any.whl (385 kB)
         |████████████████████████████████| 385 kB 30.4 MB/s
    INFO: pip is looking at multiple versions of this package to determine
    which version is compatible with other requirements.
    This could take a while.
      Downloading cup-3.15.0-py2.py3-none-any.whl (378 kB)
         |████████████████████████████████| 378 kB 21.4 MB/s
      Downloading cup-3.14.0-py2.py3-none-any.whl (372 kB)
         |████████████████████████████████| 372 kB 21.1 MB/s
      Downloading cup-3.13.1-py2.py3-none-any.whl (381 kB)
         |████████████████████████████████| 381 kB 21.8 MB/s
    This is taking longer than usual. You might need to provide the
    dependency resolver with stricter constraints to reduce runtime.
    If you want to abort this run, you can press Ctrl + C to do so.
      Downloading cup-3.13.0-py2.py3-none-any.whl (374 kB)

In the above sample output, pip had to download multiple versions of
package ``cup`` - cup-3.22.0 to cup-3.13.0 - to find a version that will be
compatible with the other packages - ``spoon``, ``hot-water``, etc.

These multiple ``Downloading cup-version`` lines show pip backtracking.

Possible ways to reduce backtracking
------------------------------------

It's important to note that backtracking is expected behaviour during a
``pip install``. What pip is trying to do is complicated - it is
working through potentially millions of package versions to identify the
compatible versions.

There is no guaranteed way to avoid backtracking, but there are a number
of ways to reduce it.

.. _1-allow-pip-to-complete-its-backtracking:

1. Allow pip to complete its backtracking
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In most cases, pip will complete the backtracking process successfully.
It is possible this could take a very long time to complete - this may
not be the preferred option.

However there is a possibility pip will not be able to find a set of
compatible versions.

If you'd prefer not to wait, you can interrupt pip (Ctrl + C) and use
:ref:`Constraints Files` to reduce the number of package versions it tries.

.. _2-reduce-the-versions-of-the-backtracking-package:

2. Reduce the number of versions pip will try to backtrack through
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If pip is backtracking more than you'd like, the next option is to
constrain the number of package versions it tries.

A good first candidate for constraining is the package (or packages) pip is
backtracking on (in the above example, ``cup``).

You could try:

``pip install tea "cup > 3.13"``

This will reduce the number of versions of ``cup`` it tries, and
possibly reduce the time pip takes to install.

If your constraint turns out to be wrong (in this case, because only an older
version would have worked), pip will no longer be able to select that version.
Finding a suitable constraint can take some trial and error.
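
To illustrate how a lower bound shrinks the search space, here is a small
Python sketch. It is not how pip evaluates specifiers (real version
comparison follows PEP 440 and handles many more cases); the ``parse`` and
``newer_than`` helpers are hypothetical, and the three versions older than
3.13.0 are invented for the example. The remaining versions are the ones
shown in the sample output above.

```python
from typing import List, Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Turn "3.13.1" into (3, 13, 1), padding so that "3.13" equals "3.13.0"."""
    parts = [int(part) for part in version.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


# Versions of cup from the sample output, plus three hypothetical older releases.
CUP_VERSIONS: List[str] = [
    "3.22.0", "3.21.0", "3.20.0", "3.19.1", "3.19.0", "3.18.0", "3.17.0",
    "3.16.0", "3.15.1", "3.15.0", "3.14.0", "3.13.1", "3.13.0",
    "3.12.0", "3.11.0", "3.10.0",  # assumed for illustration only
]


def newer_than(versions: List[str], floor: str) -> List[str]:
    """Keep only the versions strictly greater than ``floor``."""
    return [v for v in versions if parse(v) > parse(floor)]


# Rough equivalent of the "cup > 3.13" requirement for plain numeric versions.
remaining = newer_than(CUP_VERSIONS, "3.13")
print(f"{len(remaining)} of {len(CUP_VERSIONS)} versions left to try")
```

Note that ``3.13.0`` is excluded by ``> 3.13`` while ``3.13.1`` is kept, so a
constraint like this also rules out the version the sample output ended on.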

.. _3-use-constraint-files-or-lockfiles:

3. Use constraint files or lockfiles
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This option is a progression of 2 above. It requires users to know how
to inspect:

- the packages they're trying to install
- the package release frequency and compatibility policies
- their release notes and changelogs from past versions

For deployment, you can create a lockfile stating the exact package and
version number for each dependency of that package. You can create this
with `pip-tools <https://github.com/jazzband/pip-tools/>`__.

This means the "work" is done once during the development process, and so
will save users this work during deployment.

The pip team is not available to provide support in helping you create a
suitable constraints file.
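
As a rough picture of what such a file contains, the Python sketch below
prints one ``name==version`` line for every distribution installed in the
current environment, using the standard library's ``importlib.metadata``
(Python 3.8+). The function names are hypothetical, and this is only an
illustration of the pinned format: a tool such as pip-tools resolves the
dependency tree for you and is the better choice for real projects.

```python
from importlib import metadata
from typing import Dict, List


def installed_versions() -> Dict[str, str]:
    """Map each installed distribution's name to its version."""
    versions: Dict[str, str] = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip entries with missing or broken metadata
            versions[name] = dist.version
    return versions


def format_pins(versions: Dict[str, str]) -> List[str]:
    """Render exact pins, sorted case-insensitively by package name."""
    ordered = sorted(versions.items(), key=lambda item: item[0].lower())
    return [f"{name}=={version}" for name, version in ordered]


if __name__ == "__main__":
    # Redirect this output to a file to get a list of exact pins.
    print("\n".join(format_pins(installed_versions())))
```

Because every line pins an exact version, pip has only one candidate per
package to consider, which leaves nothing to backtrack over.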

.. _4-be-more-strict-on-package-dependencies-during-development:

4. Be more strict on package dependencies during development
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Package maintainers can give pip some help during development by
creating constraint files for the dependency tree. This will reduce the
number of versions it will try.

In a later, followup PR, it would be nice for someone to expand this section and add more details.


Getting help
------------

If none of the suggestions above work for you, we recommend that you ask
for help; see :ref:`Getting help`.

.. _`Using pip from your program`:

Using pip from your program
1 change: 1 addition & 0 deletions news/9039.doc.rst
@@ -0,0 +1 @@
Add a section to the User Guide to cover backtracking during dependency resolution.