pip fails to install multiple packages in single invocation if one depends on another #1386
Comments
this is not a true general statement. can you give a specific example of 2 pypi packages? as for your iterating deployment scripts, I'm skeptical you need to be doing that. can you explain?
Sorry, I'll clarify my description. The problem occurs when the install/build of pkg-B depends on pkg-A already being installed. I see this all the time with packages that depend on numpy and/or cython, both of which are frequently needed at build time. In a fresh venv, this fails:
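(the exact command was lost from this comment; given the bottleneck/numpy explanation below, it was presumably along these lines:)
pip install numpy bottleneck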
While this succeeds:
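(again a reconstruction; presumably two separate invocations, so numpy is already installed before bottleneck builds:)
pip install numpy
pip install bottleneck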
because bottleneck's setup.py requires numpy to be installed. For a similar but distinct failure case (starting with a fresh venv), do:
and then this fails:
but this succeeds:
In this failed case, the build process (rather than setup.py) fails because numexpr needs numpy present at build time. Finally, any package that needs cython at build time hits the same problem when installed in the same invocation as cython.
So that's 3 distinct examples of how pip's behavior fails installations unnecessarily. That should make it clear why the steps I described were necessary. I understand why pip would want to enforce all-or-nothing semantics on multiple package installation, but AFAIK pip doesn't offer a flag to turn this off. Is there one?
Just looking at bottleneck, it doesn't state that it install_requires numpy, just that it requires it. And yet it imports numpy in its setup.py. So isn't that the issue, that bottleneck is not specifying its install-time dependencies properly?
Yes and no. There are valid reasons people avoid putting packages like numpy in install_requires; specifically, it invites pip to casually upgrade them. Casually upgrading a package like numpy is a big no-no.
IIRC, pip can be told not to do that, but you have to spelunk the docs quite a bit to find out how. Also, the user has implicitly given pip a build-time dependency graph via the ordering on the command line; pip could achieve all-or-nothing installs via rollback, or by doing a moist-run into a scratch venv before touching the real environment.
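(for reference, and not necessarily the exact option this comment had in mind: the usual ways to keep pip from touching an already-installed numpy are to skip dependency handling, or, in newer pip releases, to only upgrade dependencies when required:)
pip install --no-deps bottleneck
pip install --upgrade --upgrade-strategy only-if-needed bottleneck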
The problem isn't the lack of install_requires. Pip attempts to discover all of the dependencies before it installs anything. It has to do this because it needs to know what to install. For instance, you gave the example:
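(the quoted example was lost from this comment; from the discussion that follows, it was presumably:)
pip install numpy bottleneck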
Ok, so pip first goes out and discovers numpy, and finds it has versions A, B, and C. C is the latest, so it selects that for install. Then pip goes out and discovers bottleneck; however, bottleneck (hypothetically) depends on numpy<C. Now pip deselects C and instead selects B. (Note this isn't exactly how it works at the moment, but it's how it will work in the future.)

The only way to make it work the way you're proposing is to have pip, instead of "selecting" C, actually install C. Then, when it discovers that it needs B and not C, have it uninstall C and then install B. This will be horrifically slow, especially for something like numpy. I don't think that's appropriate when the actual underlying issue is the way the package declares (or fails to declare) its build-time dependencies.
First, was I at least persuasive enough to merit a reopen? :) I'm not dictating a solution; you know pip and I only use it. But re your points:
The case analysis for the current behavior is:
For the multiple concrete examples I've seen, that doesn't look like an absurd compromise. Any reasonable way to address these concrete examples would be great. Since I'm not volunteering to do the work, I'll leave the approach up to you.
Also, re the slow case, what's the difference between:
and
in terms of the duration of the install process? Sorry about the verbosity, I'll try to be more concise for the rest of the thread...
although pip doesn't have a true resolver (#988), the basic idea of resolving dependencies first, and installing once, is fundamental, and not something I can imagine changing to work around the cases you mention.

the examples you cite all seem to be build-time dependency issues, i.e. pkgB needs pkgA even to build. the pip/setuptools solution right now is setuptools' setup_requires.

to be clear, conda hasn't solved this quagmire by having better resolution logic (e.g. by having a SAT solver like you mention in #988); it solves it by being a "full-stack" management tool that manages all of the non-python dependencies required for any of its packages, and it installs pre-built binaries.

how might it be easier in the future for pip to install some of the projects you mention? Honestly, that's still TBD, and there's a fair amount of discussion on that now on distutils-sig. wheels and PEP426 will certainly be part of the answer.

the reason I'm leaving this closed is that the core of the issue isn't a problem with pip's install logic, but rather a much broader issue of managing external dependencies, which goes beyond pip itself. It might be worthwhile to leave a pip issue open that points to some of these ongoing discussions, so that new issues can be closed against that.
To answer your later question, there wouldn't be any difference in length of time. However, a proper dependency resolver (which pip doesn't have yet) would determine which version of numpy needed to be installed without installing it twice, which is where the "will take more time" comes from.

To somewhat muddle this issue, there is the --no-deps flag to pip, which should disable dependency resolution altogether and requires you to specify all the dependencies either on the command line or in a requirements.txt file. However, this won't currently solve the numpy/bottleneck problems and such, because as you noted pip doesn't really grok build-time dependencies (leaving that up to the build tool itself, generally setuptools, which uses setup_requires).

Personally I'd need to think about this some more. This isn't well supported by any tool right now, and I'm not sure if there's a good way of kludging a method in that won't make things worse in the long term while we wait for the real solution. It's possible that with a little bit of tweaking the existing --no-deps flag could be made to work in a way that would support this. My fear there is that I believe isolated builds and a proper build-time dependency specification, which come with PEP 426 / Metadata 2.0, are a better solution, and that if we un-isolate the builds now we'll end up regretting it once PEP 426 is in place.
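(for illustration, the kind of --no-deps workflow being described looks roughly like this; the package list is just an example taken from earlier in the thread:)
# install in an explicit order, with pip's dependency handling disabled
pip install --no-deps numpy
pip install --no-deps bottleneck
pip install --no-deps numexpr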
Reasonable. I hope these issues are tackled down the road as part of the shake-up in python packaging. Now, back to the iterating deployment scripts...
I'm not sure if this is informative or helpful to anyone, but I came across this thread while trying to resolve a build-time dependency – a package which requires another package in order to be set up. I tried using setup_requires, and it fails, because I need forked versions of both packages. I specify the git paths. Unfortunately, when building the dependent package, it seems to encounter the setup_requires, but resolves it using the broken one from pypi instead of my fork. I can't seem to find a way around it, so I'm going to just script installing the dependency first.
Here is my requirements.sh utility script, which uses pip-compile + pip-sync + virtualenv. xargs can be used to install requirements.txt on a line-by-line basis if required:
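(the script itself did not survive extraction; a minimal sketch of that kind of workflow, assuming a requirements.in file as input and .venv as the virtualenv location, might look like:)
#!/usr/bin/env bash
# requirements.sh (sketch only): create/refresh a virtualenv and sync pinned requirements
set -euo pipefail
VENV_DIR=".venv"
[ -d "$VENV_DIR" ] || virtualenv "$VENV_DIR"
source "$VENV_DIR/bin/activate"
pip install pip-tools            # provides pip-compile and pip-sync
pip-compile requirements.in      # pin requirements.in -> requirements.txt
pip-sync requirements.txt        # make the venv match requirements.txt exactly
# alternative: install line by line so each package sees the ones installed before it
cat requirements.txt | xargs -n 1 pip install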
Related to #988, without dictating a solution.
If pkg-B depends on pkg-A, then:
pip install pkg-A pkg-B
fails. Same goes for installing from a properly ordered requirements.txt.
I've been working around this over and over again for a couple of years now. Specifically, in deployment scripts I've ended up iterating over requirements.txt files line by line, running pip a dozen times, or cluttering up the deployment scripts with multiple-chunk requirements files. The horror.
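(a sketch of the line-by-line workaround being described; the file name is just illustrative:)
# run pip once per line so earlier entries (e.g. numpy) are already installed
# when later entries (e.g. bottleneck) need them at build time
while read -r req; do
    pip install "$req"
done < requirements.txt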