Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

poetry add/install not using venv python #3501

Closed
mattsta opened this issue Dec 19, 2020 · 12 comments
Closed

poetry add/install not using venv python #3501

mattsta opened this issue Dec 19, 2020 · 12 comments

Comments

@mattsta
Copy link

mattsta commented Dec 19, 2020

Ran into a bit of a python environment error when trying to install unruly packages (where they explicitly import dependency modules in their setup.py to see if it exists).

The current poetry environment package install process uses the python executable of whatever is running poetry and never actually the python of the venv itself.

with temporary_directory() as tmp_dir:
# TODO: cache PEP 517 build environment corresponding to each project venv
venv_dir = Path(tmp_dir) / ".venv"
EnvManager.build_venv(venv_dir.as_posix())

Because here, if no executable argument is provided to build_venv(), it defaults to the executable running poetry—could be anything anywhere with any packages!—(via executable or sys.executable). If we are using an explicit env set by poetry env use which isn't the same binary running poetry itself, the install process will never have a usable environment we expect:

poetry/poetry/utils/env.py

Lines 804 to 819 in bf30ca6

@classmethod
def build_venv(
cls, path, executable=None, flags=None
): # type: (Union[Path,str], Optional[Union[str, Path]], Dict[str, bool]) -> virtualenv.run.session.Session
flags = flags or {}
if isinstance(executable, Path):
executable = executable.resolve().as_posix()
args = [
"--no-download",
"--no-periodic-update",
"--python",
executable or sys.executable,
str(path),
]

Proper suggested fix: use the current value of the project's env python setting for all installs. (left as an exercise for the reader)

Hack fix: change the executable discovery line to at least consult the environment for python before defaulting to the current executable: executable or shutil.which("python3") or sys.executable, so it will at least be able to install when attempted inside the project's own poetry shell session.

@finswimmer
Copy link
Member

Hello @mattsta,

what's your problem with the current behavior?

fin swimmer

@mattsta
Copy link
Author

mattsta commented Dec 19, 2020

short version:

(where they explicitly import dependency modules in their setup.py to see if it exists).

longer version:
Basically, some packages will only install if they have their dependencies available at build time. So, the system python where poetry is installed will never be able to complete the setup.py since the deps are installed in the poetry venv environment, not on the system python path where we're using poetry from.

On a system with 3+ python versions installed, always only using the poetry-system-script python for the temporary build-venv leads to confusion because the the project venv python (or even just the environment $PATH) seems to be never used for installing.

and yeah, the third party packages are doing bad things by directly importing dependence in their setup.py, but it is also slightly confusing how poetry builds using the python of the poetry script and not the python environment of the project venv itself (or even the python of the current shell $PATH).

@sinoroc
Copy link

sinoroc commented Dec 19, 2020

@mattsta Are those projects public (Github or anything like that)? I guess those projects need to use a pyproject.toml to specify build dependencies and clean the setup.py.

@mattsta
Copy link
Author

mattsta commented Dec 19, 2020

you'd think so, right?

But it still seems confusing to build packages using the python of poetry instead of the python of the environment (which could be multiple major releases different, maybe pyenv/conda has deployed 10 python versions on a system, etc). Unless I've been using poetry wrong, poetry isn't installed inside the project environment, it's installed on the system python, but builds should use the project environment (or, at a minimum the python from the $PATH), not whatever version/environment of python is #! in the poetry runner script?

Example live package with broken setup.py: https://github.com/facebookresearch/detectron2/blob/master/setup.py which fails doing poetry add git+https://github.com/facebookresearch/detectron2 because they are directly importing the torch dependency to do a "version check" and we can't hack around their build problem unless we pollute our system python with their dependency import package first.

@finswimmer
Copy link
Member

Hello @mattsta,

in your case poetry needs to build the package twice.

The first one is needed to fetch metadata like name, version and dependencies. poetry tries first to parse these information from the setup.py directly. Only if this fails, it build the wheel package to get the information from there. Usually these metadata should be same under every python version. (This might not always be true, but it's the best guess we can make). So there is no need to switch to the python version used in the venv.

The second one is needed to build the wheel and install it. Here poetry uses the python version it finds within the venv.

In both cases a new build environment is created. If no pyproject.toml is provided for the package (as @sinoroc suggested) the newly created build env just contains setuptools and wheel as package.

So as it stands now, poetry cannot do anything here. It doesn't have to do something with which python poetry uses. But there are things that the maintainers of the package must and can do:

  1. Provide a pyproject.toml that specify the build dependecies according to PEP 518. That's exactly the use case for that
  2. Define the metadata (name, version, dependencies and required python) in a way that can be parsed easily.

fin swimmer

@sinoroc
Copy link

sinoroc commented Dec 19, 2020

I don't understand that bit:

But it still seems confusing to build packages using the python of poetry instead of the python of the environment (which could be multiple major releases different, maybe pyenv/conda has deployed 10 python versions on a system, etc). Unless I've been using poetry wrong, poetry isn't installed inside the project environment, it's installed on the system python, but builds should use the project environment (or, at a minimum the python from the $PATH), not whatever version/environment of python is #! in the poetry runner script?

I do not know poetry's code base well enough. My gut feeling though, is that if it were the case the issue would show up much more often, and it would be a big problem in general. I have not managed to clearly identify what your proof is that poetry is not using an adequate Python interpreter for that step.

This is my debug attempt:

poetry add -vvv git+https://github.com/facebookresearch/detectron2.git
$ poetry add -vvv git+https://github.com/facebookresearch/detectron2.git
Using virtualenv: /home/sinoroc/.cache/pypoetry/virtualenvs/thing-uXt00XdK-py3.8
PackageInfo: PEP517 build failed: Command ['/tmp/tmpzsnzufm1/.venv/bin/python', '-'] errored with the following return code 1, and output: 
WARNING: You are using pip version 20.3.1; however, version 20.3.3 is available.
You should consider upgrading via the '/tmp/tmpzsnzufm1/.venv/bin/python -m pip install --upgrade pip' command.
Traceback (most recent call last):
  File "<stdin>", line 6, in <module>
  File "/tmp/tmpzsnzufm1/.venv/lib/python3.8/site-packages/pep517/meta.py", line 53, in build
    _prep_meta(hooks, env, dest)
  File "/tmp/tmpzsnzufm1/.venv/lib/python3.8/site-packages/pep517/meta.py", line 28, in _prep_meta
    reqs = hooks.get_requires_for_build_wheel({})
  File "/tmp/tmpzsnzufm1/.venv/lib/python3.8/site-packages/pep517/wrappers.py", line 160, in get_requires_for_build_wheel
    return self._call_hook('get_requires_for_build_wheel', {
  File "/tmp/tmpzsnzufm1/.venv/lib/python3.8/site-packages/pep517/wrappers.py", line 255, in _call_hook
    self._subprocess_runner(
  File "/tmp/tmpzsnzufm1/.venv/lib/python3.8/site-packages/pep517/wrappers.py", line 75, in quiet_subprocess_runner
    check_output(cmd, cwd=cwd, env=env, stderr=STDOUT)
  File "/usr/lib/python3.8/subprocess.py", line 411, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/usr/lib/python3.8/subprocess.py", line 512, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/tmp/tmpzsnzufm1/.venv/bin/python', '/tmp/tmpzsnzufm1/.venv/lib/python3.8/site-packages/pep517/_in_process.py', 'get_requires_for_build_wheel', '/tmp/tmpn_ipmtz7']' returned non-zero exit status 1.
input was : import pep517.build
import pep517.meta

path='/tmp/pypoetry-git-detectron2gwunvtzg'
system=pep517.build.compat_system(path)
pep517.meta.build(source_dir=path, dest='/tmp/tmpzsnzufm1/dist', system=system)


  Stack trace:

  10  ~/.pex/installed_wheels/126dc213d0bf69ccd3cd34ceea39c718e5570c87/clikit-0.6.2-py2.py3-none-any.whl/clikit/console_application.py:131 in run
       129│             parsed_args = resolved_command.args
       130│ 
     → 131│             status_code = command.handle(parsed_args, io)
       132│         except KeyboardInterrupt:
       133│             status_code = 1

   9  ~/.pex/installed_wheels/126dc213d0bf69ccd3cd34ceea39c718e5570c87/clikit-0.6.2-py2.py3-none-any.whl/clikit/api/command/command.py:120 in handle
       118│     def handle(self, args, io):  # type: (Args, IO) -> int
       119│         try:
     → 120│             status_code = self._do_handle(args, io)
       121│         except KeyboardInterrupt:
       122│             if io.is_debug():

   8  ~/.pex/installed_wheels/126dc213d0bf69ccd3cd34ceea39c718e5570c87/clikit-0.6.2-py2.py3-none-any.whl/clikit/api/command/command.py:171 in _do_handle
       169│         handler_method = self._config.handler_method
       170│ 
     → 171│         return getattr(handler, handler_method)(args, io, self)
       172│ 
       173│     def __repr__(self):  # type: () -> str

   7  ~/.pex/installed_wheels/a5dad1dbb8e8a2e3c5953d0d8981b5dfd4d855a6/cleo-0.8.1-py2.py3-none-any.whl/cleo/commands/command.py:92 in wrap_handle
        90│         self._command = command
        91│ 
     →  92│         return self.handle()
        93│ 
        94│     def handle(self):  # type: () -> Optional[int]

   6  ~/.pex/installed_wheels/9d866443fc540d58e14e5eba886f313291c0c3e3/poetry-1.1.4-py2.py3-none-any.whl/poetry/console/commands/add.py:107 in handle
       105│             return 0
       106│ 
     → 107│         requirements = self._determine_requirements(
       108│             packages,
       109│             allow_prereleases=self.option("allow-prereleases"),

   5  ~/.pex/installed_wheels/9d866443fc540d58e14e5eba886f313291c0c3e3/poetry-1.1.4-py2.py3-none-any.whl/poetry/console/commands/init.py:320 in _determine_requirements
       318│             return requires
       319│ 
     → 320│         requires = self._parse_requirements(requires)
       321│         result = []
       322│         for requirement in requires:

   4  ~/.pex/installed_wheels/9d866443fc540d58e14e5eba886f313291c0c3e3/poetry-1.1.4-py2.py3-none-any.whl/poetry/console/commands/init.py:410 in _parse_requirements
       408│                         pair["extras"] = extras
       409│ 
     → 410│                     package = Provider.get_package_from_vcs(
       411│                         "git", url.url, rev=pair.get("rev")
       412│                     )

   3  ~/.pex/installed_wheels/9d866443fc540d58e14e5eba886f313291c0c3e3/poetry-1.1.4-py2.py3-none-any.whl/poetry/puzzle/provider.py:202 in get_package_from_vcs
       200│             revision = git.rev_parse(reference, tmp_dir).strip()
       201│ 
     → 202│             package = cls.get_package_from_directory(tmp_dir, name=name)
       203│             package._source_type = "git"
       204│             package._source_url = url

   2  ~/.pex/installed_wheels/9d866443fc540d58e14e5eba886f313291c0c3e3/poetry-1.1.4-py2.py3-none-any.whl/poetry/puzzle/provider.py:285 in get_package_from_directory
       283│         cls, directory, name=None
       284│     ):  # type: (Path, Optional[str]) -> Package
     → 285│         package = PackageInfo.from_directory(path=directory).to_package(
       286│             root_dir=directory
       287│         )

   1  ~/.pex/installed_wheels/9d866443fc540d58e14e5eba886f313291c0c3e3/poetry-1.1.4-py2.py3-none-any.whl/poetry/inspection/info.py:541 in from_directory
       539│                         info = cls.from_setup_files(path)
       540│                     else:
     → 541│                         info = cls._pep517_metadata(path)
       542│                 except PackageInfoError:
       543│                     if not info:

  PackageInfoError

  Unable to determine package info for path: /tmp/pypoetry-git-detectron2gwunvtzg
  
  Fallback egg_info generation failed.
  
  Command ['/tmp/tmpzsnzufm1/.venv/bin/python', 'setup.py', 'egg_info'] errored with the following return code 1, and output: 
  Traceback (most recent call last):
    File "setup.py", line 10, in <module>
      import torch
  ModuleNotFoundError: No module named 'torch'

  at ~/.pex/installed_wheels/9d866443fc540d58e14e5eba886f313291c0c3e3/poetry-1.1.4-py2.py3-none-any.whl/poetry/inspection/info.py:502 in _pep517_metadata
      498│                 try:
      499│                     venv.run("python", "setup.py", "egg_info")
      500│                     return cls.from_metadata(path)
      501│                 except EnvCommandError as fbe:
    → 502│                     raise PackageInfoError(
      503│                         path, "Fallback egg_info generation failed.", fbe
      504│                     )
      505│                 finally:
      506│                     os.chdir(cwd.as_posix())

Seems fairly normal to me, poetry seems to use build isolation to try and build and/or get the metadata out of the cloned project directory. And it fails since there is no torch in that build environment.


I cloned the repository and then tried poetry add ../detectron2 and it gives the same error, as expected. So I added the following pyproject.toml and then poetry add ../detectron2 works without any issue:

[build-system]
build-backend = 'setuptools.build_meta'
requires = [
    'setuptools',
    'torch',
]

See facebookresearch/detectron2#510 for why adding a pyproject.toml to that repository might be problematic (although, pytorch is also being annoying for seemingly no insurmountable reason on that topic).

@mattsta
Copy link
Author

mattsta commented Dec 19, 2020

Usually these metadata should be same under every python version. (This might not always be true, but it's the best guess we can make). So there is no need to switch to the python version used in the venv.

but but but... using the project environment as the first build step seems reasonable, right? If there's no difference, then what's the harm in using the project's environment for both build stages?

So as it stands now, poetry cannot do anything here.

Using the currently activated project env to build dependencies seems logical, or at a minimum, consulting the current path for which python to use before falling back to forcing whichever python is running poetry would be the least surprising behavior.

@finswimmer
Copy link
Member

but but but... using the project environment as the first build step seems reasonable, right? If there's no difference, then what's the harm in using the project's environment for both build stages?

Building an external package should never happen within the current environment. Building the package can introduce new dependencies only for the build step, that you don't want to have in your environment. So whether you use the python version that's available within the venv or outside doesn't matter here. Because this python is only used to create a new temporary venv with no other packages available then setuptools and wheel.

@sinoroc
Copy link

sinoroc commented Dec 19, 2020

using the project environment as the first build step seems reasonable, right? If there's no difference, then what's the harm in using the project's environment for both build stages?

Using the currently activated project env to build dependencies seems logical, or at a minimum, consulting the current path for which python to use before falling back to forcing whichever python is running poetry would be the least surprising behavior.

I am not sure I follow. How would that help?

@mattsta
Copy link
Author

mattsta commented Dec 19, 2020

Building an external package should never happen within the current environment. Building the package can introduce new dependencies only for the build step, that you don't want to have in your environment.

ooh, got it. Thanks for the insight! Sorry for the mistaken assumptions.

I am not sure I follow. How would that help?

In this one case, it would let us override the initial setup.py evaluation environment with the current project environment already containing the torch dependency (which doesn't exist in the clean system environment, but a temporary hacked-together install is better than no install sometimes).

Closing as "notabug; other projects need to get their act together and stop breaking established norms of the language ecosystem!"

@mattsta mattsta closed this as completed Dec 19, 2020
@sinoroc
Copy link

sinoroc commented Dec 19, 2020

@mattsta
If I were you, I think maybe I would look into deploying a private local index (i.e. a PyPI alternative like devpi) and upload my own pre built wheels of those "difficult" projects (detectron2, pytorch, etc.) on it.

Copy link

github-actions bot commented Mar 2, 2024

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 2, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants