Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support only re-creating session venv if poetry.lock has changed #1063

Open
johnthagen opened this issue Jan 27, 2023 · 5 comments
Open

Support only re-creating session venv if poetry.lock has changed #1063

johnthagen opened this issue Jan 27, 2023 · 5 comments

Comments

@johnthagen
Copy link
Contributor

nox-poetry currently by default always recreates the virtual environment on every invocation of the session. This has the advantage of ensuring that the environment is always up to date, but at the cost that environments will be recreated unnecessarily when there are no changes to the locked dependencies.

Consider this small example project:

[tool.poetry]
name = "poetry-test"
version = "0.1.0"
description = ""
authors = []

[tool.poetry.dependencies]
python = "^3.10, <3.12"

[tool.poetry.group.dev.dependencies]
nox-poetry = "*"

[tool.poetry.group.lint.dependencies]
flake8 = "*"
flake8-bugbear = "*"
flake8-broken-line = "*"
flake8-comprehensions = "*"
pep8-naming = "*"
flake8-pyproject = "*"

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
# noxfile.py
from nox_poetry import Session, session


@session
def lint(s: Session) -> None:
    s.install(
        "flake8",
        "flake8-bugbear",
        "flake8-broken-line",
        "flake8-comprehensions",
        "pep8-naming",
        "flake8-pyproject"
    )
    s.run("flake8")

Running a simple lint invocation recreates the environment every time. Even when the wheels are cached on the user's system, the recreation can take significant time and add delay into the developer's workflow:

$ time nox -s lint
nox > Running session lint
nox > Creating virtual environment (virtualenv) using python in .nox/lint
nox > poetry export --format=requirements.txt --dev --without-hashes
The `--dev` option is deprecated, use the `--with dev` notation instead.
nox > python -m pip install --constraint=.nox/lint/tmp/requirements.txt flake8 flake8-bugbear flake8-broken-line flake8-comprehensions pep8-naming flake8-pyproject
nox > flake8 
nox > Session lint was successful.
nox -s lint  16.02s user 2.58s system 90% cpu 20.528 total

On a Macbook Pro, it took 16 seconds to recreate this relatively small venv.

If we were to reuse it:

time nox --reuse-existing-virtualenvs -s lint
nox > Running session lint
nox > Re-using existing virtual environment at .nox/lint.
nox > python -m pip install --constraint=.nox/lint/tmp/requirements.txt flake8 flake8-bugbear flake8-broken-line flake8-comprehensions pep8-naming flake8-pyproject
nox > flake8 
nox > Session lint was successful.
nox --reuse-existing-virtualenvs -s lint  0.90s user 0.29s system 96% cpu 1.227 total

Only 0.9 seconds.

nox-poetry has an advantage over Nox in that we know that the user is using Poetry. Poetry includes a helpful hash of the entire lockfile and important Poetry-related metadata from pyproject.toml.

# poetry.lock
[metadata]
lock-version = "2.0"
python-versions = "^3.10, <3.12"
content-hash = "f8d0beddbcf539e1dcd2e67603593a3c1b19a31df9b295b066d498110aa03b51"

If the content-hash were read using tomlkit (which is already a dependency of nox-poetry) and stored with each venv, then nox-poetry could very quickly check if a virtual environment needed to be recreated or not. This would speed up local development using nox-poetry greatly.

One open question could be, do we want to also try to handle the case when there are changes to the session itself? For example if different packages were passed to Session.install()? Perhaps there would be a simple way to add that with the lockfile hash and then it would be possible to skip the entire install step.

This can be demonstrated with this edit to the session:

from nox_poetry import Session, session


@session(venv_backend="none")
def lint(s: Session) -> None:
    s.run("flake8")

Avoiding the install entirely speeds up the command even more (because we are skipping the pip install --contraint... step which still needs to validate that the venv has all of the needed requirements.

time nox -s lint
nox > Running session lint
nox > flake8 
nox > Session lint was successful.
nox -s lint  0.30s user 0.12s system 96% cpu 0.436 total
@johnthagen
Copy link
Contributor Author

For previous art in this area, Tox 4 added support for the equivalent of this for requirements.txt files:

An older Tox plugin

@johnthagen
Copy link
Contributor Author

johnthagen commented Feb 8, 2023

Upon closer inspection, content-hash only covers the changes in pyproject.toml

So for nox-poetry, we'd want to SHA256 hash the full poetry.lock file and store/compare that as any changes to the locked dependencies also need to accounted for.

What's convenient is that because the pyproject.toml content-hash exists, we don't need to worry about pyproject.toml directly as we should be able to assume the user is keeping their lockfile fresh (using something like poetry lock --check) or else locking isn't defined any longer.

@johnthagen johnthagen changed the title Support only re-creating session venv if Poetry lock hash has changed Support only re-creating session venv if poetry.lcok has changed Feb 8, 2023
@johnthagen johnthagen changed the title Support only re-creating session venv if poetry.lcok has changed Support only re-creating session venv if poetry.lock has changed Feb 8, 2023
@cjolowicz
Copy link
Owner

I'm not sure I understand why you don't just use nox -r? If you want to avoid even the install commands, use nox -R instead.

Is your suggestion that nox-poetry keep track of the content-hash and imply one of those flags if that hash doesn't change? I would argue that "explicit is better than implicit". I would also consider this out of scope for nox-poetry--we're just the minimal glue code for Nox to make the installations respect Poetry's lock files.

@johnthagen
Copy link
Contributor Author

Is your suggestion that nox-poetry keep track of the content-hash and imply one of those flags if that hash doesn't change?

Yes.

Let me try to explain the motivation for this:

  1. Make running repeated Nox sessions as fast possible (by avoiding rebuilding venvs on every invocation)
  2. Avoid accidentally running a Nox session with a stale venv that doesn't have the correct dependencies installed

In the base Nox configuration, these two goals are at odds. You can set nox.options.reuse_existing_virtualenvs = True so that the commands run instantly and skip the install step, but this has the risk that if your lock dependencies change, you could be running with an incorrect/stale venv.

In practice, this happens on projects with large teams. It's not always obvious to know/remember to use -r or -R if other team members are merging PRs that might change dependencies. You could also be working on 5-10 different Poetry/Nox projects.

The rationale for the speed, we use Nox to provide a simple way to run tools such as black, isort, flake8, mypy, etc. The time for Nox to recreate a venv and install dependencies into it can often times dwarf the time of actually running the tools themselves, leading to a less-than-ideal UX for the developer.

With Poetry, we have at least the ability to do what Tox does and smartly recreate the venv only when needed, based on hashing poetry.lock. The Tox team added this feature for similar reasons listed above.

I would also consider this out of scope for nox-poetry

I can certainly understand/appreciate this! Perhaps a new/different Nox/poetry tool could take this on.

@cjolowicz
Copy link
Owner

You can set nox.options.reuse_existing_virtualenvs = True so that the commands run instantly and skip the install step, but this has the risk that if your lock dependencies change, you could be running with an incorrect/stale venv.

Nox doesn't skip the install step if you set this option. You need to pass --no-install or -R for that.

I don't recommend defaulting to venv reuse in the noxfile; it should be an explicit decision. Tox has a different model here AFAIU--they cache by default.

Generally, I'd recommend avoiding reuse in CI, as well as locally when you start new work or sync with origin. Then, as you iterate on your work locally, you can choose to speed up things with nox -R--you'll know if you've touched the dependencies.

If you want automation to enforce correct practice across large teams, how about doing the work yourself in the noxfile.py? You could read the current and previous content-hash from poetry.lock and a local cache, and set reuse_existing_virtualenvs based on the outcome.

I can certainly understand/appreciate this! Perhaps a new/different Nox/poetry tool could take this on.

As nox-poetry author, I have zero issues with this. From the point of view of Nox, it's not ideal though. It would be better if people used the flexibility of noxfile.py before creating more inofficial wrappers. If people really need more flexibility than that, maybe Nox can add official extension points. Also, I'd hope that eventually a PyPA lockfile standard could improve this whole situation, and we'll no longer need tool-specific workarounds.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants