Request for uv.lock to support different index urls across different developer machines and CI environments #6349

Open
humanzz opened this issue Aug 21, 2024 · 20 comments
Labels
wish Not on the immediate roadmap

Comments

@humanzz

humanzz commented Aug 21, 2024

Hello team,

Many thanks for the awesome work you're doing, and the countless amounts of time you've saved everyone using uv (and ruff).

I've seen the recent blog post about 0.3.0 and the accompanying documentation update, and from playing a bit to test commands like uv lock and uv sync, I actually have a question about uv.lock.

I've described the setup I use in #1710 (a proxy, and using invoke, etc.). That setup means that each team member's machine is likely to have a different INDEX_URL, and even the CI/CD system we use would have a different one.

I've generated the uv.lock file using UV_INDEX_URL=... uv lock, and the main thing I noticed is that the index url is included in the lock file e.g.

[[package]]
name = "aws-embedded-metrics"
version = "3.2.0"
source = { registry = "http://<omitted>/pypi/simple" }
dependencies = [
    { name = "aiohttp" },
]
sdist = { url = "http://<omitted>/pypi/simple/aws-embedded-metrics/3.2.0/aws-embedded-metrics-3.2.0.tar.gz", hash = "sha256:f235f87ab25ff328f6f3afca1c6b3218e81eea6e96e6aee012d368bb813fae7b" }
wheels = [
    { url = "http://<omitted>/pypi/simple/aws-embedded-metrics/3.2.0/aws_embedded_metrics-3.2.0-py3-none-any.whl", hash = "sha256:887b76d24914efa5fc42a7b77983e77fc670633e6e1195aac7653c425fee7399" },
]

I have not tried using that file across our different machines yet, but I suspect it's going to cause problems so I wanted to confirm with you folks

  • What would happen if the index in the lock file is not available, and UV_INDEX_URL points to a different index?
  • Is there a way to not include the full index url in the lock file?
@humanzz
Author

humanzz commented Aug 24, 2024

I've tested it further and confirmed that uv sync fails on the other machines, because there UV_INDEX_URL points to a different index url.

With that, I think this is a request to let uv lock skip writing the index url into the lock file, and to let uv sync and other commands use a lock file that does not contain index urls

and to be more specific, this is about

  • source.registry
  • sdist.url
  • wheels.url

fields

@humanzz humanzz changed the title Question about uv.lock, custom index urls, and its usage by the different commands Request to enable uv.lock to not include the index url Aug 24, 2024
@charliermarsh charliermarsh added the wish Not on the immediate roadmap label Aug 26, 2024
@mgab

mgab commented Sep 2, 2024

I discovered uv recently and I'm pretty excited about it! So even if we're discussing issues, thanks! 🫶

Now, coming to the topic, I think this becomes important given the ability to set up index-url and extra-index-url as user-level configuration, and potentially as system-level configuration (#6742).

Right now if you have a user-level config pointing at a package repository that acts as a proxy to PyPI (and might add private packages on top), you will record that url in any lock file you generate. So if you want to work on some public project where uv is used as package manager, either

  1. the project -that you might not control- specifically defines index-url and extra-index-url to point at PyPI
  2. you disable the user-level config
  3. or you will generate lock files that people outside of your org will not be able to use

Option 2 might not seem that bad, but it defeats the purpose of being able to define user-level or system-level configs. Plus, it might not be the ideal setup for taking advantage of uvx and uv tool with private packages that are only available in the private repository.

@humanzz
Author

humanzz commented Sep 4, 2024

Just adding a couple more comments

  • I did see, on some other issue/PR, a mention that thanks to including those full paths, uv doesn't need to do any further resolution and can go straight to installing things, which saves some time and sounds great
  • In the case of the setup we're using, I noticed something which might be useful to call out
    • when the index is the public PyPI index, i.e. https://pypi.org/simple, the package artifacts have urls with a prefix looking like https://files.pythonhosted.org/packages/c1/08/1bb1f7392cd9c3a65c4571a7f7286056ec52b3cd3d432485ca2b9907e0f9/. For example, for aws-embedded-metrics, it's https://files.pythonhosted.org/packages/c1/08/1bb1f7392cd9c3a65c4571a7f7286056ec52b3cd3d432485ca2b9907e0f9/aws-embedded-metrics-3.2.0.tar.gz
    • in my case, with a custom index that uses a local proxy on dev/ci machines, i.e. http://<omitted>/pypi/simple, the package artifact urls follow a more deterministic url pattern, e.g. http://<omitted>/pypi/simple/<package name>/<package version>/. The same aws-embedded-metrics would be http://<omitted>/pypi/simple/aws-embedded-metrics/3.2.0/aws-embedded-metrics-3.2.0.tar.gz.

This determinism can actually lead to simpler handling of this. I tested the following:

  1. Generated the uv.lock with UV_INDEX_URL set to my custom index
  2. Used it on other machines by running another small script, prior to uv sync, that modifies uv.lock to replace the index http://<omitted>/pypi/simple with the content of UV_INDEX_URL

All of this makes me wonder if there's some solution to cases like mine, along the lines of

  • Commands like uv lock having a flag to indicate usage of placeholder index urls, where the lock file would contain a placeholder for UV_INDEX_URL or the extra indices
  • Commands like uv sync taking a similar flag, where they'd use the values passed through UV_INDEX_URL to resolve the placeholder values in uv.lock
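The placeholder idea in the two bullets above can be sketched as a pair of plain text substitutions. This is only an illustration of the proposal, not an existing uv feature: the `${UV_INDEX_URL}` token, both function names, and the example hosts are all hypothetical, and the approach only works when artifact URLs share the index URL as a prefix (true for the proxy described here, not for PyPI).

```python
# Hypothetical token; uv does not recognise placeholders in uv.lock.
PLACEHOLDER = "${UV_INDEX_URL}"


def to_placeholder(lock_text: str, index_url: str) -> str:
    """Replace the concrete index URL with the placeholder (lock side)."""
    return lock_text.replace(index_url.rstrip("/"), PLACEHOLDER)


def from_placeholder(lock_text: str, index_url: str) -> str:
    """Substitute the machine-local index URL back in (sync side)."""
    return lock_text.replace(PLACEHOLDER, index_url.rstrip("/"))


# Example hosts are made up for the demonstration.
DEV_INDEX = "http://127.0.0.1:8080/pypi/simple"
CI_INDEX = "https://ci-proxy.example/pypi/simple"

locked = (
    'source = { registry = "http://127.0.0.1:8080/pypi/simple" }\n'
    'sdist = { url = "http://127.0.0.1:8080/pypi/simple/'
    'aws-embedded-metrics/3.2.0/aws-embedded-metrics-3.2.0.tar.gz" }\n'
)

portable = to_placeholder(locked, DEV_INDEX)    # what would be committed
resolved = from_placeholder(portable, CI_INDEX)  # what CI would install from
print(portable)
print(resolved)
```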

@smheidrich

smheidrich commented Sep 21, 2024

What follows is a +1 comment you should all ignore, but I'm writing it to include some keywords in this issue so other people who run into this can find it more easily than I just did (took me a while):


Having the same issue because I fetch all my PyPI packages through a local devpi(keyword 1) installation, so all source.registry and sdist.url URLs in uv.lock say http://localhost:3141/...(keyword 2) and, needless to say, this results in Connection refused(keyword 3) errors when trying to install the package elsewhere, e.g. CI systems such as GitHub Actions(keyword 4) or GitLab CI/CD(keyword 5).
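One way to catch this before CI does is to scan the lock file for loopback hosts. The following is a rough sketch under stated assumptions: it treats any quoted `http(s)://` string as a URL, the function name `loopback_urls` is invented, and the sample URLs are placeholders rather than real devpi paths.

```python
import re
from urllib.parse import urlsplit

# Any double-quoted http(s) URL in the lock text.
URL_RE = re.compile(r'"(https?://[^"]+)"')
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def loopback_urls(lock_text: str) -> list[str]:
    """Return the URLs in the lock text that point at the local machine."""
    return [
        url
        for url in URL_RE.findall(lock_text)
        if urlsplit(url).hostname in LOOPBACK_HOSTS
    ]


# Placeholder content; the paths are illustrative only.
SAMPLE = '''
source = { registry = "http://localhost:3141/example/index" }
sdist = { url = "https://index.example/files/pkg-1.0.tar.gz" }
'''

offenders = loopback_urls(SAMPLE)
if offenders:
    print("lock file references local URLs:", offenders)
```

A check like this could run as a pre-commit step so that a lock file pointing at localhost never reaches CI.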

@humanzz
Author

humanzz commented Sep 21, 2024

The more comments this gets, the more I think there needs to be an option for the lock file to not include any index information.

That would cover all those use cases with local proxies, which differ between developer machines and CI environments and are likely to follow different conventions for their asset (wheel/sdist) URLs. A simple, general option to exclude those index/asset URLs would be very useful and would enable those users to adopt uv.lock

@tapetersen

We could really also use this to be able to have lock-files without disclosing internal urls.
If not in the sync file then an option to the export subcommand to export a new uv.lock but with this or other things omitted would also be a good alternative

@charliermarsh
Member

I'm not sure that a lockfile without URLs is a good idea. One of the goals of the lockfile is to be hermetic: you can install from just the lockfile without making any queries or relying on any external sources.

If you're going to omit source information like that, why not just export to requirements.txt?

@tapetersen

@charliermarsh That may very well work for our requirements, but it looks like the uv lockfile is richer (all the extras are declared). Otherwise it would just be a convenience for someone outside (who doesn't have access to internal DNS names) to be able to use the lock file without having to regenerate it from the requirements.txt (pardon me if I've missed a quicker way to handle that).

Basically my thought was that the hashes should be enough to guarantee that it's indeed the same packages that are being installed even if the url has to be found again.

That said if it doesn't feel like the right way to do it and the recommended way is to manage that with requirements.txt exports we'll try that and raise a more specific issue or discussion if a showstopper or question regarding that comes up.

(feel free to edit/move/hide post if it derails original thread)
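The point about hashes can be illustrated with a short Python sketch: whatever URL a file was fetched from, comparing its digest against the `hash` value recorded in the lock file tells you whether it is the locked artifact. The function name `matches_lock_hash` and the payload bytes are made up for the example; this is not how uv itself is implemented.

```python
import hashlib
import hmac


def matches_lock_hash(data: bytes, lock_hash: str) -> bool:
    """Check file bytes against a lock-style hash such as 'sha256:<hex>'."""
    algorithm, _, expected = lock_hash.partition(":")
    actual = hashlib.new(algorithm, data).hexdigest()
    return hmac.compare_digest(actual, expected)


# Stand-in bytes; a real check would read the downloaded wheel or sdist.
payload = b"stand-in for the bytes of a downloaded wheel"
locked = "sha256:" + hashlib.sha256(payload).hexdigest()

print(matches_lock_hash(payload, locked))
print(matches_lock_hash(payload + b" tampered", locked))
```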

@humanzz
Author

humanzz commented Sep 23, 2024

@charliermarsh

At the moment, I'm actually using uv pip compile pyproject.toml... and uv pip install -r ... to simulate having a lockfile.

The main thing I don't like about it is that keeping the lockfile up to date with pyproject.toml dependencies is all manual. I didn't want to always call uv pip compile and then immediately follow with uv pip install, as that would (1) be slower and (2) force updating resolved dependencies that are declared via version ranges.

So, I'd say that's not on par with the experience of something like package.json and package-lock.json for node, and I was hoping that uv.lock and the commands that maintain/use it are giving a better experience.

The other thing I like about uv.lock is tool.uv.dev-dependencies. Without that, I end up declaring those dependencies as optional dependencies, which is really not right, but I ended up doing it this way so I can declare all my dependencies in pyproject.toml and then generate the current requirements.txt "lockfile".

Omitting the URLs is one potential solution - me thinking out loud.

In the case of my proxy, given that the sdist/wheel URLs have a specific pattern with the index url being a prefix, I was able to run some code to manipulate the lock file before running uv sync --all-extras --frozen to install dependencies

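import os
import re

from invoke import task


@task
def update_lock_index_url(ctx):
    """
    Workaround for `uv lock` writing the index url to the lock file.
    Update `source` and `url` fields with values based on `UV_INDEX_URL` environment variable.
    """
    # Local proxy index URL as recorded at lock time (dots escaped so they match literally).
    pattern = r"http://127\.0\.0\.1:\d+/[^/]+/[^/]+/[^/]+/pypi/simple"
    # Falls back to an obviously invalid marker when UV_INDEX_URL is unset.
    replacement = os.environ.get("UV_INDEX_URL", "NO_INDEX_OVERRIDE_SET")
    with open("uv.lock", "r+") as file:
        contents = file.read()
        new_contents = re.sub(pattern, replacement, contents)
        # Rewrite the file in place with the substituted URLs.
        file.seek(0)
        file.truncate()
        file.write(new_contents)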

The downside of this is that any dev on my team will end up with a modified uv.lock.
I had 2 thoughts about this:

  1. uv having this concept of a hook - some python code to execute to modify file content - before installing dependencies
  2. Ability to specify path to a lockfile, whereby I can run the hook myself, to generate a modified version of the lockfile, make sure it's in .gitignore, or in some tmp directory, and use uv sync while pointing to my modified version

I thought the simplest of all of the above would be omitting the URLs from the file altogether, tbh.
Moreover, when I think about uv's general story for specifying index/extra index and how that interacts with commands like uv sync, I think the strong assumption there is that the index URLs are not changing, which at least in my case is not true, whether across dev machines or in the CI environment.

@mgab

mgab commented Sep 30, 2024

I'm not sure that a lockfile without URLs is a good idea. One of the goals of the lockfile is to be hermetic: you can install from just the lockfile without making any queries or relying on any external sources.

If you're going to omit source information like that, why not just export to requirements.txt?

As far as I can see, poetry records the file name and the hash, which should be enough to ensure that you are getting exactly the same files without relying on external sources.

On the other hand, by including the URLs you are coupling the lock file with the specific path of the package repository that you are using. And while this might remove the need of interacting with the package index, it also seems that adding a proxy package repository on the working environment should be transparent to the dependency specification of my project.

@charliermarsh
Member

I consider it fairly critical that the lockfile includes the URLs directly and I suspect that the lockfile standard (should it be accepted) will do the same. We do have a plan to support these use-cases though. After #7481, we'll add something like:

[[tool.uv.index]]
name = "private"
url = "https://private.org/simple"
proxy = "http://<omitted>/pypi/simple"

So you can set the index URL (which will appear in the lockfile) and then the proxy separately. You'll also be able to set the proxy as an environment variable, like UV_PRIVATE_PROXY_URL or similar.

@humanzz
Author

humanzz commented Oct 4, 2024

Hey @charliermarsh,

Just to make sure I understand this, and try to assess whether it'd work.

  • The url in our case is not needed, though I dunno whether the intention is for it to be a valid index, or any url so maybe you can clarify that.
  • I guess the url is what would be saved in index? and the proxy is kinda what replaces it when doing the actual resolving/downloads?
  • for the package urls e.g. sdist, and wheels, how would that work with the proxy? In my earlier messages, I did see https://pypi.org/simple has sdist and wheels that don't have the index as part of their urls e.g. https://files.pythonhosted.org/packages/c1/08/1bb1f7392cd9c3a65c4571a7f7286056ec52b3cd3d432485ca2b9907e0f9/aws-embedded-metrics-3.2.0.tar.gz but in my case, it does have the index in the url e.g. http://<omitted>/pypi/simple/aws-embedded-metrics/3.2.0/aws-embedded-metrics-3.2.0.tar.gz

@charliermarsh
Member

  1. Ideally it is a valid index and indicative of the actual locations of the files but that wouldn't be strictly required.
  2. Correct.
  3. Both the file URLs and the registry URLs would be "replaced" correctly as long as they're on the same domain. So this wouldn't work for PyPI but would work for your case.
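As a rough illustration of point 3 (not uv's implementation, which did not exist at the time of this comment), the same-domain rule could look like the sketch below. It assumes the proxy serves the same paths as the index, so only scheme and host are swapped; `apply_proxy` and both example hosts are hypothetical.

```python
from urllib.parse import urlsplit


def apply_proxy(url: str, index_url: str, proxy_url: str) -> str:
    """Route url through the proxy if it is on the index's host, else keep it."""
    index = urlsplit(index_url)
    target = urlsplit(url)
    if (target.scheme, target.netloc) != (index.scheme, index.netloc):
        return url  # different domain: nothing to rewrite
    proxy = urlsplit(proxy_url)
    # Assumption: identical path layout on index and proxy.
    return target._replace(scheme=proxy.scheme, netloc=proxy.netloc).geturl()


INDEX = "https://private.example/pypi/simple"  # value recorded in the lockfile
PROXY = "http://127.0.0.1:8080/pypi/simple"    # machine-local proxy

same_domain = (
    "https://private.example/pypi/simple/"
    "aws-embedded-metrics/3.2.0/aws-embedded-metrics-3.2.0.tar.gz"
)
other_domain = (
    "https://files.pythonhosted.org/packages/c1/08/"
    "1bb1f7392cd9c3a65c4571a7f7286056ec52b3cd3d432485ca2b9907e0f9/"
    "aws-embedded-metrics-3.2.0.tar.gz"
)

print(apply_proxy(same_domain, INDEX, PROXY))   # rewritten to the proxy host
print(apply_proxy(other_domain, INDEX, PROXY))  # left unchanged
```

The second call shows why this would not help with PyPI: its files live on files.pythonhosted.org, a different domain from the index.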

@humanzz
Author

humanzz commented Oct 5, 2024

Great... then I look forward to that :)

@mgab

mgab commented Oct 10, 2024

I consider it fairly critical that the lockfile includes the URLs directly and I suspect that the lockfile standard (should it be accepted) will do the same.

Sorry! Actually, reading about the proposed standard I found this comment that properly describes what was my concern, and then the answer to it. The point that I was missing is in fact addressed in the PEP draft:

  • Installers MUST NOT assume the URL will always work, but installers MAY use the URL if it happens to work.

So yeah, I understood that URLs would be assumed to work, and that didn't seem a great idea. I see that's not the case, so nothing to object!
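The rule quoted from the PEP draft can be sketched as "try the recorded URL first, fall back to other candidates, and let the hash decide". The code below is a toy model under stated assumptions: `fetch_verified`, `fake_download`, and all URLs are invented for the example, and the download step is injected so the sketch runs without network access.

```python
import hashlib
from typing import Callable, Iterable


def fetch_verified(
    candidate_urls: Iterable[str],
    expected_sha256: str,
    download: Callable[[str], bytes],
) -> tuple[str, bytes]:
    """Return (url, data) for the first candidate whose bytes match the hash."""
    for url in candidate_urls:
        try:
            data = download(url)
        except OSError:
            continue  # the URL did not work; it was never guaranteed to
        if hashlib.sha256(data).hexdigest() == expected_sha256:
            return url, data
    raise LookupError("no candidate URL produced a file with the locked hash")


# Toy stand-ins: one unreachable locked URL, one reachable mirror.
ARTIFACT = b"stand-in for aws-embedded-metrics-3.2.0.tar.gz"
LOCKED_URL = "http://localhost:3141/files/aws-embedded-metrics-3.2.0.tar.gz"
MIRROR_URL = "https://mirror.example/files/aws-embedded-metrics-3.2.0.tar.gz"
REACHABLE = {MIRROR_URL: ARTIFACT}


def fake_download(url: str) -> bytes:
    """Simulate a download; unknown URLs behave like a refused connection."""
    if url not in REACHABLE:
        raise ConnectionRefusedError(url)
    return REACHABLE[url]


used_url, data = fetch_verified(
    [LOCKED_URL, MIRROR_URL],
    hashlib.sha256(ARTIFACT).hexdigest(),
    fake_download,
)
print("installed from", used_url)
```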

What got me confused is that apparently uv sync does assume the URLs work unless you explicitly configure a value for either index-url or extra-index-url somewhere. I reported it in #8076 and #8074, hope it's useful. In any case, that's just about continuing to improve uv and not a design disagreement.

Thanks for the answers!

@humanzz
Author

humanzz commented Oct 17, 2024

I saw that the revamped index support got released in 0.4.23.

Are there any separate issues to track the proxy/env variable support, or is this issue itself tracking that?

@charliermarsh
Member

I think it’s ok to track it here.

@humanzz humanzz changed the title Request to enable uv.lock to not include the index url Request to support uv.lock to not include the index url Oct 28, 2024
@humanzz humanzz changed the title Request to support uv.lock to not include the index url Request for uv.lock to support different index urls across different developer machines and CI environments Oct 28, 2024
@rd-danny-fleer

rd-danny-fleer commented Dec 2, 2024

May I ask what's the current status of the issue? What aspects remain unclear?

Currently I use a workaround (Run uv lock once in the CICD pipeline with the alternative index URL and use the updated uv.lock file in the whole pipeline) to make CICD work with an alternative index. However, with this approach, you must be careful not to update any dependencies.

@charliermarsh
Member

I don't know if anything is unclear, it's more that it hasn't been prioritized.

@dariocurr

Any updates on this? This is critical for us to ship uv in our pipelines


7 participants