Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configure private pip repositories in the environment file #481

Conversation

jacksmith15
Copy link
Contributor

@jacksmith15 jacksmith15 commented Aug 9, 2023

Description

This PR adds support for specifying private pip repositories in the environment.yml file, similar to how channels are specified.

Similarly to channels, environment variables may be specified, and these will remain as references in the lockfile.

channels:
  - conda-forge
pip-repositories:
  - http://$PIP_USER:[email protected]/api/pypi/simple
dependencies:
  - python=3.11
  - requests=2.26
  - pip:
    - fake-private-package==1.0.0

This aims to solve #460, and is an alternative approach to the one proposed by #471

@netlify
Copy link

netlify bot commented Aug 9, 2023

Deploy Preview for conda-lock ready!

Name Link
🔨 Latest commit 49e624f
🔍 Latest deploy log https://app.netlify.com/sites/conda-lock/deploys/6501c88db222770008e27034
😎 Deploy Preview https://deploy-preview-481--conda-lock.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@@ -1,126 +1,14 @@
"""
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of this has been extracted to conda_lock/models/package_source.py, which contains the shared logic common to both channels and pip repositories.

Comment on lines 398 to 407
def _normalize_url(url: str, pip_repositories: Optional[List[PipRepository]] = None) -> str:
if not pip_repositories:
return url
for pip_repository in pip_repositories:
specified_url = urlparse(pip_repository.url)
repository_host = specified_url.scheme + "://" + specified_url.netloc
repository_host_expanded = expandvars(repository_host)
if url.startswith(repository_host_expanded):
url = url.replace(repository_host_expanded, repository_host, 1)
return url
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This differs from the way environment variables are re-substituted in conda channels. This is because generally speaking, most pip repositories link to files outside of the Python simple/ API path.

E.G. pypi/api/simple would be the configured path for resolution, but the resolved file URL might be pypi/<package-name>/<package-name>-1.0.0.tar.gz.

Because of this, we need to make some assumptions based on the netloc instead of full matches.

@maresb
Copy link
Contributor

maresb commented Aug 10, 2023

Hey, I took a look, and I like the direction this is heading, thanks so much for this!

I'm impressed by your PyPI mock. :)

I see that the first commit is primarily a combination of moving code from channel.py to package_source.py, pushing through the pip_repositories argument, and adding a tiny bit of code to make the pip_repositories argument work. But given the large subtractions and additions to channel.py and package_source.py, it took quite some effort to manually diff everything before I managed to deduce this. It would help me a lot with review if you put any subsequent cut-paste operations into their own commits so that I can distinguish between moved code and altered code.

Finally, feel free to be bold about refactoring and simplifying existing code if it helps. You're touching a lot of code that's probably long overdue for an overhaul. For example, I'd prefer to use libraries when it makes things easier.

@jacksmith15
Copy link
Contributor Author

jacksmith15 commented Aug 10, 2023

I note an issue with the current approach, which is unfortunately not detected by the test.

The poetry solver makes requests to the repository using requests, which appears to automatically swap URL-based Basic Auth with header-based. This is relatively sensible, however it does mean that this debug warning will almost always fire.

More annoyingly, the solver uses the URL from the response not the one used to make the request. This means that if the solver makes a request to https://username:[email protected]/simple/package, the resulting URL for the package returned by the solver will be https://private-pypi.org/simple/package, which obviously doesn't include the secrets, but also doesn't include the environment variables.

I can only think of one good solution here, which is to inspect each resolved package URL and attempt to match them to the configured pip-repositories, and then insert the basic auth back into the URL (as environment variables). The link objects returned by the solver can be used to match on repositories, because they include they include a reference to the Page and headers of the request.

I'll investigate this route.

📝 This might be more straightforward to implement (i.e. fewer edge cases, more predictable for users) if the configuration format were more restrictive. E.g.

pip-repositories:
    - url: https://private-pypi.org/pypi/simple
      basic-auth:
        username: $USERNAME
        password: $PASSWORD

@maresb
Copy link
Contributor

maresb commented Aug 10, 2023

One of my top priorities is to upgrade the vendored version of Poetry. I'm not sure whether or not this would help, but anyways I want to depend on Poetry as little as possible. But if it's tied into their solver I suppose we have no choice but to play along for now, since we can't modularize the solver overnight.

@mariusvniekerk
Copy link
Collaborator

We should also support parsing credentials dynamically out of netrc files (this works quite well for a lot of conda package registries)

Comment on lines +39 to +41
if not solver_url.startswith(self.stripped_base_url):
# The resolved package URL is at a different host to the repository
return solver_url
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an interesting edge case.

Basically this is where the repository is at http://private-pypi.org, but that index directs the solver to some other host, e.g. http://packages.private-pypi.org or http://somewhere.completely.different.

Currently in this case the lockfile will just use the URL from the solver as-is, without any auth.

An alternative would be to just add the auth to the URL anyway? WDYT?

@jacksmith15 jacksmith15 marked this pull request as ready for review September 1, 2023 17:19
@jacksmith15 jacksmith15 requested a review from a team as a code owner September 1, 2023 17:19
@jacksmith15
Copy link
Contributor Author

@maresb @mariusvniekerk

I (finally) found enough time to get this in a working state, ready-for-review. Let me know what you think.

@maresb
Copy link
Contributor

maresb commented Sep 6, 2023

@jacksmith15 I'm doing a lot of traveling at the moment; this is in my review queue, but I'm still looking for a chunk of time when I can go through this carefully. Feel free to pester me, but I can't promise any timescale yet. Perhaps @mariusvniekerk could help if available.

@jacksmith15
Copy link
Contributor Author

@maresb I've re-opened this PR in #529 with up-to-date changes from main and a cleaner git history. Keen to get your thoughts.

@jacksmith15 jacksmith15 deleted the feat/460/private-pypi-repository-configuration branch October 14, 2023 11:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants