Python: buildPyPIPackage and generated data #16005

Closed
FRidh wants to merge 2 commits

FRidh commented Jun 5, 2016

Motivation for this change

In #15007 I presented a script to generate metadata from PyPI and a function to build Python packages using that metadata.

In this PR the metadata is no longer stored in Nixpkgs but in an external repository, https://github.com/FRidh/srcs-pypi.

Things done
  • Tested using sandboxing
    (nix.useSandbox on NixOS,
    or option build-use-sandbox in nix.conf
    on non-NixOS)
  • Built on platform(s)
    • NixOS
    • OS X
    • Linux
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nox --run "nox-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Fits CONTRIBUTING.md.

These two functions wrap buildPythonPackage and buildPythonApplication, respectively, but by default use a repository with metadata retrieved from PyPI.

FRidh commented Jun 5, 2016

At the time of writing, the generated JSON is 80 MB.

Example usage:

toolz = buildPyPIPackage {
  name = "toolz";
  buildInputs = with self; [ nose ];
  checkPhase = ''
    nosetests toolz/tests
  '';
};

By default it picks the latest version. Optionally a version is passed in. I suppose within Nixpkgs we would always be explicit about the versions.
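
For illustration, pinning a version explicitly might look like the sketch below. The attribute name `version` and the value "0.7.4" are assumptions made for this example; they are not taken from the PR.

```nix
# Sketch only: assumes buildPyPIPackage accepts a `version` attribute.
# The attribute name and the version string are hypothetical.
{ buildPyPIPackage, nose }:

buildPyPIPackage {
  name = "toolz";
  version = "0.7.4";  # hypothetical pinned version; omit to get the latest
  buildInputs = [ nose ];
  checkPhase = ''
    nosetests toolz/tests
  '';
}
```

Written as a function of its dependencies, an expression like this could be passed to callPackage in the usual way.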

garbas commented Jun 6, 2016

💯 could we get that srcs-pypi repo into nixos organization? and change the name to nixpkgs-python? that would then be a good place to rethink how we do python packaging.

# Use src if given. Otherwise, pick the right version via PyPI.
src = attrs.src or attrs.srcs or (fetchurl data.versions.${version});

in buildPythonPackage ( attrs // {name = name + "-" + version; src=src; meta = pypimeta;} )
Member Author

have to fix meta data appending here
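
One possible reading of this note: `attrs // { meta = pypimeta; }` replaces any `meta` supplied by the caller instead of combining the two. A minimal sketch of a merge in which caller-supplied fields win is shown below. The attribute values are made up, and this is not necessarily the fix the author had in mind.

```nix
# Sketch only: the values of pypimeta and attrs are hypothetical.
let
  pypimeta = {
    description = "Description from PyPI";
    license = "BSD";
  };
  attrs = {
    name = "toolz";
    meta = { description = "Description set by the caller"; };
  };
in
  attrs // {
    # PyPI metadata provides defaults; the caller's meta overrides them.
    # `attrs.meta or {}` covers callers that pass no meta at all.
    meta = pypimeta // (attrs.meta or {});
  }
```

Evaluating this keeps the caller's description and adds the license from the PyPI metadata.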

FRidh commented Jun 21, 2016

@garbas I could rename it, but moving the repo is not up to me. By the way, I think we should keep the discussion in the main repo, that is, this one. That repo is just going to contain generated data, nothing more.

FRidh commented Jun 22, 2016

According to #16130 (comment) Hydra cannot build packages that use this because Hydra operates in restricted mode.

FRidh commented Jun 28, 2016

I've tested whether a Hydra would build packages that use buildPyPIPackage and it indeed doesn't. That is a pity.

FRidh commented Jul 3, 2016

By default it picks the latest version. Optionally a version is passed in. I suppose within Nixpkgs we would always be explicit about the versions.

The reason I mentioned this is that we have stable releases to consider.
PyPI also records and exposes the upload datetime of each archive. If we store that in our JSON, and store the datetime of the stable release in Nixpkgs, then we can use the two to determine the latest version available at release time. It will still be possible to change the version manually (e.g. for security updates).

By not specifying explicit versions the nixpkgs repo will always remain up to date.
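
The datetime-based selection described above could be sketched as follows. The shape of the metadata (a map from version to upload timestamp), the names `uploads` and `releaseDate`, and all values are assumptions for illustration; the PR does not define them.

```nix
# Sketch only: data layout, names and values are hypothetical.
let
  # version -> upload timestamp (ISO 8601, all in the same format and zone)
  uploads = {
    "1.0" = "2016-01-10T12:00:00";
    "1.1" = "2016-06-01T08:30:00";
    "2.0" = "2016-11-15T10:15:00";
  };

  # Cut-off that Nixpkgs would store for a stable release.
  releaseDate = "2016-09-30T00:00:00";

  # Timestamps in one fixed ISO 8601 format sort lexicographically,
  # so plain string comparison is enough here.
  eligible = builtins.filter
    (v: !(releaseDate < uploads.${v}))
    (builtins.attrNames uploads);

  # Keep the eligible version with the most recent upload time.
  newest = builtins.foldl'
    (best: v:
      if best == null || uploads.${best} < uploads.${v} then v else best)
    null
    eligible;
in
  newest
```

With the made-up data above, version "2.0" is uploaded after the cut-off, so the expression selects "1.1". Comparing upload times rather than version strings avoids having to parse version numbers in Nix.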

FRidh mentioned this pull request Jul 7, 2016
FRidh added the "2.status: merge conflict" label Sep 19, 2016
FRidh mentioned this pull request Oct 13, 2016
FRidh added 2 commits October 14, 2016 09:44
These two functions wrap respectively buildPythonPackage and buildPythonApplication, but use by default a repository with metadata retrieved from PyPI.
FRidh changed the base branch from master to staging October 14, 2016 07:45
FRidh removed the "2.status: merge conflict" label Oct 14, 2016

Mic92 commented Oct 15, 2016

If we do not fix the version and upgrade implicitly, would not a lot of applications break all the time?

FRidh commented Oct 15, 2016

In my experience with the Python packages we have in Nixpkgs, only a handful need an explicitly set version. Most others seem to work fine after (minor) upgrades. And if an update breaks some packages, I would consider that acceptable as well, given that this is Nixpkgs unstable. The amount of time that could be saved would be enormous.

, ... } @ attrs:

let
pypi-src = fetchgit {
Member

Mic92 commented Oct 27, 2016

Is fetchgit required here?
The good thing about your GitHub directory structure is that it is pretty readable.
We could automatically generate a compressed version of that directory structure for download via Travis CI,
perhaps by putting all packages with the same prefix into one file and compressing the whole thing with xz to reduce the overall size.

Member Author

Is fetchgit required here?

Since GitHub can produce archives for a specific revision, we could use fetchurl. But what we really want is fetchTarball along with a sha256, because without fetchTarball + sha256 Hydra cannot evaluate it. That's the reason why this is on hold. See also my latest mail.

Another option to reduce the size is to work with two repositories: one that has a JSON file for each package, and another that has package names along with hashes of those JSON files. Nixpkgs would then refer to the second repository. It would replace downloading one big file with (possibly many) tiny ones.
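
As a rough sketch of the fetchTarball approach mentioned above, assuming a Nix version whose builtins.fetchTarball accepts a `sha256` argument: the revision and hash below are placeholders, not real values, so this will not fetch successfully as written.

```nix
# Sketch only: `rev` and `sha256` are placeholders and must be replaced
# with a real commit and the matching hash of the unpacked archive.
let
  rev = "0000000000000000000000000000000000000000";

  pypi-src = builtins.fetchTarball {
    # GitHub serves a tarball for any revision at this URL pattern.
    url = "https://github.com/FRidh/srcs-pypi/archive/${rev}.tar.gz";
    sha256 = "0000000000000000000000000000000000000000000000000000";
  };
in
  pypi-src
```

Pinning both the revision and the hash makes the fetch a fixed, verifiable input, which is the property restricted evaluation needs.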

pypi = !(builtins.hasAttr "src" attrs || builtins.hasAttr "srcs" attrs);

# Use Meta data from PyPI
pypimeta = if pypi then data.meta else {};
Member

This is also pretty cool.

Mic92 commented Oct 27, 2016

Do you know what nixexpr actually contains? Is it just a snapshot of Nixpkgs? Would every NixOS installation eventually have to download your repo, or would the referenced PyPI information already be included in nixexpr?

FRidh commented Nov 17, 2016

@Mic92 nixexpr is indeed a snapshot. Every NixOS installation would have to download this repo in order to evaluate Python packages.

FRidh commented May 3, 2017

I think the way forward is an update script that updates pname, version, and sha256 in Nix expressions, so I am closing this.

FRidh closed this May 3, 2017