Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

texlive: transform into overridable scope and extract generic interface #250626

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

apfelkuchen6
Copy link
Contributor

Description of changes

This includes all the preparation neccessary to add texlive-2023 ( #236382 ) and builds upon #250621 and #248746.

Unfortunately I wasn't really able to divide this into smaller independent chunks that can be merged separately, but I think it should be possible to comprehend the changes when looking at them commit-by-commit.

This should also not cause any rebuilds.

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandbox = true set in nix.conf? (See Nix manual)
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 23.11 Release Notes (or backporting 23.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md

@github-actions github-actions bot added the 6.topic: TeX Issues regarding texlive and TeX in general label Aug 21, 2023
@ofborg ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 0 This PR does not cause any packages to rebuild on Linux labels Aug 21, 2023
@xworld21
Copy link
Contributor

Before jumping into the details, the treewide bit is problematic – better to mark this PR as draft for the time being? I'd rather do a single treewide switch to buildEnv before 23.11, basically do only one user visible transition, unless we discover it's not materially feasible. In the meantime I'll do my best to understand what's going on with these splices, see how it plays with the multi-output changes.

@apfelkuchen6
Copy link
Contributor Author

apfelkuchen6 commented Aug 22, 2023

I fully agree that we should only do one treewide change once all the changes are in place. This treewide change should be done in a separate PR some time in the future.

The "treewide" commit in this PR just changes the things with mutual texlive dependencies (asymptote, biber, biber-ms) (I've added these to make biber = biber.override { texlive = self; } work, self doesn't have the compat-fixups applied yet). I can drop the changes to these programs and remove the overrides for now. As long as there is only one texlive, there should not be any problems.
Edit: done

@apfelkuchen6 apfelkuchen6 force-pushed the texlive-extract-generic branch 3 times, most recently from 95e7111 to 4833e38 Compare August 22, 2023 09:16
Comment on lines 162 to 147
applyOverScope = f: scope: f (scope // {
overrideScope = g: applyOverScope f (scope.overrideScope g);
});

# for backward compability
compatFixups = scope:
# TODO
scope.pkgs // # remove this line to fix cross
scope // {
bin = scope.bin // {
latexindent = lib.findFirst (p: p.tlType == "bin") scope.pkgs.latexindent.pkgs;
};
};

in applyOverScope compatFixups texlive
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I understand what is going on, I managed to verify that

lib.last pkgsCross.aarch64-multiplatform.__splicedPackages.texlive.pkgs.a2ping.pkgs

does not have its own splices. I assume that the binary containers need to be spliced as well, don't they? So we really need the multi-output to make this work properly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, ideally, the bin containers should also be spliced. This isn't a huge deal yet as there isn't anything cross-compilable that uses the bin containers yet and I doubt users will appear while the interface is this terrible and they aren't really exposed.

But this is probably™ something that fixes itself once the better format for the containers lands.

@apfelkuchen6 apfelkuchen6 marked this pull request as draft August 22, 2023 14:07
@apfelkuchen6 apfelkuchen6 mentioned this pull request Aug 23, 2023
12 tasks
@xworld21 xworld21 mentioned this pull request Aug 23, 2023
12 tasks
this is purely a whitespace change that was separated from the previous commit
to make the diff easier to read
This generic interface allows the addition of texlive-2023 without repeating too
much code or case distinctions

TODO: rename default.nix -> generic.nix, stable.nix -> default.nix
This will be done later as this makes rebasing a lot harder
@apfelkuchen6 apfelkuchen6 force-pushed the texlive-extract-generic branch from 4833e38 to d3ffe62 Compare August 28, 2023 16:16
, recurseIntoAttrs
, fetchurl, runCommand
, ghostscript_headless, harfbuzz
, tlpdb, tlpdbxzHash, src, version, mirrors, useFixedHashes ? true, fixedHashes ? {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is crucial. It took me a while to understand that any flat TeX Live package scope cannot contain its own tlpdb, because the operator // is not lazy enough to allow for that kind of recursion. Hence texlive must be a function of TLPDB, or better of the TeX Live repository: the root of the repo (mirrors), the hash of texlive.tlpdb.xz (tlpdbxzHash), the nixified tlpdb (tlpdb), optionally the source of the binaries (src), optionally fixed hashes, and some metadata e.g. version but also say the snapshot date.

What's missing here is support for other repos, which we should support eventually, for tlcontrib. So we need a suitable generic interface. Best I can suggest, texlive should take an ordered list of respositories which are merged in a single package scope, and each repo is an attribute set containing the above tlpdb, hash, src, mirrors, and e.g. extraVersion, name (not sure about fixedHashes yet).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really like the interface you suggest, but we definitely should think about how to handle tlcontrib now.

Just taking a repos-list parameter makes overriding incredibly clumsy, and I don't think we colud possibly need the full power of that anyway -- There are no other third party repositories. The binary sources should be kept separate as they don't really belong to one of the repositories.

How about separate src and binversion attributes and an attrset repos, where each attribute is itself an attribute with tlpdb and mirrors and optional version, hash and fixedHashes attributes? One small downside is that it's not as obvious in which order the nixified tlpdbs are folded in case of conflicts. Alternatively, I'd also like naming the attribute extraRepos and keeping the attributes for texlive at the toplevel to keep overriding easy.

We could combine the version numbers into something like 2023+texlive-20230828+tlcontrib-20230828, which sounds incredibly awkward, but is accurate!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, repo list is not a great idea. Let's remember that texlive must have a base repo which is an actual TeX Live, binaries included, and the binaries are coupled with the repo (run, doc, tlpkg must match - so one can override the repo part, but not by that much).

Given that, we can only stack repos (tlcontrib) on top of the base with no new binaries. Hence the starting point should be your code, and a passthru function withRepos? With an implicit override behaviour, from left to right, each new repo can overwrite a previous package. And there could be an attribute extraRepos too from the start.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Further thought: tlpdbxz can be a single fetchurl, making tlpdbxzHash and mirrors redundant. And we could put binversion in the passthru of the source fetchurl, saving another argument.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Further thought: tlpdbxz can be a single fetchurl, making tlpdbxzHash and mirrors redundant. And we could put binversion in the passthru of the source fetchurl, saving another argument.)

I think extracting the mirror from the tlpdbxz derivation is somehow unexpected (and explodes if you override it with a local file), so I'd rather pass the mirror in separately even though extracting the url from the derivation should usually do the right thing and save an argument. Minimizing the number of arguments shouldn't isn't really a goal, having a simple and intuitive interface is. Piggy-backing optional stuff such as version numbers onto the derivations is fine though.

Hence the starting point should be your code, and a passthru function withRepos? With an implicit override behaviour, from left to right, each new repo can overwrite a previous package. And there could be an attribute extraRepos too from the start.

Do I understand correctly that you want the interface to be something like

texlive.combined.scheme-minimal.withRepos ( { ... } ).withPackages ( p: [ ... ] )

?
This sounds nice for the user, but a little annoying to implement: if we want withRepos to essentially texlive.override { extraRepos = ... }, it needs “outside access“ to texlive as the override is not available inside the scope. This means that it can be only added as a kind of wrapper and we need to make sure that every externally accessible way to produce a texlive environment adds this modifier to ensure a consistent user interface. At the same time, internal code cannot use these modifiers as they are not applied yet. It should be possible, but it probably involves a lot of spaghetti wrapper code, especially if combined with the other modifiers.

Just allowing to use .override before combining would be a lot simpler.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

previous 2023-specific overrides (so we probably don't want to just copy python, see #44426)

Re packageOverrides, I'd rather omit it altogether exactly because of that problem. withPackages is more obviously more composable (incidentally, it does not compose for Python). Presumably we'd have withConfig (not the final name) which does the same for other overrides, e.g. extraOutputsToInstall gets concatenated, paperSize goes through //, etc.

So far we seem safe: texlive.override changes the inputs, same as rest of Nixpkgs; texlive.overrideAttrs changes the binaries and/or tlpdbxz, is composable by construction via prev, inherited by xindy et al. if we play it right; texlive.pkgs.overrideScope (possibly lifted/moved to texlive.overrideScope) also inherently composable. Then auxiliary methods withPackages, withConfig, with composability built in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difficult part is making sure that texlive.overrideScope (self: super: { ... }).overrideAttrs {...} does not lose the package overrides. I don't think the naive approach "just works", but it can probably made to work by entangling the different override functions in an interesting manner (similar to applyOverScope here).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the naive approach "just works"

Uhm, I think the only possible implementation is

passthru.overrideScope = f: finalAttrs.finalPackage.overrideAttrs (prev: {
  passthru = prev.passthru // {
    pkgs = prev.passthru.pkgs.overrideScope f;
  };
};

and that should compose as correctly as overrideAttrs itself. I think. If it's not infinite recursion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, there's another piece of design I need to complete buildEnv. We are converging on:

  • .override to change the inputs (e.g. ghostscript = ghostscriptX, or enableStatic = true); standard callPackage but possibly extra wrappers to allow texlive.combine.scheme-small.override
  • .overrideAttrs to modify the core derivation, passthru.tlpdbxz and similar; based on mkDerivation (finalAttrs: _)
  • .overrideScope to modify texlive.pkgs while respecting interdependencies; carefully lifted to texlive for convenience
  • .withPackages to create/add packages to a combined environment

Presumably all of the above methods can be called anywhere, in any order (even .withPackages (_)).override?).

But now we have the config arguments of buildEnv: extraOutputsToInstall, paperSize. Some options:

  1. .override: pass them as global argument (no easy composition)
  2. .overrideAttrs: set them in passthru (ugly but composable)
  3. local .override: just replace the top .override – this is the Python solution buildEnv.override (same composability problems as 1.)
  4. .overrideConfig / .withConfig: one replaces the config, one adds config, e.g. more outputs; not exposed at top level, just in combined schemes, for clarity (good composability via .overrideConfig (prev: ...), of which .withConfig is just a special case)
  5. .overrideEnv: like .overrideConfig but emphasis on what receives the override (but no .withEnv because it is unreadable)

Any other approaches? Now that I listed them, I prefer 4. over the others, but maybe only because it's more fleshed out.

We might still use 2. internally, so that it composes correctly for free, and write .overrideConfig the same way we lift .overrideScope.

(Side note: we have to split core anyway because we can't quite split the libraries, and we need a small dev output for Kpathsea and synctex. Otherwise we also get the LuaTeX library, which is a few MBs.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick update on a very long conversation: I have a proof-of-concept implementation of texlive as a big mkDerivation (finalAttrs: ...), with bin.core, pkgs, tlpdb etc. as passthru attributes. It actually overrides pretty well, enough that changing the TeX Live year becomes as easy as changing a hash when allowing recursive nix. More on this in one or two weeks.

@wegank wegank added 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md 2.status: merge conflict This PR has merge conflicts with the target branch labels Mar 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2.status: merge conflict This PR has merge conflicts with the target branch 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md 6.topic: TeX Issues regarding texlive and TeX in general 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 0 This PR does not cause any packages to rebuild on Linux
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants