Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC]Overriding #330

Merged
merged 37 commits into from
Dec 30, 2021
Merged

[RFC]Overriding #330

merged 37 commits into from
Dec 30, 2021

Conversation

yannham
Copy link
Member

@yannham yannham commented Mar 19, 2021

This document contains a proposal for an overriding mechanism.

Related issues: #103, #240, #255, #279

notes/overriding.md Outdated Show resolved Hide resolved
notes/overriding.md Outdated Show resolved Hide resolved
notes/overriding.md Outdated Show resolved Hide resolved
notes/overriding.md Outdated Show resolved Hide resolved
notes/overriding.md Outdated Show resolved Hide resolved
Comment on lines 652 to 654
But what if `val1 & val2` is in a thunk that happens to be forced in between?
What should `let x = val1 & val2 in x & (val3 | merge func)` evaluate to? And
`let x = val1 & val2 in builtins.seq x (x & (val3 | merge func))`?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think forcing and “closing” a record should be too distinct operations.

The way I understand the rest of the semantics above, there's two kind of records: open and closed. Open records are the one available in the surface language and with all the shiny overriding semantics while closed records are an internal thing only.

If we represent closed records with {| |} (to distinguish them from open records { }), we have that

  1. { x = 1, y = x } == fun self => {| x= 1, y = self.y |} (the repr(r) operator above)
  2. {| something |} == let fix = { something } fix in fix (what we can call the close(r) operator)

To rephrase what you defined before, we then have that for an open record r, r.foo == close(r).foo.

All this to say that val1 & val2 is still an open record (whether you force it or not), so let x = val1 & val2 in builtins.seq x (x & (val3 | merge func)) is the same as val1 & val2 & (val3 | merge func)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not even thinking about records, but values in general. The problem is, morally, that using a custom merge function amounts to interpret the merge AST with a different semantics for &. If we want it to be non-local, as in only one term with an annotation in the merge expression is sufficient to change the merge function for all operands, you need to have access to the whole original AST. Forcing thunks may evaluate sub-expressions with the original merge semantics (and not the custom, because it is not in the scope of the sub-expression), and this changes the result.

Example:

let add = fun x y => x + y in
let val | merge add = 1 in
// Say we do "the right thing", so this is 3 as expected
1 & 1 & val

let x = 1 & 1 in
// This has no chance of being 3
// x is forced, without any custom merge function in sight.
// `1 & 1` is thus computed to be `1` (merge is idempotent)
// The thunk of x is updated with 1, while the original expression `1 & 1` is lost.
// so x is `1` and the final result is `2`
seq x (x & val)

Merging numbers directly is dubious, but you can imagine them being values for an option: { ..., someOption | Num | merge add, ...}.

The next paragraph notices that, though, records will need to be instrumented to store the original expression along the computed thunk (as a way of representing open records), and in that case we can make it work. Which is I think what you meant? But, in general, val1 and others are not necessarily records nor record fields.

yannham and others added 6 commits March 19, 2021 17:10
Co-authored-by: Théophane Hufschmitt <[email protected]>
Co-authored-by: Théophane Hufschmitt <[email protected]>
Co-authored-by: Théophane Hufschmitt <[email protected]>
Co-authored-by: Théophane Hufschmitt <[email protected]>
notes/overriding.md Outdated Show resolved Hide resolved
notes/overriding.md Outdated Show resolved Hide resolved
notes/overriding.md Outdated Show resolved Hide resolved
notes/overriding.md Outdated Show resolved Hide resolved
notes/overriding.md Outdated Show resolved Hide resolved
notes/overriding.md Outdated Show resolved Hide resolved
notes/overriding.md Outdated Show resolved Hide resolved
notes/overriding.md Show resolved Hide resolved
@yannham yannham added this to the first release milestone Apr 2, 2021
notes/overriding.md Outdated Show resolved Hide resolved
The basics of this overriding mechanism are close in spirit to what we want to
achieve in this proposal. This is similar to Nickel's current `default`
behavior, but with the desired late-bound merging. This is however not enough
for Nix use cases: we will need to add other features to the merge system.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What other features?

Cf what I said the other day about naming stuff. You can start by naming the features that you need, and then explain which features are met or not by different solutions. The text will flow much better. And it will help frame the conversation and the documentation in the future.

bar.baz = "stuff",
bar.blorg = false,
}
let defaulted | default rec = someNeutralConfig;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If default rec is a new keyword that you are introducing, please say so.

notes/overriding.md Show resolved Hide resolved
adapted set of priorities. For example:

1. `default` is lowest than everything else (bottom)
2. Integer priorities in the middle, with `0` being the default
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably want to pull out the remark “0 being the default`” in its own paragraph, because it deserves justification.


Custom merge functions would be specified by a `merge` metavalue attribute. In
order to enforce commutativity, they would receive their arguments in an
indistinguishable order, such as `{lower: Dyn, higher: Dyn}`, just knowing which
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume that the custom merge functions will probably need to know when they are getting two terms of equal priority.

order to enforce commutativity, they would receive their arguments in an
indistinguishable order, such as `{lower: Dyn, higher: Dyn}`, just knowing which
one has a higher priority than the other. If both have the same priority, the
order is not specified, and may even be randomized by the interpreter.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You definitely want to specify how merge functions will work precisely.

notes/overriding.md Show resolved Hide resolved
Copy link
Contributor

@Radvendii Radvendii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just my thoughts as an avid Nix user with almost no experience of Nickel. Let me know if I'm way off base and need to learn more about Nickel before commenting.


I don't understand after reading this why the NixOS module - style updating is the preferred option. It seems that the main advantage is commutativity, but it also introduces a good deal of complexity. As a user, I find the Nix module system quite mysterious compared to the semantics of lib.recursiveUpdate, //, or even overrlays, which are all non-commutative.

Modules always "just work" for me because the merge functions were designed very carefully to do the right thing in all of those cases. If something went wrong though, I think it would be fairly confusing to work out where or why. As a user, I'm not confident in my ability to consistently design those merge functions correctly.

As such, I think this is an incredibly powerful approach in something like modules, which has a very clear develop / use divide, but I'm not sold on it for general records.

It's not clear to me what the simple / basic use case is. The use of "override this record with some other values" (// or lib.recursiveUpdate) requires extra syntax (at least the force rec you introduce at the end), and isn't even guaranteed to work, if there has already been a forced value. On the other hand, setting up a generic way for attributes to be merged requires writing a function and adding a | merge func. This is a lot of effort for something farily basic. I'm left feeling like there is no simple / obvious way to change records.

infinite range of integers priorities in between. Doing so, it's easy to just
use `default` or `force` to override or be overridden by anything, without
knowing the precise integer priority. Integer priorities still gives freedom
with an infinite supply of levels if required.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds a problem of "what arbitrary integer priority do I assign this? The only way I can think to figure this out is look at everything else that sets that value, and figure out exactly what you want to override and be overridden by, and if people didn't leave enough space between their integers, you have to adjust their values. Maybe this is the best we can do, but it doesn't seem very ergonomic to me.

Maybe this has already been addressed in the NixOS module system? Just reading this document it's not clear to me how to decide on an integer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's currently the biggest complaint I have with the priority system. One solution is to only allow very few and natural priorities (normal, default and force e.g.), but it seems this would be too limiting for Nix. Maybe @edolstra , @infinisil or @regnat have more to say on the practicality of priorities.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we need only a few more, we could go with default < weak < normal < strong < force, but this seems like a fundamental problem of assigning these kinds of priorities. Unless there are very clear tiers of priority that naturally arise from the problem space, what you're really doing is putting an ordering on a bunch of things that are spread out across the code base.

Another example of this kind of thing is haskell's infix priorities. These work alright for a few different things ($), which is a "default" kind of thing, and PEMDAS operations, which have a clearly defined ordering which has to be drilled into your head as a child. I guess in general it works for multiplcation / addition type things where one operation distributes over another. In general though, if I have two random operations? I put in parentheses, because I don't have an intuition for what should be going on.

notes/overriding.md Show resolved Hide resolved
notes/overriding.md Show resolved Hide resolved
notes/overriding.md Outdated Show resolved Hide resolved
@yannham
Copy link
Member Author

yannham commented Jun 23, 2021

First, thanks for the review, @Radvendii. Familiarity with Nickel is not required: having an external point of view is valuable.

Just my thoughts as an avid Nix user with almost no experience of Nickel. Let me know if I'm way off base and need to learn more about Nickel before commenting.

I don't understand after reading this why the NixOS module - style updating is the preferred option. It seems that the main advantage is commutativity, but it also introduces a good deal of complexity. As a user, I find the Nix module system quite mysterious compared to the semantics of lib.recursiveUpdate, //, or even overrlays, which are all non-commutative.

The merge system is inspired in part from NixOS, but it is not exactly the same thing. It's also inspired from CUE. I would argue that the semantics of merging is more natural than // in Nix, that doesn't do the intuitive thing:

  • Usually you want the semantics of lib.recursiveUpdate for nested fields, but // doesn't do that
  • You would like recursive records to be updated in the expected way and not as in the port/protocol example here, but // doesn't do that

So you need three different notions (and actually there are even more beside overlays for overriding) defined in different places, working in different ways, to do the same thing in Nix currently. Overlays and co also don't work with basic records, as recalled in this document. Meanwhile:

  • Merging two records recursively is just r1 & r2
  • If you need to override something, you just have to write one additional annotation explicitly r1 & (r2 | force rec) for example. I think requiring an explicit marker for overriding is a good thing in general, not only from a technical/commutativity point of view.

Modules always "just work" for me because the merge functions were designed very carefully to do the right thing in all of those cases. If something went wrong though, I think it would be fairly confusing to work out where or why. As a user, I'm not confident in my ability to consistently design those merge functions correctly.
As such, I think this is an incredibly powerful approach in something like modules, which has a very clear develop / use divide, but I'm not sold on it for general records.

It's not clear to me what the simple / basic use case is. The use of "override this record with some other values" (// or lib.recursiveUpdate) requires extra syntax (at least the force rec you introduce at the end), and isn't even guaranteed to work, if there has already been a forced value. On the other hand, setting up a generic way for attributes to be merged requires writing a function and adding a | merge func. This is a lot of effort for something fairly basic. I'm left feeling like there is no simple / obvious way to change records.

I think the use case is modularity. We are talking about the NixOS module system a lot here, but the question is probably how to provide the language features to be able to write big and complex configurations in separate blocks that can then be combined, reused and possibly customized.

The general idea is inspired from CUE, where merging is ubiquitous, rather than it is an ad-hoc re-implementation of the NixOS system. It just turns out NixOS is a good inspiration for enhancements than CUE doesn't need nor want to provide, as it is intentionally very restricted in processing and computing power.

To answer more precisely to some of your points:

As such, I think this is an incredibly powerful approach in something like modules, which has a very clear develop / use divide, but I'm not sold on it for general records.

I'm genuinely curious (as just a basic user of Nix/NixOS), what makes NixOS modules different from, say, writing a big Kubernetes configuration in the same modular style with respect to this develop/use ratio?

and isn't even guaranteed to work, if there has already been a forced value.

True. Note that in CUE things can only be overridden once, and it is by design, as multiple levels of overriding is seen as the root of all evil (overriding being in some sense similar to inheritance). I've been told Nix requires more 'monkey-patching' capabilities. This has been discussed in #240, with a proposal to be able to access the priority of values to implement an equivalent of //.

On the other hand, setting up a generic way for attributes to be merged requires writing a function and adding a | merge func. This is a lot of effort for something farily basic.

I think calling custom merge functions basic is a bit unfair. As proposed here, merging is akin to building a free monoid with a default implementation for records (and a trivial one for other values). Custom merge functions would allow to re-interpret some part of the AST with a user-defined function, as a form of overloading. To my knowledge, neither Nix, CUE, Jsonnet nor Dhall supports something like this out of the box.

I'm left feeling like there is no simple / obvious way to change records.

I don't think it is the case. Jsonnet, for example, features a fairly simple mechanism for overriding: it is basically // that does the right thing for recursive records. The question is how to handle the case where several values are defined for the same field. Jsonnet encodes this directly in the order in which records are extended: the last (right-most) value wins. The NixOS module system and CUE's merge system, on the other hand, rely on a priority system (with only a bottom and a top for CUE, so field can be overridden only once). Those are the different tradeoffs that need to be discussed here, keeping in mind the use-case of Nickel as, on one side, a "general-purpose" configuration language, but also as a possible future for Nix.

@github-actions github-actions bot temporarily deployed to pull request August 4, 2021 07:55 Inactive
@github-actions github-actions bot temporarily deployed to pull request August 4, 2021 08:15 Inactive
Co-authored-by: Arnaud Spiwack <[email protected]>
@github-actions github-actions bot temporarily deployed to pull request August 4, 2021 09:08 Inactive
@Profpatsch
Copy link

Imo the biggest problem with the nix module system is that it’s anti-modular.

That is, in order to find the actual value of an option, you have to first strictly evaluate all modules to the point where you see whether it is overwritten. This makes for a real performance problem in modern nixpkgs, where evaluating a nixos system configuration can take dozens of seconds, even with just a small number of options set (although some of it is excarbated by option definitions being read strictly afaik).

I can’t remember much of this proposal, but in order to scale to very big e.g. kubernetes configs, that has to be a consideration.

@roberth
Copy link

roberth commented Aug 17, 2021

you have to first strictly evaluate all modules

Not strictly evaluate but you do have to parse them and evaluate a small amount to find that its contribution to config is e.g. { config = mkIf false <bottom>; }.

The problem is that NixOS still imports everything, relying on *.enable instead of an accurate imports closure. This softened module system requirements by removing failure modes relating to ill-specified imports. If we checked imports hermeticity, it seems we could cover most of those. Some modules will have to be split to allow their interface to be used (to no effect) without their implementation.

In this proposal for Nickel I don't see a way to include an externally defined interface. It would be possible with the "dynamic scoping" option which wasn't chosen. The chosen "lexical scoping" seems to require that all field references are declared locally, reminding me of structural types instead of the nominal "I want to interact with the network configuration". Perhaps "network" could be an alias for or reference to the structural type. Is that possible? Is it possible to write a Nickel library that resolves imports?

@yannham
Copy link
Member Author

yannham commented Aug 24, 2021

In this proposal for Nickel I don't see a way to include an externally defined interface. It would be possible with the "dynamic scoping" option which wasn't chosen. The chosen "lexical scoping" seems to require that all field references are declared locally, reminding me of structural types instead of the nominal "I want to interact with the network configuration". Perhaps "network" could be an alias for or reference to the structural type. Is that possible?

Indeed. You can always give names/aliases to structural types, as you suggest (currently general type aliases are not supported, but it wouldn't be hard to do, and we are mostly talking about contracts here which can already be given names using let-binding).

Concerning the inclusion of a dynamic interface, one possibility is to store the external information inside a field that can be guarded by a contract that is imported dynamically, as in:

let moduleFoo = import "foo.ncl" in

{
  baz = if foo.bar then "enabled" else "disabled",
  foo | moduleFoo.Interface,
}

But indeed, you can't do that at the top-level of the record in the current form of the proposal, by design. We can imagine adding a way to do that, such as:

{
  baz = if bar then "enabled" else "disabled",
  ..moduleFoo.FooInterface,
}

This adds back dynamic scoping though, because moduleFoo.FooInterface is not necessarily an evaluated expression, so the interpreter can't always know in advance which fields are defined or not. This construct could require moduleFoo.FooInterface to be an evaluated record, but I expect that some contracts will be build using merging (like in let Extension = Contract1 & Contract2), so this would be limiting.

We could also just use the .. without further ceremony to allow dynamic scoping explicitly:

// Error: bar is undefined
{
  foo = bar
} 

// Ok
{
  foo = bar,
  ..
}

Is it possible to write a Nickel library that resolves imports?

Sorry, I'm not sure to totally understand the question.

Copy link
Contributor

@goldfirere goldfirere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just catching up here, and have run out of time to read the meatiest part of the proposal, at the end.

I do have one key question, though: why is commutativity desired? I don't think of record-update as a commutative operation, and I'm not sure why it should be. If we lose the desire for commutativity, we have new design space to work with.

notes/overriding.md Show resolved Hide resolved
notes/overriding.md Outdated Show resolved Hide resolved
notes/overriding.md Outdated Show resolved Hide resolved
@github-actions github-actions bot temporarily deployed to pull request September 1, 2021 16:42 Inactive
yannham and others added 2 commits September 2, 2021 17:26
Co-authored-by: Richard Eisenberg <[email protected]>
Co-authored-by: Richard Eisenberg <[email protected]>
@github-actions github-actions bot temporarily deployed to pull request September 2, 2021 15:30 Inactive
This was referenced Sep 3, 2021
@yannham yannham mentioned this pull request Dec 11, 2021
@mboes
Copy link
Member

mboes commented Dec 29, 2021

#491 seems to assume this RFC is the way forward. Any reason we're keeping it from hitting master just yet?

@yannham yannham mentioned this pull request Dec 29, 2021
@blaggacao
Copy link

blaggacao commented Dec 30, 2021

Out of the perceived pain with NixOS Module system commutativity, I had written a modified recursive merge that mostly does what I want / expect.

As others have noted above, I would resume that pain in that: "local reasoning about a NixOS configuration is impossible." - Add undisciplined operators / daily hurry and over time, your're signed up for the perfect storm, an unmaintainable "magic" configuration, that you cannot refactor any more at reasonable cost - everyone basically just mkOverrides one patch at a time into the future.

CUE only partially mitigates this by limiting to the priority of one, but it's still commutative, hence jumping back and forth between hops or, as others noted: just evaluate & see, trial & error - a bad strategy for a config language that should ideally build a mental map of the state of the world while (progressively) reading the source code (top to bottom, left to right).

My take can be scrutinized at https://github.com/divnix/data-merge. To a regular recursive merge, it mainly adds merge function annotations to deal with merges of non-associative arrays (lists).

It also is monotone, a great, human-friendly simplification, which means that the RHS can never destructively modify the "data spine" of the LHS, where the data spine consists of the type structure + (associative or non-associative) array items. It means that the RHS cannot:

  • change a value type (similar to the effects of how CUE types are values in the same lattice / "typed by example")
  • delete an attribute set (associative array) item
  • delete a list (non-associative array) item

When reading such a configuration, and it's variants/inheritances/overlays/overrides, things can only "progress" monotonely with a felt "+" operator (on the data spine).

It also forces you to take extra care and design your bases for overlays in a monotone-compatible way.

I was one of those who got a bit sad when kustomize started to allow the unlink operation in jsonpatches: monotonicity of the data spine was broken.

@yannham
Copy link
Member Author

yannham commented Dec 30, 2021

@blaggacao thanks for your input, we'll look into it. We already had a fair amount of priority-based vs extension/override-based discussions, but it's still a valuable data point.

In the meantime, I agree with @mboes, since this PR has been open for way too long. It has received a fair number of review passes, and it's not like we have any actual RFC process yet that would make merging it irreversible or fundamentally hand-tying. I think we can move new issues, requests and discussions to new issues.

@yannham yannham merged commit cfb2821 into master Dec 30, 2021
@yannham yannham deleted the doc/overriding-proposal branch December 30, 2021 18:08
@github-actions github-actions bot temporarily deployed to pull request December 30, 2021 18:13 Inactive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants