cue: use dots as a separator for folding instead of spaces? #60
Comments
@cup I don't understand why you think this language feature isn't "worth it". Can you elaborate on the cost, so I can better understand why you feel it is too costly? As an aside, CUE isn't a YAML extension, it is a superset of JSON. As such, I'm not fully understanding your YAML example. |
@jlongtine languages dont exist in a vacuum. I would posit that if someone naive
they would expect that it represents this structure: {'outer middle inner': 3} in which case CUE folding is a clear violation of POLA: https://wikipedia.org/wiki/Principle_of_least_astonishment the counter to this would be, having it represent this instead: {
"outer": {
"middle": {
"inner": 3
}
}
} gives you the benefit of short syntax and/or time saved typing. YAML offers a outer:
middle:
inner: 3 with one key difference: comprehension and readability are not sacrificed for https://github.com/cup/umber/blob/master/radio/flow-M.yaml but I would never use CUE folding because of the ambiguity. I am all for shorter, |
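To make the two competing readings concrete, here is a minimal Python sketch (not CUE, and not how either parser is implemented). The helper names `as_single_key` and `as_folded_path` are hypothetical, introduced only for illustration; the label and value are the ones discussed above.

```python
import json


def as_single_key(label: str, value):
    """YAML-style reading: the whole label, spaces included, is one key."""
    return {label: value}


def as_folded_path(label: str, value):
    """Folding-style reading: each space-separated name is one nesting level."""
    result = value
    # Build from the innermost name outward.
    for name in reversed(label.split()):
        result = {name: result}
    return result


flat = as_single_key("outer middle inner", 3)
nested = as_folded_path("outer middle inner", 3)

print(json.dumps(flat))    # {"outer middle inner": 3}
print(json.dumps(nested))  # {"outer": {"middle": {"inner": 3}}}
```

The same text yields two different JSON documents depending on which reading the tool applies, which is the ambiguity being debated here.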
This seems like more of a personal issue with the syntax, rather than an actual "issue" with cue. I'm not sure, but I'd be willing to bet that the designers of cue recognized that this is a slightly unorthodox feature but had their reasons to explicitly include it. I agree that it's slightly confusing at first glance, but it isn't exactly a hard concept to grasp. There's a reason the documentation of folding is so short, there's hardly anything to explain. On top of that, dropping this syntax would be a non-backwards-compatible change and would probably require many people (myself included) to update their cue files. This issue seems like too much of a hassle to be worth it 😉 |
@harrygallagher4 thanks for the response, but you seem to be trying to hand
you say "im sure they had a good reason" is not a reason. if they have a good
the point id argue a majority of people stumbling across this code arent going
respectfully I am not concerned with whether you have to rewrite files if this https://github.com/cuelang/cue/releases which means plenty of time and room is available for big removals or additions |
I believe the opposite, that this YAML behavior is the most surprising. There are not many languages that allow unquoted identifiers to include whitespace, so it's natural to parse this as three separate identifiers interacting somehow (and then it just takes one small bit of documentation to resolve what that interaction is). Additionally, CUE's behavior has precedent in other major configuration languages like Nix. Nix does use dots instead of spaces in its version of this syntax, to be fair, but there's even more precedent for languages using spaces where other languages would use dots (Smalltalk, Obj-C, and many others). For me, it would reduce the usability of the language to remove this feature from CUE. |
@cup Alright, I'll do my best to address your concerns in this long-winded reply, excuse the digressions here because we seem to be getting into territory of the philosophy of "good features", when something is "worth it", the relationship of different languages, cynicism, and a lot of other things.
Okay, I'll accept that saying that isn't a reason in itself, but here's why I personally find this useful: You can easily set a deeply nested property: this is obviously useful. It's not even some sort of non-standard "out there" feature; most languages have some syntax for doing this, commonly with dots, and using spaces instead isn't much different. In the case of cue, a small difference is that if any of these mid-level structs don't exist they are created, which again is useful in cue's overall goal of reducing boilerplate. In addition, folding doesn't interrupt the indentation flow in order to set just one property. When you have a lot of levels of nesting it can become difficult to navigate your eyes/brain back and forth between all of the
@solson covered this nicely. I'd like to add that avoiding features that people won't understand properly without reading about them is a bad mindset. If cue was created to add syntax sugar to YAML, sure, changing the behavior here would probably be a pretty bad idea, but cue is its own can of worms. If we never implemented new syntax/features just because other languages were different then nobody would ever push the status quo enough to be innovative. I'm not saying that this super simplistic syntax is some huge game-changer, just that if everybody adopted this reasoning we would be stuck with boring, homogenized languages.
Agreed. Perhaps the developers will choose to remove this before 1.0 or change the syntax to use dots or something more approachable. However, I believe I've outlined some reasons why this is useful, and why I believe that dropping something because it's not approachable is a bad idea.
My response (and @solson's as it seems-- sorry to drag you into this) to this is yes. Yours seems to be no. It just comes down to a difference in opinion here. As I'm not a maintainer of cue, I'll leave it up to them and stop flooding the replies to this issue with walls of text, I just personally disagree with the reasoning of "this is slightly confusing, and does something different in YAML, so you should get rid of it" tl;dr: hey bro! i don't wanna rewrite any scripts! |
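The "set a deeply nested property, creating missing mid-level structs" behaviour mentioned above can be sketched in Python as follows. This is only an analogy: `set_path` is a hypothetical helper, the field names are made up, and the sketch mutates a dict, whereas CUE unifies values rather than assigning them.

```python
def set_path(config: dict, path: list, value) -> dict:
    """Set a leaf at `path`, creating intermediate dicts that do not exist yet."""
    node = config
    for name in path[:-1]:
        # setdefault returns the existing child, or inserts and returns {}.
        node = node.setdefault(name, {})
    node[path[-1]] = value
    return config


# Hypothetical configuration; "tls" and "cert" do not exist beforehand.
config = {"server": {"port": 8080}}
set_path(config, ["server", "tls", "cert", "path"], "/etc/cert.pem")

print(config)
```

Existing siblings such as `port` are left alone, and only the one leaf is added, without opening three levels of braces.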
@cup: thanks for the suggestion. As you mentioned, the time to make suggestions is now. Thanks everyone else as well for contributing to the conversation.
In CUE, whitespace has no semantic meaning. Spaces are only relevant within strings, which are always quoted. The only notable exception is the comma separator, which may be elided at the end of a line (a trick borrowed from Go's semicolon elision). As such, it is a bit surprising that The "folding" feature has three main uses, though, and is quite crucial. I'll elaborate on that a bit and will explain what could possibly be done differently.
The first reason is the most obvious and most banal one: reducing curly-brace clutter. YAML probably at least partly owes its success to being able to reduce the large number of curly braces as often found in JSON. As whitespace has no meaning in CUE, it cannot take YAML's approach. (This was a deliberate choice. One reason was that CUE is meant to be generated and rewritten by machines as well, something where indentation-sensitive languages don't fare well.) The "folding" approach is quite effective at reducing curly braces as well, just taking a very different approach.
Secondly, this idea is actually not new. BCL (Borg Configuration Language), now roughly 15 years old, has a concept of objects, of which a 'job' is one example. Job specifications can often be long and a service may define many jobs. Before long, one loses sight of what fields relate to. So instead of writing:
one writes
This visually makes it easier to place definitions in context and reduces a level of indentation. BCL has its flaws, but it being the third-largest language at Google, we can say by now that this aspect of it works well. Further evidence of this is that this style is also adopted in other languages, like Hashicorp's HCL, which generalizes on this concept. In fact, CUE's interpretation is inspired by HCL's take on this concept. Thirdly, many people are not very familiar with constraint-based languages or typed-feature structures, the foundations of CUE. One of the easiest ways to explain it is first to realize that any JSON configuration can be represented as a collection of paths in a tree associated with some leaf value. Then CUE is a generalization of this by allowing an expression on the left of the That said, one could argue the name separator should have been a Anyway, I appreciate the tendency to want to remove features. I deliberately did not add conditional expressions, have removed functions early on, and am thinking of removing the |
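The observation that any JSON configuration can be viewed as a collection of paths in a tree, each associated with a leaf value, can be illustrated with a short Python sketch. The function name `to_paths` and the sample document are assumptions for illustration; this says nothing about how CUE represents values internally.

```python
def to_paths(value, prefix=()):
    """Yield (path, leaf) pairs for every leaf in a nested dict."""
    if isinstance(value, dict) and value:
        for key, child in value.items():
            yield from to_paths(child, prefix + (key,))
    else:
        # Anything that is not a non-empty dict is treated as a leaf.
        yield prefix, value


# Hypothetical sample document.
doc = {"outer": {"middle": {"inner": 3}, "other": "x"}}
paths = list(to_paths(doc))

for path, leaf in paths:
    print(" ".join(path), "->", leaf)
# outer middle inner -> 3
# outer other -> x
```

Read this way, a folded field is just one path written on one line, with its leaf value on the right.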
@mpvl I am confused by this, as it seems the exact opposite of what you said is
obviously it would represent this structure:
going along with your argument, you can add certain whitespace with no semantic
still represents:
but the confusing part is, the original example I gave:
directly contradicts what you said, as pretty much every whitespace has |
Mea culpa, I meant syntactically significant whitespace and, more specifically, the amount and type of a particular whitespace in a sequence is insignificant: it doesn't matter whether there is one whitespace or more and it doesn't matter which kind of whitespace. In other words, spaces are only consumed as is as part of quoted literal strings, but the type and number of spaces between tokens never have any meaning. This is in contrast to languages like YAML and Python, for instance, where indentation influences nesting. YAML in particular has a very complex lexer which determines when spaces are part of a literal token, instead of a token separator, but YAML is somewhat unique in this regard.
The whitespace handling of CUE is similar to many other languages, btw (C, Go, Java, Swift, JSON, ...). YAML will be full of surprises (POLA violations, if you will) to people only familiar with those. For example, converting YAML to JSON, Anyway, CUE clearly falls into the C/Go/Java/JSON/HCL camp when it comes to how whitespace is interpreted. The curly braces are a reminder of that. In that regard there should be no surprise that Is there a notation for folding, though, you would find less confusing? |
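A rough Python sketch of the point about token separators: in a C-style lexer, the amount and kind of whitespace between tokens does not change the token stream, while spaces inside a quoted string are kept. The `TOKEN` pattern below is a deliberately tiny assumption, not CUE's real lexical grammar, and it ignores newlines, which do matter in CUE because of comma elision.

```python
import re

# Simplified token pattern (assumption): quoted strings, identifiers,
# integers, and a few punctuation characters.
TOKEN = re.compile(r'"[^"]*"|[A-Za-z_][A-Za-z0-9_]*|\d+|[{}:,]')


def tokens(source: str) -> list:
    """Split source into tokens, discarding whitespace between them."""
    return TOKEN.findall(source)


a = tokens('outer middle inner: 3')
b = tokens('outer \t  middle     inner :3')
c = tokens('"outer middle inner": 3')

print(a)  # ['outer', 'middle', 'inner', ':', '3']
print(b)  # ['outer', 'middle', 'inner', ':', '3']
print(c)  # ['"outer middle inner"', ':', '3']
```

Under this model, three bare words are three identifiers no matter how they are spaced, and only quoting turns them into a single string.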
@mpvl I think you just made my case even stronger. Going off your latest
with YAML you get:
Consistent behavior. In both cases, the spaces are considered part of a string
So you can see how the inconsistency with CUE might be a source of confusion and |
Your case was to drop folding as What this example does show, however, is that But CUE adopted the current notation as it’s common with its ancestor languages. With some proposed extensions, the space notation will also look more familiar to TypeScript users, among others. Spaces look a bit cleaner than dots to me, and they are syntactically not necessary on the LHS as they are on the RHS. It also signifies that a path on the LHS is not entirely the same thing as a selector reference on the RHS (just as an LHS string is not entirely the same thing as an RHS string in YAML). Finally, to stick with your original objection that it is common for people coming from YAML to interpret the notation as a string: flipping this around and viewing it from the perspective of someone with a C-heritage background
looks less obvious to me than
even though having a So even though dots would be more consistent, they are not obviously better, to me at least. But it may be my historical bias and perhaps dots are indeed clearer. |
@mpvl using dots instead of spaces would be an acceptable compromise. I would say its much clearer, but it seems we just differ on that point. I dont really have further to add, it seems again the points have been made and thanks |
@cup, thanks for your input. I’ll leave this issue open to gather feedback from others regarding dots vs spaces or perhaps other alternatives. Thanks! |
I thank @cup again for raising the issue and the link to HackerNews and others again for their input. I'll keep this issue open for a bit to solicit feedback on this issue. I'll change the title to focus more on the notation issue, as folding is too important for CUE to give up. I'll focus on other features for now, knowing this change can be made in a backwards-compatible way, and allowing us to take into account some of the proposed extensions. |
To throw one more data point in the discussion, there is an auxiliary advantage of using dots over spaces:
can be split across lines like
Because of the Go-style comma elision CUE uses, this is not possible with space-separators. |
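Why comma elision blocks multi-line paths with spaces but not with dots can be sketched in Python. The rule below is a simplified assumption (a comma is inserted at end of line when the last token is an identifier, number, string, or closing bracket); the CUE spec has the authoritative rule. The sketch also assumes a dotted path is broken after the dot.

```python
import re

# Simplified lexer and comma-insertion rule (assumptions, not the CUE spec).
TOKEN = re.compile(r'"[^"]*"|[A-Za-z_][A-Za-z0-9_]*|\d+|[{}\[\]():,.]')
ENDS_STATEMENT = re.compile(r'^("[^"]*"|[A-Za-z_][A-Za-z0-9_]*|\d+|[)\]}])$')


def insert_commas(source: str) -> list:
    """Tokenize line by line, adding ',' after lines ending in a 'complete' token."""
    out = []
    for line in source.splitlines():
        line_tokens = TOKEN.findall(line)
        out.extend(line_tokens)
        if line_tokens and ENDS_STATEMENT.match(line_tokens[-1]):
            out.append(",")
    return out


spaces = insert_commas("outer\nmiddle\ninner: 3")
dots = insert_commas("outer.\nmiddle.\ninner: 3")

print(spaces)  # ['outer', ',', 'middle', ',', 'inner', ':', '3', ',']
print(dots)    # ['outer', '.', 'middle', '.', 'inner', ':', '3', ',']
```

With spaces, each line ends in an identifier, so commas split the path into separate pieces. With a trailing dot, the line ends in a token that cannot close a declaration, so no comma is inserted and the path continues onto the next line.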
@mpvl Would this still allow for using constraints in a folding path? I know you've mentioned being able to do things like |
Yes, that is still the plan. This syntax is needed to allow for associative lists, which are the only thing missing to allow CUE to operate on a K8s API fully natively. If anything, using dots would make parsing easier. Or more precisely, using dots would make it easier to generate nice error messages. Initially there would be a transition period allowing both and then having The proposal is here btw: https://cue-review.googlesource.com/c/cue/+/2280. Feedback is welcome. There are a few small improvements to be done there, I believe. |
Here is another alternative to consider. Instead of using dots, one could use colons:
This would not be indentation-sensitive, like YAML. Grammar-wise, declarations would be expressions (to some extent). A field without curly braces is then a shorthand for a single-field struct. The thing against it is that it may be even less conventional and now we have something that looks a bit more like YAML, but yet means something very different, arguably being more confusing than the current situation. Consider, for instance,
which would mean
A thing in favor of it, is that it allows indicating the type of field on the path, allowing a mix of definitions and regular fields:
This is not possible with either the space or dot notation, where one would have to write
Another thing in favor, is that it syntactically differentiates the meaning of an LHS vs RHS, while still resulting in a more streamlined syntax (at least so it seems, must investigate). It has some funky implications, though, like what does |
Sorry, I dont know enough about the syntax to comment intelligently. I would say that if Cue does have some indentation sensitive constructs, then this should not be implemented:
as users would rightfully expect that the indentation means something. if Cue has no indentation constructs, then it would be fair to implement this, as the space doesnt mean anything for Cue in the general case. It might still be confusing as YAML is space sensitive - but thats a weaker argument - similar to comparing Python to Go. In both cases, the latter can use spaces but not required and doesnt have semantic meaning. |
To offer a different perspective, when I first came to CUE, I didn't experience any friction using spaces, and I come from years and years of working in the Javascript/JSON/web space. @cup's experience is what it is, but I think this is less of a "principle of least surprise" issue and more of an issue with breaking old habits and giving CUE the space to be CUE. We're talking about an entirely distinct language and I think CUE has the right to define its own boundaries and context when it comes to "CUE principles of least surprise". If given a vote, I'd keep things the way they are. Whether it's a dot, or a colon, or any other character, the addition feels extraneous and unnecessary after having written thousands of lines of CUE over the past few months. |
@capelio I will be curious if your perspective changes with definitions, especially if they are used to exclude options from the data model, and not just as a "root type". I find having to write
somewhat annoying. There may be a way to relax it, though. |
I agree with @cup. The space notation was a surprise to me. I would be less surprised if dots were used for shorthand notation of nested structs. Though I am open to other, non-space, characters being used. Tools and languages that I have experience with do not use the space as notation, rather it is used as a word (identifier) boundary. Almost any other character could be used in place of the space here and it would be less surprising than the space.
Especially when dots are used for access:
These two things do not seem related to me at first glance. They lack a symmetry of shape. I do also agree with @capelio that CUE should grow and evolve into its own thing based on its needs. However, I would err on the side of a similar feel to existing languages and tools, as a way to foster adoption. What good is it to be the best tool if no one uses it due to complexity and confusion? |
@ukiahsmith @cup @rogpeppe We have done some analysis in the meantime about what future extensions could look like, what kind of issues one would run into with such extensions, etc. Personal preferences aside, using space as a separator seems unsustainable. That said, dots solve half the problem, but not all. For one possible future construct that makes it easy to deal with shadowing issues, the dots are actually fatally problematic. Finally, I'm looking into expanding the querying capabilities of CUE. One of the main reasons why this is needed is exactly to be more expressive on the LHS (e.g. to support JSON schema's patternProperties). So this would suggest that symmetry is quite beneficial. However, symmetry gets very confusing when some features of a query language are allowed on the RHS, but not on the LHS. For instance, I could imagine that supporting a JMESPath-like multiselect hash/list makes sense on the RHS. It would, however, definitely not make sense on the LHS. Aside from symmetry, I also find the dots don't jibe well with the scoping rules, causing syntax and semantics to be misaligned. So the current thinking is to define query syntax modularly (e.g. a
would be both valid YAML and CUE. In CUE one could write this as Note that the spaces originated from GCL and HCL, especially the latter. |
I really like the syntax of I would prefer to not have whitespace significant syntax. I haven't seen evidence of it in other CUE syntax elements, only having it here would be too confusing. Could it also be confusing if it's valid YAML and CUE?
|
As per the spec, the only significant whitespace characters are newlines after some tokens. This wouldn't change anything in that respect - the significant whitespace in the YAML example is not significant in CUE. This would mean exactly the same thing:
|
As a consequence of colon separator syntax the |
I find this unappealing. After first learning how to assign to values this would make me think that these are empty or null.
I would rather the quick nesting syntax be this. I believe this is less of a surprise as to the outcome. I see it and quickly think this is a two in a c value in a b value in an a value.
|
I think the point is that the newline separated syntax is possible but that wouldn't be the conventional way of formatting it, and running it through |
@mpvl Just an early idea, but a possible solution could be to do something like the following by replacing the placeholder field with
|
- parser: parse old and new style. - format: format new style - update generated examples and tutorial Issue #60 Change-Id: I52e0d64932cc850281c96d6ad8d1eafca59f50b0 Reviewed-on: https://cue-review.googlesource.com/c/cue/+/3550 Reviewed-by: Marcel van Lohuizen <[email protected]>
The latest build of CUE contains an implementation of the I hope to post a design doc of the new features soon. In the mean time, feedback on the new syntax is welcome. Note that as a consequence of this change, the "template" syntax ( |
@xinau The |
@cup #165 contains a bit more analysis of the pros and cons, but by no means complete still. But in the end, dots had serious parsing issues and the benefits of using One more example of why colons worked better: compare defining a map of map of ints using the various approaches:
So the decision has been made to go with colons. |
- remove quoted identifiers in favor of using strings and better aliasing Issue #60 Change-Id: I6c6840f59e89678bced077a6cbd9bbe87ec4086e Reviewed-on: https://cue-review.googlesource.com/c/cue/+/3551 Reviewed-by: Marcel van Lohuizen <[email protected]>
This issue has been migrated to cue-lang/cue#60. For more details about CUE's migration to a new home, please see cue-lang/cue#1078. |
This input:
produces this with CUE
and this with YAML:
this seems like too much sugar to be worth it
https://github.com/cuelang/cue/blob/master/doc/tutorial/basics/fold.md