Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Syntax flaw: Function types and error sets ambiguity #1681

Closed
Hejsil opened this issue Oct 24, 2018 · 6 comments
Closed

Syntax flaw: Function types and error sets ambiguity #1681

Hejsil opened this issue Oct 24, 2018 · 6 comments
Milestone

Comments

@Hejsil
Copy link
Contributor

Hejsil commented Oct 24, 2018

This simplified grammar of the function type syntax is ambiguous:

FnProto = Keyword_fn LParen RParen TypeExpr
TypeExpr = LeafExpr ExclamationMark LeafExpr | LeafExpr
LeafExpr = Identifier | FnProto
// fn () (fn () a!a)
  0: FnProto
  1: Keyword_fn LParen RParen TypeExpr
  2: Keyword_fn LParen RParen LeafExpr
  3: Keyword_fn LParen RParen FnProto
  4: Keyword_fn LParen RParen Keyword_fn LParen RParen TypeExpr
  5: Keyword_fn LParen RParen Keyword_fn LParen RParen LeafExpr ExclamationMark LeafExpr

// fn () (fn () a)!a
  0: FnProto
  1: Keyword_fn LParen RParen TypeExpr
  2: Keyword_fn LParen RParen LeafExpr ExclamationMark LeafExpr
  3: Keyword_fn LParen RParen FnProto ExclamationMark LeafExpr
  4: Keyword_fn LParen RParen Keyword_fn LParen RParen TypeExpr ExclamationMark LeafExpr
  5: Keyword_fn LParen RParen Keyword_fn LParen RParen LeafExpr ExclamationMark LeafExpr
@ghost
Copy link

ghost commented Oct 24, 2018

My answer assumes the language wants to have the ability to put arbitrary types on the left hand side of '!' as this may be useful to interface with C's possibly upcoming _Either(A, B) metatype which would allow Zig to interface with C++'s possibly upcoming value-based exceptions. Arbitrary types on the left hand side would also allow Zig-only code to propagate something other than an error set up the stack, if that's something people may want in the future. If people want this then maybe it's not a good idea to significantly simplify the grammar for '!' or the TypeExpr, but I think you could:

A) Change TypeExpr to not include FnProto, forcing parentheses in additional cases such as "fn () (fn () void)"
B) Change the grammar for "!" to make the left hand side simpler, forcing parentheses in fewer cases

@andrewrk andrewrk added this to the 0.5.0 milestone Oct 24, 2018
@andrewrk
Copy link
Member

@Hejsil is there a problem with resolving these syntax ambiguities by choosing which syntax is more common and then accepting that as the resolution to the ambiguity?

The big problem with the return type syntax was that it led to a confusing compile error, it was really easy to trigger the ambiguity, and required hacks in the parser. On the other hand this issue (and some of the other syntax flaw issues) seem innocuous to me.

Do you see these issues as problems specifically with the formal grammar specification, or do you see them as problems that plague the language and require syntax changes?

@ghost
Copy link

ghost commented Oct 24, 2018

fn foo() fn() error!void {}
fn foo() (fn() error!void) {} // opt 1
fn foo() (fn() error)!void {} // opt 2

If you'd like opt 1 to be the default, make the LHS of '!' what it expects now minus 'fn'. Btw @Hejsil, "TypeExpr" is already being used in the grammar. I just noticed.

@Hejsil
Copy link
Contributor Author

Hejsil commented Oct 25, 2018

@andrewrk All the ambiguities I report are in the formal grammar.

There are a few reasons why I think we should resolve the ambiguities in the grammar instead of "the parser chooses what is more common":

  • Certain ambiguities I've found are not parses how they are most used:
async<comptime A> fn()void {}
// Parsed like this async<(comptime A> fn()void) {}
  • If we expect other people to implement parsers that are consistent with the Zig compiler, then a none ambiguous grammar is a nice tool to have (It'll make it easier to have stage1 and 2 be consistent with each other).

Having an unambiguous grammar makes it harder to have syntax problems like #760. It also allows for fast prototyping with parser generators which also allows us to validate, that the syntax we chose does not interact weirdly with other syntactic constructs.

Ofc, we could also just leave the grammar ambiguous. I just think it defeats the purpose of having one.

@Hejsil
Copy link
Contributor Author

Hejsil commented Oct 25, 2018

Just looked a little more into formalized grammars, and it seems that there is something called a Parsing expression grammar. The main strength of this grammar form is that it cannot be ambiguous:

the choice operator selects the first match in PEG, while it is ambiguous in CFG. This is closer to how string recognition tends to be done in practice, e.g. by a recursive descent parser.
Unlike CFGs, PEGs cannot be ambiguous;

This would make it easier to specify a correct grammar, though we still have to look out for rules that would require unlimited lookahead:

Due to the unlimited lookahead capability that the grammar formalism provides, however, the resulting parser could exhibit exponential time performance in the worst case.
It is possible to obtain better performance for any parsing expression grammar by converting its recursive descent parser into a packrat parser, which always runs in linear time, at the cost of substantially greater storage space requirements.

Ofc, a handwritten implementation can use as many tricks as it wants to avoid doing unlimited lookahead.

@Hejsil
Copy link
Contributor Author

Hejsil commented Nov 13, 2018

Fixed: #1685

@Hejsil Hejsil closed this as completed Nov 13, 2018
@andrewrk andrewrk modified the milestones: 0.5.0, 0.4.0 Apr 8, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants