Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CS2] add parens to chained do IIFE #4666

Merged
merged 3 commits into from
Aug 29, 2017

Conversation

helixbass
Copy link
Collaborator

Fixes #3736

@GeoffreyBooth this implements the behavior that @darky was looking for in #3736. I think the compilation he wants is almost always what someone would want when chaining after a do -> ... (as discussed, the only real use case for chaining off of a function literal (like now enabled by #4665) would seem to be .call()/.apply()). So I think it's actually sketchy that it currently (post-#4665) unintuitively compiles the chain against the function object

So now this:

do ->
  a
.b

will compile to:

((function() {
  return a;
})()).b;

(ie (do -> a).b)
rather than the current:

(function() {
  return a;
}).b();

(ie do ((-> a).b)

But behavior for chained do <non-IIFE> shouldn't change, ie this:

do a
.b

should continue to be treated as do (a.b). This one has bitten me a couple times but I see how both ways could be considered intuitive and I've learned to avoid it

So the added rewriter pass just targets do followed by -> or => (ie an IIFE) whose body is followed by a ., and wraps the do <IIFE> in parens (added tests for indented and inline function bodies as well as that existing chained do <non-IIFE> behavior is preserved)

I think the inconsistency this would introduce of chained do <IIFE> vs chained do <non-IIFE> is far outweighed by the unintuitive, non-useful current compilation of chained do <IIFE>

action = (token, i) ->
return unless token[0] is '.'
@tokens.splice doIndex, 0, generate '(', '(', @tokens[doIndex][2]
@tokens.splice i + 1, 0, generate ')', ')', ['', 'implicit parentheses', token[2]]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@GeoffreyBooth @lydell I'm not too sure what these origin args should be?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

origin is just a reference to whatever token we copied the location data from. So for (, that should be the do token, and for ), that should be whatever token immediately precedes the ) after it's inserted.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@GeoffreyBooth ok updated origin args to reference those tokens (I believe)

condition = (token, i) ->
@tag(i - 1) is 'OUTDENT'
action = (token, i) ->
return unless token[0] is '.'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be token[0] in CALL_CLOSERS to include :: etc.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@GeoffreyBooth updated to in CALL_CLOSERS and changed one test to use ?. instead of .

action = (token, i) ->
return unless token[0] is '.'
@tokens.splice doIndex, 0, generate '(', '(', @tokens[doIndex][2]
@tokens.splice i + 1, 0, generate ')', ')', ['', 'implicit parentheses', token[2]]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

origin is just a reference to whatever token we copied the location data from. So for (, that should be the do token, and for ), that should be whatever token immediately precedes the ) after it's inserted.

@GeoffreyBooth
Copy link
Collaborator

Looks great! @lydell or anyone else want to review?

@GeoffreyBooth
Copy link
Collaborator

One could argue that this belongs in addImplicitBracesAndParens, as you’re adding implicit parentheses; but one could also argue that that function is overstuffed as it is and should be broken up.

@jashkenas
Copy link
Owner

This really shouldn't need an extra rewriter rule — it's a sign that we haven't fully thought through the grammatical model. Ideally, there would be a change to the grammar that would make this work instead.

@helixbass
Copy link
Collaborator Author

@jashkenas I thought about that but it looked like at that point you're in the realm of operator precedence. Seemed much more dangerous/sweeping to mess with that rather than just handle this specific case (at least for now)

@GeoffreyBooth it seems that way based on naming but addImplicitBracesAndParens() is so implicit-object/implicit-call specific. And yes, didn't seem like there would be anything clean about trying to wedge this new logic into that hairy beast

@GeoffreyBooth
Copy link
Collaborator

Well, do doesn’t exist in the grammar, nor is it a node class. It’s really this funky keyword that gets disposed of in the lexer/rewriter. Maybe it shouldn’t be, but that’s a much bigger refactor.

Personally I think we should get rid of do, but not for 2.0.0 as it’s an unnecessary breaking change. I feel like it’s our version of JavaScript’s with.

@GeoffreyBooth
Copy link
Collaborator

I take back the first part of my previous comment. It’s part of the Op class. So yeah, maybe we can handle this in nodes?

@helixbass
Copy link
Collaborator Author

@GeoffreyBooth I think this lives on the lexer/rewriter/grammar side of the fence, in nodes you'd have to be restructuring the node tree based on detecting this particular case?

And for the record -1 on getting rid of do. I use it for IIFEs constantly as a way to encapsulate related code (that I may not want to refactor into a separately scoped method) and help greatly simplify the amount of state/control flow I need to try and visually parse/keep track of when reading a function body

@GeoffreyBooth
Copy link
Collaborator

Fair enough. I guess maybe it’s a question of style, I very rarely use IIFEs.

As for lexer/rewriter/nodes, where is () added? If it’s added in the lexer or rewriter, then that’s where this fix should go. If they’re created in nodes, then we should handle this there. @jashkenas?

@carlsmith
Copy link
Contributor

carlsmith commented Aug 29, 2017

Yeah, I often use do -> as well, mostly to keep names as local as possible, something like...

updateElement = do ->
  $element = $ selector
  return -> $element.method arg

I would actually like CoffeeScript to introduce with expressions, but instead of doing what JS does, it would instead just bind the object to this, avoiding the issue with clobbering. For example:

with jQuery then @ajax args...

I'm a bit of a Crockford acolyte, so rarely reference this, but still wish I could use the @name shorthand. I'm not sure that this kind of with expression would be as useful otherwise.

@jashkenas
Copy link
Owner

jashkenas commented Aug 29, 2017

Well, do doesn’t exist in the grammar, nor is it a node class. It’s really this funky keyword that gets disposed of in the lexer/rewriter. Maybe it shouldn’t be, but that’s a much bigger refactor.

Yes. It started out as a keyword, (eg. do value) and is now something a little more special. To parse it correctly, ideally we'd make it its own node/terminal in the grammar. Then it would either turn into a keyword, or into a special IIFE form: do -> -> IIFENode. Then we can just parse with the parser and not with hacks ;)

The rewriter is supposed to only be for syntax that the parser really can't handle. Not things that it can.

Edit: Also, I'd be +1 for removing do, but it's probably too late now. I didn't want it to be added in the first place — it was a concession for folks who wanted to shadow variables.

Edit2: Original introduction / reasoning here: #959

@GeoffreyBooth
Copy link
Collaborator

@jashkenas I feel like the refactor you're describing should be its own PR. Would you be okay with merging this as it is and we can tackle that cleanup separately?

@jashkenas
Copy link
Owner

Sure -- you're calling the shots. Go for it.

@helixbass
Copy link
Collaborator Author

@GeoffreyBooth it looks like it's added in nodes (Op::generateDo()), but I think that's the problem is that by that time it's too late - under current behavior, chained do <IIFE> will have already been parsed with the chain against the function object (not the executed function), so in nodes you can't really back out of that. Whereas if do <IIFE> were in whatever sense equivalent to <IIFE>() from the perspective of a subsequent chain (by or at the time parsing happens) then the special case rewriting wouldn't be necessary

So in theory @jashkenas is right, and do should be handled differently at the grammatical level (I was surprised when I saw that it was just an ordinary old unary operator). If I get to where I feel like I fully understand how it's currently handled, I'd be happy to tackle trying to restructure it that way. But I don't see how we wouldn't then run into the chained do <non-IIFE> case (see PR description) very quickly - it'd presumably be parsed/handled the same way as this handles chained do <IIFE>, which is a breaking change - like I said, I've been bitten by its current behavior but I can see both sides and I'm sure people use/rely on its current behavior. To rephrase, I don't see how if we tweak the grammar to make this:

do ->
  a
.b

parse as (do -> a).b (ie get .b to be lower precedence than do by whatever means) it wouldn't also cause this:

do a
.b

to parse as (do a).b (rather than the current do (a.b)). Introducing this breaking change would be even a little ickier since there's no distinguishing whether the chaining appears indented, ie the last example is the same as:

do a
   .b

So (at least for now) I don't see why handling this in the rewriter is bad, it's pretty clean as such things go

@carlsmith exactly, it's great for "localizing" state needed by a closure like that. And I like the style of that with idea but can't imagine it would make it through the firing squad. I tend to use a bound_to helper for the same effect, eg:

bound_to jQuery, ->
  @ajax args...
  ...

where bound_to = (context, clb) -> clb.call context

@helixbass
Copy link
Collaborator Author

ah @jashkenas ya if you're comfortable distinguishing do <IIFE> from do <non-IIFE> at the grammatical level then we could probably preserve existing chained do <non-IIFE> behavior while supporting the chained do <IIFE> behavior that this PR supports

But so ya if you guys are good with it I'd say merge this for now and I can plan on tackling it at a grammatical level later (and at that point presumably remove the need for this rewriting)

@jashkenas
Copy link
Owner

@helixbass That's fundamentally what you're doing with this change: Distinguishing do -> from do value. So yes, we agree. It's inconsistent, but probably what you intend, and can be handled in the rewriter for now, but should be moved to the grammar later.

The fewer passes over the source code the rewriter has to do, the better. They're one of the more expensive (in terms of compile time) changes you can make.

@GeoffreyBooth
Copy link
Collaborator

Okay, I’m going to merge this in now because most people would consider the #3736 behavior a bug, and this fixes that. Then if we want to refactor, that can be a new PR. I don’t know if that effort needs an issue (other than this one) but feel free to open one if you want a reminder or some way to discuss implementation before submitting a PR.

There are other similar features that should probably get some similar refactoring love, like interpolated strings/CSX, or chaining overall; and I might prioritize those more highly than do, which I still assume at least has to be less commonly used than interpolated strings. But they’re all messy parts of the compiler that would be good to clean up.

@GeoffreyBooth GeoffreyBooth merged commit e54b8a1 into jashkenas:2 Aug 29, 2017
@connec
Copy link
Collaborator

connec commented Aug 30, 2017

This doesn't work when the IIFE has arguments:

do (a) ->
  a
.b
# (function(a) {
#   return a;
# }).b()

@helixbass
Copy link
Collaborator Author

@connec good point, opened #4672. But this just makes it more obvious that this ultimately belongs in the grammar rather than the rewriter

@carlsmith
Copy link
Contributor

@helixbass - Thanks. Yeah, with is a non-starter, though to be fair, your bound_to example shows CoffeeScript is expressive enough that with would be cute, but almost useless.

GeoffreyBooth pushed a commit that referenced this pull request Aug 31, 2017
* add parens to chained do iife [Fixes #3736]

* remove debug code

* fixes from code review

* handle iife with params

* add test w/ destructured param from code review
@GeoffreyBooth GeoffreyBooth mentioned this pull request Sep 2, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants