Wrong symbol lookup in template expansion in generic function #22605

arnetheduck · 2023-08-31T14:21:52Z

Description

type Xxx* = enum
  error
  value

type
  Result*[T, E] = object
    when T is void:
      when E is void:
        oResultPrivate*: bool
      else:
        case oResultPrivate*: bool
        of false:
          eResultPrivate*: E
        of true:
          discard
    else:
      when E is void:
        case oResultPrivate*: bool
        of false:
          discard
        of true:
          vResultPrivate*: T
      else:
        case oResultPrivate*: bool
        of false:
          eResultPrivate*: E
        of true:
          vResultPrivate*: T

template valueOr*[T: not void, E](self: Result[T, E], def: untyped): untyped =
  let s = (self) # TODO avoid copy
  case s.oResultPrivate
  of true:
    s.vResultPrivate
  of false:
    when E isnot void:
      template error: untyped {.used, inject.} = s.eResultPrivate
    def

proc f(): Result[int, cstring] =
  Result[int, cstring](oResultPrivate: false, eResultPrivate: "f")

proc g(T: type): string =
  let x = f().valueOr:
    return $error

  "ok"

echo g(int)

Related to #22599 but this time error has its own scope

Nim Version

1.6.14

Current Output

(oResultPrivate: false, eResultPrivate: "error")

Expected Output

(oResultPrivate: false, eResultPrivate: "f")

Possible Solution

No response

Additional Information

No response

edit: simplified

The text was updated successfully, but these errors were encountered:

arnetheduck · 2023-08-31T14:23:10Z

proc h(): string =
  let x = f().valueOr:
    return $error

  "ok"

echo h()

works as expected

metagn · 2023-08-31T15:15:36Z

Minimized:

type Xxx = enum
  error = "bad"

template valueOr(self: int, def: untyped): untyped =
  case false
  of true: ""
  of false:
    template error: untyped {.used, inject.} = "good"
    def

proc g(T: type): string =
  let x = 123.valueOr:
    return $error

  "ok"

echo g(int)

int param is needed, because full untyped parameters cause direct expansion of the macro in the generic proc.

Generic expansion captures enum field error and can't let go of it even with mixin. All of the following cause it:

type Xxx = enum
  error = "bad"

const error = "bad" 

template error: untyped = "bad"

arnetheduck · 2023-08-31T15:17:48Z

can you think of any workaround?

metagn · 2023-08-31T15:31:28Z

Making valueOr fully untyped, adding an unused const error = () inside the generic proc, or making a fully untyped wrapper template that does that (adds const error {.inject.} = () then calls valueOr). Supposedly in the future generic procs will be fully typed which would deal with this

arnetheduck · 2023-09-01T07:11:53Z

adding an unused const error = () inside the generic proc

this works, but unfortunately it is not transparent to the user of valueOr

Making valueOr fully untyped, fully untyped wrapper template that does that

Can't seem to get these right, what am I missing?

type Xxx* = enum
  error
  value

type
  Result*[T, E] = object
    when T is void:
      when E is void:
        oResultPrivate*: bool
      else:
        case oResultPrivate*: bool
        of false:
          eResultPrivate*: E
        of true:
          discard
    else:
      when E is void:
        case oResultPrivate*: bool
        of false:
          discard
        of true:
          vResultPrivate*: T
      else:
        case oResultPrivate*: bool
        of false:
          eResultPrivate*: E
        of true:
          vResultPrivate*: T

template valueOr2*(self, def: untyped): untyped =
  let s = (self) # TODO avoid copy
  case s.oResultPrivate
  of true:
    s.vResultPrivate
  of false:
    when self.E isnot void:
      template error: untyped {.used, inject.} = s.eResultPrivate
    def

template valueOr(self, def: untyped): untyped =
  const error = ()
  valueOr2(self, def)

proc f(): Result[int, cstring] =
  Result[int, cstring](oResultPrivate: false, eResultPrivate: "f")

proc g(T: type): string =
  let x = f().valueOr:
    return $error

  "ok"

echo g(int)

metagn · 2023-09-01T08:47:22Z

The const error needs to be {.inject.}ed, but there is another problem: valueOr/valueOr2 need to be called in normal call syntax rather than method call syntax for early untyped macro expansion to work, which could probably be fixed

Doing both things, then using valueOr but making valueOr2 typed and generic gives a wrong number of variables error which might be another bug

Fwiw this works:

template valueOr(self, def: untyped): untyped =
  const error {.inject.} = ()
  valueOr2(self, def)

proc g(T: type): string =
  let x = valueOr f():
    return $error

  "ok"

echo g(int)

The original valueOr2 will also work when used like valueOr2 f(): as long as it's untyped, without needing the const error injection

Edit: So the minimal working example is:

type Xxx* = enum
  error
  value

type
  Result*[T, E] = object
    ...

template valueOr*(self, def: untyped): untyped = # workaround part 1: make valueOr untyped
  ...

proc f(): Result[int, cstring] =
  Result[int, cstring](oResultPrivate: false, eResultPrivate: "f")

proc g(T: type): string =
  let x = valueOr f(): # workaround part 2: use normal call syntax instead of method call syntax
    return $error

  "ok"

echo g(int)

See also nim-lang/Nim#22605

metagn · 2023-09-07T09:17:44Z

Fwiw, the same issue exists with templates only (mixin still doesn't help):

template valueOr(self, def: untyped): untyped =
  block:
    template error: untyped {.used, inject.} = "good"
    def

const error = "bad"
template g =
  let x = 123.valueOr:
    $error
  echo x
g

I don't think this is less than a language design problem, and a pretty old one at that. So can't really give any solutions.

arnetheduck · 2023-09-07T09:32:34Z

@metagn Interesting repro indeed!

We've been aware of this problem in general for quite some time and implemented various random workarounds in "leaf" code like trying to come up with unique names for things - but this of course is not sustainable - specially not for "core infrastructure" like Result or even sequtils and I think it's urgent / important to address it - we have had this problem many times now and it's a gotcha that gets in the way of productivity - it's often discovered at runtime when the aliased symbols happen to be of compatible-enough types, and often the result of an upgrade of some transitive dependency which otherwise looks completely unrelated and is impossible to audit for this kind of breakage.

ie here's a good way to break any codebase - stick this in some commonly imported module:

type Breakit* = enum
  it

Araq · 2023-09-07T13:46:48Z

In the long run I want to replace untyped with typed.
That means code snippets are semchecked before they are passed to the template/macro.
This is totally consistent with every other builtin language construct btw like if and case and for.
Semchecked code is fundamentally at odds with symbol injections. Unless.
We pre-analyze templates for inject'ed symbol and require macros to declare what they inject. This only works when every overload of the template agrees with the set of injected symbols. This is like today's rule for when they need to agree on the position of untyped parameters and so we know it will cause problems. However, there is nothing we can do about it and when it does come up, one can always disambiguate via module.template.

metagn · 2023-09-07T14:58:39Z

Semchecked code is fundamentally at odds with symbol injections.

Well it's not difficult to make this use case semantically legible, if we have template lambdas we can take a template (error: E): untyped argument the user has to write.

The problem here isn't with untyped parameters, it's with untyped bodies with speculative symbol resolution, which is not a necessary design for generics but is almost the entire reason templates are used. So we need to think about the template case, but it also doesn't hurt to have generics working before the typechecking is implemented.

The template/generic body captures the error symbol because it thinks there's no other option, but the template is injecting a symbol that would ordinarily be preferred. Since it's pretty late to get rid of template injections entirely, we need to find logic to account for cases like this; I think developing template injection semantics more could instead make this logic more complex, but I wouldn't know.

It might be recency bias from the overloadable enum scope resolution issues but I think the simplest missing design is 2 dimensional symchoices, where the existing symchoices have 1 dimension where every choice has equal weight (since they are meant for routines which undergo overload resolution), but a second dimension of scope depth also exists in normal symbol resolution. Thankfully the way this dimension is used in symbol resolution is simple enough that we can probably represent this with just an optional "preferred" choice in the sym choice.

In the example, error would be an nkOpenSymChoice (since it's in a mixinContext) with 1 child (the captured sym), marked as preferred because it's unambiguous in scope. If the template taking the parameter doesn't inject an error, then it can resolve to its child; but if it does, since it's an open sym choice, the injected error would overtake it as the preferred choice during the semantic pass on the template output and then will become the resolved sym. Currently error is just the captured nkSym instead, because it wouldn't make sense for it not to be, because there is no logic to resolve sym choices in the compiler outside of overload resolution and type inference, which we can't use here.

arnetheduck · 2023-09-07T16:41:27Z

Just to further confirm that it is not the injected template that fails, this also breaks for injected variables:

template valueOr*[T: not void, E](self: Result[T, E], def: untyped): untyped =
  let s = (self) # TODO avoid copy
  case s.oResultPrivate
  of true:
    s.vResultPrivate
  of false:
    when E isnot void:
      let error {.inject.} = s.eResultPrivate # def also fails to pick this up
    def

Varriount · 2023-09-16T07:37:49Z

@metagn What would happen if all template symbols were open by default? What kinds of code would that break?

metagn · 2023-09-16T09:50:36Z

The compiler just doesn't semcheck symchoices, they're only handled in routine overloading or when narrowing to a type. So just emitting open symbols doesn't work for cases like this issue where the symbol isn't overloadable, like a constant or a type. Example of nothing being done:

import macros

const foo = 123
block:
  const foo = 456
  macro baz() =
    # this just happens to create a symchoice currently:
    result = bindSym("foo", brOpen)
    echo result.treeRepr # OpenSymChoice 2 "foo"
  let x = baz() # Error: expression has no type: foo

While fixing this would need some sifting through the compiler to flesh out where to preserve or resolve symchoices, we also need some disambiguation mechanism; which routines naturally have because they get called and overloaded, but other symbols don't (hence recent overloadable enum issues).

# a.nim
const foo* = 123

# b.nim
import a
const foo = 456
template bar*(): int =
  mixin foo
  foo

# c.nim
import b
echo bar()

If we just emitted open symchoices, foo here would become:

OpenSymChoice
  Sym "foo" # from b
  Sym "foo" # from a

The compiler can't disambiguate these when we call bar() in c.

In the scope of the template declaration, foo would normally directly resolve to the local b.foo. The compiler could detect this and output this instead:

OpenSymChoice
  Sym "foo" {preferred} # from b
  Sym "foo" # from a

There's no foo accessible from c, so the output symchoice could directly resolve to the marked b.foo. If we instead have:

# c.nim
import b
const foo = 789
echo bar()

Being an open symchoice, it will check the local scope again for an unambiguous foo, and if it finds it, can resolve to it instead of the ones from the sym choice.

This is all wishful thinking, it needs a lot of cooperation with the rest of the compiler and a sound design, unfortunately I haven't attempted anything yet.

Araq · 2023-09-16T16:08:27Z

This is all wishful thinking, it needs a lot of cooperation with the rest of the compiler and a sound design, unfortunately I haven't attempted anything yet.

Add a new node kind nkSemSymChoice where the first entry is to be preferred.

Varriount · 2023-09-16T19:13:28Z

So what happens if there are 2+ overloads in the same module as the template? Will those be preferred over other overloads?

metagn · 2023-09-16T20:47:45Z

No, neither will be preferred. If it's ambiguous at template scope, we can't mark any as preferred.

Varriount · 2023-09-17T06:35:28Z

No, neither will be preferred. If it's ambiguous at template scope, we can't mark any as preferred.

Hm, then I still feel it's a tad arbitrary and inconsistent, and increases the language's complexity (adding yet another edge case).

metagn · 2023-09-17T10:24:20Z

For the general case, there is a way simpler design, that multiple people have conceptually arrived at already. We can have nkOpenSym that when semchecked gets replaced by local syms if they exist or turns into an nkSym otherwise. This solves this issue.

The preferred stuff is a generalized version that happens to also solve another problem which is sym choices not retaining scope information in case they need to be disambiguated. This problem has technically always existed but is niche and only relevant now because of enums.

I might have made a mistake in combining these problems, and we can probably have both solutions separately. The second problem is probably not nearly as important either, sorry for bringing it up here if it turns out to be separate.

refs #22605 Sym choice nodes are now only allowed to pass through semchecking if contexts ask for them to (with `efAllowSymChoice`). Otherwise they are resolved or treated as ambiguous. The contexts that can receive symchoices in this PR are: * Call operands and addresses and emulations of such, which will subject them to overload resolution which will resolve them or fail. * Type conversion operands only for routine symchoices for type disambiguation syntax (like `(proc (x: int): int)(foo)`), which will resolve them or fail. * Proc parameter default values both at the declaration and during generic instantiation, which undergo type narrowing and so will resolve them or fail. This means unless these contexts mess up sym choice nodes should never leave the semchecking stage. This serves as a blueprint for future improvements to intermediate symbol resolution. Some tangential changes are also in this PR: 1. The `AmbiguousEnum` hint is removed, it was always disabled by default and since #22606 it only started getting emitted after the symchoice was soundly resolved. 2. Proc setter syntax (`a.b = c` becoming `` `b=`(a, c) ``) used to fully type check the RHS before passing the transformed call node to proc overloading. Now it just passes the original node directly so proc overloading can deal with its typechecking.

fixes nim-lang#11184, fixes nim-lang#22605, fixes nim-lang#20000

fixes nim-lang#22605, separated from nim-lang#22744

fixes #22605, separated from #22744 This marks symbol captures in macro calls in generic contexts as `nfOpenSym`, which means if there is a new symbol in the local instantiatied body during instantiation time, this symbol replaces the captured symbol. We have to be careful not to consider symbols outside of the instantiation body during instantiation, because this will leak symbols from the instantiation context scope rather than the original declaration scope. This is done by checking if the local context owner (maybe should be the symbol of the proc currently getting instantiated instead? not sure how to get this) is the same as or a parent owner of the owner of the replacement candidate symbol. This solution is distinct from the symchoice mechanisms which we originally assumed had to be related, if this assumption was wrong it would explain why this solution took so long to arrive at.

fixes #22605, separated from #22744 This marks symbol captures in macro calls in generic contexts as `nfOpenSym`, which means if there is a new symbol in the local instantiatied body during instantiation time, this symbol replaces the captured symbol. We have to be careful not to consider symbols outside of the instantiation body during instantiation, because this will leak symbols from the instantiation context scope rather than the original declaration scope. This is done by checking if the local context owner (maybe should be the symbol of the proc currently getting instantiated instead? not sure how to get this) is the same as or a parent owner of the owner of the replacement candidate symbol. This solution is distinct from the symchoice mechanisms which we originally assumed had to be related, if this assumption was wrong it would explain why this solution took so long to arrive at. (cherry picked from commit 9416595)

@Araq

This PR adds a workaround for the case where a global symbol is matched instead of the local `error`/`value` template in `valueOr` and friends, when `valueOr` is being used in a generic context. Two options are added: * when using Nim version supporting the experimental `genericsOpenSym` [feature](nim-lang/Nim#23873), we use it * when not, a macro tries to replace all accesses in the given body with the correct symbol - this matching is done by name and some minimal heuristics which potentially could be wrong - the upstream solution is obviously preferable Both solutions can be disabled via compile-time options to retain the old behavior of matching the global symbol. See also nim-lang/Nim#22605 thanks to @Araq for the macro hack and @metagn for the language fix!

refs #22605 Sym choice nodes are now only allowed to pass through semchecking if contexts ask for them to (with `efAllowSymChoice`). Otherwise they are resolved or treated as ambiguous. The contexts that can receive symchoices in this PR are: * Call operands and addresses and emulations of such, which will subject them to overload resolution which will resolve them or fail. * Type conversion operands only for routine symchoices for type disambiguation syntax (like `(proc (x: int): int)(foo)`), which will resolve them or fail. * Proc parameter default values both at the declaration and during generic instantiation, which undergo type narrowing and so will resolve them or fail. This means unless these contexts mess up sym choice nodes should never leave the semchecking stage. This serves as a blueprint for future improvements to intermediate symbol resolution. Some tangential changes are also in this PR: 1. The `AmbiguousEnum` hint is removed, it was always disabled by default and since #22606 it only started getting emitted after the symchoice was soundly resolved. 2. Proc setter syntax (`a.b = c` becoming `` `b=`(a, c) ``) used to fully type check the RHS before passing the transformed call node to proc overloading. Now it just passes the original node directly so proc overloading can deal with its typechecking. (cherry picked from commit 5f9038a)

arnetheduck changed the title ~~Wrong template lookup in generic function~~ Wrong symbol lookup in template expansion in generic function Aug 31, 2023

arnetheduck mentioned this issue Sep 1, 2023

results valueOr access to error template injected symbol fails with certain generic callers arnetheduck/nim-results#34

Closed

Araq added the Showstopper label Sep 1, 2023

arnetheduck added a commit to arnetheduck/nim-results that referenced this issue Sep 4, 2023

Workaround for #34

5ffb184

See also nim-lang/Nim#22605

arnetheduck mentioned this issue Sep 4, 2023

Workaround for #34 arnetheduck/nim-results#35

Merged

This was referenced Sep 11, 2023

Templates: Method call prefers injected proc to outside, normal call doesn't #20000

Open

undeclared identifier error when using untyped within proc taking generic arg #21249

Open

Inconsisten Symbol Binding Rules in Templates/Generics #11184

Open

metagn mentioned this issue Sep 17, 2023

make expressions opt in to symchoices #22716

Merged

metagn added a commit to metagn/Nim that referenced this issue Sep 23, 2023

open syms

3e67494

fixes nim-lang#11184, fixes nim-lang#22605, fixes nim-lang#20000

metagn mentioned this issue Sep 23, 2023

open + preferred syms #22744

Closed

3 tasks

metagn added a commit to metagn/Nim that referenced this issue Sep 29, 2023

open syms

795d0d1

fixes nim-lang#11184, fixes nim-lang#22605, fixes nim-lang#20000

arnetheduck mentioned this issue Oct 16, 2023

Write to Future.internalValue directly in macro-translated code status-im/nim-chronos#452

Open

Araq added a commit that referenced this issue Nov 27, 2023

fixes #22605; WIP

cf20d4c

metagn added a commit to metagn/Nim that referenced this issue Dec 18, 2023

allow replacing captured syms in macro calls in generics

341b04d

fixes nim-lang#22605, separated from nim-lang#22744

metagn mentioned this issue Dec 18, 2023

allow replacing captured syms in macro calls in generics #23091

Merged

Araq closed this as completed in #23091 Dec 18, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Wrong symbol lookup in template expansion in generic function #22605

Wrong symbol lookup in template expansion in generic function #22605

arnetheduck commented Aug 31, 2023 •

edited

Loading

arnetheduck commented Aug 31, 2023 •

edited

Loading

metagn commented Aug 31, 2023 •

edited

Loading

arnetheduck commented Aug 31, 2023

metagn commented Aug 31, 2023 •

edited

Loading

arnetheduck commented Sep 1, 2023

metagn commented Sep 1, 2023 •

edited

Loading

metagn commented Sep 7, 2023

arnetheduck commented Sep 7, 2023

Araq commented Sep 7, 2023

metagn commented Sep 7, 2023

arnetheduck commented Sep 7, 2023 •

edited

Loading

Varriount commented Sep 16, 2023

metagn commented Sep 16, 2023 •

edited

Loading

Araq commented Sep 16, 2023

Varriount commented Sep 16, 2023

metagn commented Sep 16, 2023 •

edited

Loading

Varriount commented Sep 17, 2023 •

edited

Loading

metagn commented Sep 17, 2023 •

edited

Loading

Wrong symbol lookup in template expansion in generic function #22605

Wrong symbol lookup in template expansion in generic function #22605

Comments

arnetheduck commented Aug 31, 2023 • edited Loading

Description

Nim Version

Current Output

Expected Output

Possible Solution

Additional Information

arnetheduck commented Aug 31, 2023 • edited Loading

metagn commented Aug 31, 2023 • edited Loading

arnetheduck commented Aug 31, 2023

metagn commented Aug 31, 2023 • edited Loading

arnetheduck commented Sep 1, 2023

metagn commented Sep 1, 2023 • edited Loading

metagn commented Sep 7, 2023

arnetheduck commented Sep 7, 2023

Araq commented Sep 7, 2023

metagn commented Sep 7, 2023

arnetheduck commented Sep 7, 2023 • edited Loading

Varriount commented Sep 16, 2023

metagn commented Sep 16, 2023 • edited Loading

Araq commented Sep 16, 2023

Varriount commented Sep 16, 2023

metagn commented Sep 16, 2023 • edited Loading

Varriount commented Sep 17, 2023 • edited Loading

metagn commented Sep 17, 2023 • edited Loading

arnetheduck commented Aug 31, 2023 •

edited

Loading

arnetheduck commented Aug 31, 2023 •

edited

Loading

metagn commented Aug 31, 2023 •

edited

Loading

metagn commented Aug 31, 2023 •

edited

Loading

metagn commented Sep 1, 2023 •

edited

Loading

arnetheduck commented Sep 7, 2023 •

edited

Loading

metagn commented Sep 16, 2023 •

edited

Loading

metagn commented Sep 16, 2023 •

edited

Loading

Varriount commented Sep 17, 2023 •

edited

Loading

metagn commented Sep 17, 2023 •

edited

Loading