init checks and 'out' parameters #14521

Araq · 2020-05-31T18:22:48Z

No description provided.

Araq · 2020-05-31T18:26:33Z

Note: Needs an RFC.

Clyybber · 2020-05-31T20:52:54Z

I think a callsite magic to not prevent observable stores is a better idea than introducing out params.
I fear that the existance of outparams will lead people will write two versions of every proc, one with the result being an outparam and the other normally. IMO they should never be needed.

Araq · 2020-05-31T21:01:14Z

It is not about observable stores though, it's about "definite assignment" analysis.

…ode breakage

timotheecour · 2020-05-31T21:55:21Z

if 99% of the time tyOut is treated like tyVar, why not instead introduce tfOut and use that flag in the 1% code locations where you need to distinguish

As you mentioned, new type kinds are expensive in terms of code maintenance (and well, i was kind of hoping to use the tyOpt spot for tyAliasSym in #11992, but it's not just about that)

Araq · 2020-06-01T06:14:38Z

if 99% of the time tyOut is treated like tyVar, why not instead introduce tfOut and use that flag in the 1% code locations where you need to distinguish

Fair enough. I thought a type flag makes the language less "introspectable" because you cannot query a type flag as easily from macros.nim.

timotheecour · 2020-06-01T06:41:40Z

Fair enough. I thought a type flag makes the language less "introspectable" because you cannot query a type flag as easily from macros.nim.

the same will hold for macro writers; I am better that consolidating var+out will lead to less bugs even for macro writers because it'll avoid cases like "oh i forgot to handle out kind", and the majority of cases (like the 99% here) would not care to distinguish out vs var.

Whether to expose type flags is a different story (because of minmizing leaks of internal detail) but it's totally find to introduce an api analog to proc querySetting*(setting: SingleValueSetting): string but for types, where we'd only expose things that are needed for macro clients, possibly tfOut

=> stable macro API hiding internal details that may change

some tidbits of the RFC would be nice too...
i wonder if it makes [RFC] out params: allows to call in-place proc with rvalues RFCs#62 more relevant
why not instead proc fun(a {.out.}: var int) = discard ?

after all, out only makes sense in that context, unlike things like ptr T where ptr makes sense in all contexts, and the minor (in terms of spec difference) difference bw var and out doesnt' justify using a new decorator out

Araq · 2020-06-01T07:36:31Z

why not instead proc fun(a {.out.}: var int) = discard ?

It encourages "lying" as the shorter version is the wrong one.
Compat problems because version 1.0 lacks pragmas for parameters...

timotheecour · 2020-06-01T07:46:49Z

Compat problems because version 1.0 lacks pragmas for parameters...

this compiles with 1.0

when false:
  proc main(a {.out2.}: var int)=
    echo a
  main()

(but in other cases this is a recurring problem: lexer tries to to too much, instead of deferring to semantic; It makes since less useful; we can patch nim 1.0.x+1.2.x to make lexer produce an error node for yet unsupported lexical things, so that since (or when false) will produce no error)

shorter version is the wrong one

that's definitely application specific, there are legitimate cases where var is what you need (eg passing a Table by var). Unless I'm misunderstanding the yet-to-be-written RFC.

Araq · 2020-06-01T07:48:54Z

Unless I'm misunderstanding the yet-to-be-written RFC.

I think you do, read the spec additions please in this PR.

timotheecour · 2020-06-01T08:53:01Z

I had only read changelog + skimmed code, hadn't seen manual.rst diff; now that I've read it I have many questions

my point above remains more valid than ever. I'd actually want to use var instead of out in almost all cases in my code: input-output parameters is the most flexible, practical, even with performance in mind. Not to mention, append semantics is an important use case that out can't do.

It is enforced that every code path assigns a value to every out parameter before the routine returns

that seems to imply that's also the case if it throws; if so, that severely restricts what you can do for initialization.
In fact even the example you gave violates the spec:

proc divmod(a, b: int; res, remainder: out int) =
    res = a div b # if this throws
    remainder = a mod b
var res, remainder
divmod(0,0, res,remainder) # this raises, and `remainder` was unassigned even though the routine returned

it's even worse in following example:

proc divmod(a, b: int; a2: out int, res: out int) =
  a2 = a # observable store
  res = a div b # raises

divmod(0,0, a2, res) # a2 was modified, but not res (see also ObservableStores discussion)

so that implies you can only initialize the out params inside the proc with code that doesn't raise (or only raises defects AND user has --panics:on)

that means the only thing you can use in practice is use builtin constructors eg myparam = A(x: 10, y: "foo"), even if user doesn't care about any of those out guarantees

performance

It is enforced that every code path assigns a value to every out parameter before the routine returns and before a read from the out parameter can be performed

If I'm reading this right it means the proc itself can't read any out params before all the out params were assigned (otherwise it'd be redundant with before the routine returns)

A typical use case for var param is buffer reuse, eg:

proc toStr[T](result: var string, b: T) =
  result.setLen 0
  # etc

with out param, this wouldn't work because you'd require: result = "" before any operation.
in particular, result.setLen 0 followed by result.add $T is actually violating the out spec since it reads result.capacity. Even if you allow this (say because it's "builtin"), that doesn't generalize to user defined types, eg if user has a custom type, it'd prevent any use of caching, debugging fields, isInitialized fields etc; in effect user would have to decouple the "input-output fields" from the "output-only fields" before calling such proc. I don't see how that'd even work with a user-defined MySeq[T] type (with len, capacity, data)

breaking change

Seeing as you had to add result = 0 (etc) initializers in many places, this makes me wonder whether this would cause code breakage.

out = `{.requiresInit.}` ?

can't {.requiresInit.} fit the bill for out, eg: proc fn(a: int, b {.requiresInit.}: var int) = discard

desired goals + alternative approach?

I'm wondering what are the real benefits here to outweigh the other factors, and if there's alternative approach.
it seems like the key sentence is:

There is no semantic difference between ``proc p(x: out T) {.raises: [].}`` and ``proc p(): T {.raises: [].}``
but how these constructs are mapped to machine code might differ.

but the proposal needs more details here; if there is no semantic difference, then the only reason for out is that it would be more efficient; but maybe there's another way to achieve that.
Eg instead of nimZeroMem for result variable, semantic phase could use a user defined reinit which is similar to reset except it allows keeping memory buffers; it's the generalization of setLen 0 for user defined types. eg:

proc joinPath(a, b: string): string =
  # result.reinit # implicitly called by semantic phase
  result.add a & '/' & b

var c: string
for i in 0..<10:
  c = joinPath("foo", "bar") # calls c.reinit before each iteration, which for string is `c.setLen 0`

=> no allocs, buffers are reused, and the friendlier return syntax is used

Araq · 2020-06-01T12:13:16Z

desired goals + alternative approach?

More precise C interop (see stat) plus it allows us to improve the existing practice of using (res: var T): bool instead of Option[T]. Note that we all agree the existing practice is bad but it turns old lying code into code that doesn't lie anymore without API breakages. I will expand this point in the RFC.

Futhermore, out parameters finally allow us to write re-usable constructors for object hierarchies. More in the RFC.

Araq · 2020-06-01T12:14:50Z

but in other cases this is a recurring problem: lexer tries to to too much, instead of deferring to semantic; It makes since less useful; we can patch nim 1.0.x+1.2.x to make lexer produce an error node for yet unsupported lexical things, so that since (or when false) will produce no error

I think more often than not we got it right, see for example how the parser always allowed for let x {.importc.} .

timotheecour · 2020-06-01T13:14:44Z

More precise C interop (see stat)

for me, that's just 1 of several parameter specfiers, eg: restrict, const, and I don't see a reason to single this out.
Note also that, unlike restrict and const, out doens't exist in C so it's purely a matter of "faith" in fact that C function behaves like an out param.

And in many cases it doesn't behave like out:

sprintf (input buffer only partially written to)
memmove (even memcpy, which is undefined behavior if regions overlap; not sure whether UB counts)
realloc (more like input output)

semantic of `out` in D

in D, out just means parameter a is initialized upon function entry with a.init (our default(T)); that actually is IMO already a better semantic as it's done by the compiler instead of require require user to do it (and then checking with static analysis); and can be optimized away when compiler can prove its safe to do so. And it's in fact similar to how nim handles result initialization (or variable initialization) via nimZeroMem.

That said, IIRC it's not a very popular feature as oftentimes input output params are more practical.

Araq · 2020-06-01T14:13:40Z

that actually is IMO already a better semantic as it's done by the compiler instead of require require user to do it (and then checking with static analysis); and can be optimized away when compiler can prove its safe to do so.

Maybe sure, but for now I consider it an unimportant detail.

planetis-m · 2020-06-02T05:51:24Z

I think out parameters is a different enough concept (from var) that deserve their one language constract and not hidden / confused with var.

zah · 2020-06-03T08:56:25Z

BTW, out is already used for specifying covariant generic parameters.

I would check if this patch breaks the test suite with -d:nimEnableCovariance (Araq, as a reminder, you've promised to consider making this the default)

Araq · 2020-06-03T16:16:45Z

-d:nimEnableCovariance (Araq, as a reminder, you've promised to consider making this the default)

I have no interest whatsoever in covariance. :-/

zah · 2020-06-03T16:33:13Z

The covariant generic parameters seem like an obscure feature now, but they will become quite important when people start defining smart pointer types in Nim. The feature is essential for preserving the sub-type relationship between ptr Derived and ptr Base when using custom pointer types.

timotheecour · 2020-06-03T20:26:28Z

@zah => timotheecour#285 to avoid hijacking this

Araq · 2020-06-04T12:41:18Z

Note: The upcoming RFC for out parameters will likely be considerably different from this implementation.

Araq · 2020-06-23T08:08:42Z

Note: I removed out parameters again but kept the minimal support for it in the parser because it might be useful if we ever add this feature to Nim.

timotheecour · 2020-06-24T08:47:26Z

@Araq @Clyybber I just want to make sure we're not going in circles

some of the changes in this PR relating to result initialization are not so good IMO (if they're indeed needed, it would need some kind of explanation):

for examples things like this:
https://github.com/nim-lang/Nim/pull/14521/files#diff-31a643844935ff24fb24f54b472f38cbR110
which adds result = "" which doesn't seem necessary; since no change in semantics to result read before initialization was done so it's not warranted, and in fact a ton of code relies on this so it doesn't make sense to have these initializations added in cases like this

and in fact that change gets reverted here: https://github.com/nim-lang/Nim/pull/14777/files?file-filters%5B%5D=#diff-31a643844935ff24fb24f54b472f38cbL110

Araq added 7 commits May 31, 2020 14:19

I don't care about observable stores

96bade5

enforce explicit initializations

02646f0

cleaner code for the stdlib

c153e7b

stdlib: use explicit initializations

52de22e

first implementation of 'out' parameters

204217a

documented 'out' parameters

dba6965

more refinements for 'out' parameters

8fa0d39

Araq added 2 commits May 31, 2020 21:09

more fixes for 'out' parameters

84c2d23

code improvements; added toutparam test

f6f5a28

coding style improvements, back to nnkMutableTy in order to prevent c…

42b3ec4

…ode breakage

make tests green

6d72f2c

algorithm.nim: set result explicitly

01163de

timotheecour mentioned this pull request Jun 3, 2020

does -d:nimEnableCovariance suffer from same issues as in java? timotheecour/Nim#285

Open

Araq added 5 commits June 22, 2020 21:02

fixes merge conflict

cf75f11

remove out parameters and bring the PR into a mergable state

85133a5

updated the changelog

a559d1d

Merge branch 'devel' into araq-init-checks

e3e7c62

undocumented the 'out' parameters

22d0e68

Araq merged commit da29222 into devel Jun 23, 2020

Araq deleted the araq-init-checks branch June 23, 2020 08:54

timotheecour mentioned this pull request Jul 12, 2020

result assignment should stay optional, except for var/lent result nim-lang/RFCs#241

Closed

metagn mentioned this pull request Nov 25, 2020

out T is documented in manual as alternative to var T but simplest example breaks #16131

Closed

canelhasmateus mentioned this pull request Nov 21, 2021

Deal with remaining disabled tests nim-works/nimskull#71

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

init checks and 'out' parameters #14521

init checks and 'out' parameters #14521

Araq commented May 31, 2020

Araq commented May 31, 2020

Clyybber commented May 31, 2020 •

edited

Loading

Araq commented May 31, 2020

timotheecour commented May 31, 2020

Araq commented Jun 1, 2020 •

edited

Loading

timotheecour commented Jun 1, 2020 •

edited

Loading

Araq commented Jun 1, 2020

timotheecour commented Jun 1, 2020

Araq commented Jun 1, 2020

timotheecour commented Jun 1, 2020 •

edited

Loading

Araq commented Jun 1, 2020

Araq commented Jun 1, 2020

timotheecour commented Jun 1, 2020 •

edited

Loading

Araq commented Jun 1, 2020

planetis-m commented Jun 2, 2020

zah commented Jun 3, 2020

Araq commented Jun 3, 2020

zah commented Jun 3, 2020 •

edited

Loading

timotheecour commented Jun 3, 2020

Araq commented Jun 4, 2020

Araq commented Jun 23, 2020

timotheecour commented Jun 24, 2020

init checks and 'out' parameters #14521

init checks and 'out' parameters #14521

Conversation

Araq commented May 31, 2020

Araq commented May 31, 2020

Clyybber commented May 31, 2020 • edited Loading

Araq commented May 31, 2020

timotheecour commented May 31, 2020

Araq commented Jun 1, 2020 • edited Loading

timotheecour commented Jun 1, 2020 • edited Loading

Araq commented Jun 1, 2020

timotheecour commented Jun 1, 2020

Araq commented Jun 1, 2020

timotheecour commented Jun 1, 2020 • edited Loading

performance

breaking change

out = {.requiresInit.} ?

desired goals + alternative approach?

Araq commented Jun 1, 2020

Araq commented Jun 1, 2020

timotheecour commented Jun 1, 2020 • edited Loading

semantic of out in D

Araq commented Jun 1, 2020

planetis-m commented Jun 2, 2020

zah commented Jun 3, 2020

Araq commented Jun 3, 2020

zah commented Jun 3, 2020 • edited Loading

timotheecour commented Jun 3, 2020

Araq commented Jun 4, 2020

Araq commented Jun 23, 2020

timotheecour commented Jun 24, 2020

Clyybber commented May 31, 2020 •

edited

Loading

Araq commented Jun 1, 2020 •

edited

Loading

timotheecour commented Jun 1, 2020 •

edited

Loading

timotheecour commented Jun 1, 2020 •

edited

Loading

out = `{.requiresInit.}` ?

timotheecour commented Jun 1, 2020 •

edited

Loading

semantic of `out` in D

zah commented Jun 3, 2020 •

edited

Loading