Type checking and unreachable blocks #707

MikeHolman · 2016-06-10T18:51:39Z

Feels like there are some corner cases with type checking, and AFAIK there hasn't really been much discussion on it. Right now I am looking at cases like the following pseudo-code for a function which has been defined with an int result type:

(func
  (if condition
    (ret (const 1))
  else
    (ret (const 2))
  )
)

Should this validate? All paths return an int but the if/else doesn't yield a value.

If that should validate, what about this?

(func
  (if condition
    (ret (const 1))
    (const 1.5)
  else
    (ret (const 2))
    (const 1.5)
  )
)

Same type of scenario, but now it's arguable whether we consider that the if/else yields float or doesn't yield. If it yields, then I suppose we fail validation, because that (unreachable) path yields a float.

How are you guys handling this type of scenario today?

The text was updated successfully, but these errors were encountered:

binji · 2016-06-10T19:01:19Z

@rossberg-chromium is probably the best to explain this, but I'll try :)

They simple way I think of it is that there are two extra types, void and any (i.e. bottom and top). return and br always have type any, because they are unconditional control flow. if has a type which is the set intersection of its branch's types. A block always has the type of its last expression.

So in the first example, it validates because the type of the if is any, which includes i32.

In the second example, the two branches have a float type, because the last expression of the (implicit) block is a f32 (say), so the if has the same type. Therefore the function does not validate, because the two type sets (i32 and f32) are disjoint.

titzer · 2016-06-13T08:03:57Z

On Fri, Jun 10, 2016 at 9:01 PM, Ben Smith [email protected] wrote:

@rossberg-chromium https://github.com/rossberg-chromium is probably the
best to explain this, but I'll try :)

They simple way I think of it is that there are two extra types, void and
any (i.e. bottom and top). return and br always have type any, because
they are unconditional control flow. if has a type which is the set
intersection of its branch's types. A block always has the type of its last
expression.

So in the first example, it validates because the type of the if is any,
which includes i32.

In the second example, the two branches have a float type, because the
last expression of the (implicit) block is a f32 (say), so the if has the
same type. Therefore the function does not validate, because the two type
sets (i32 and f32) are disjoint.

Ben's right. This is exactly how type checking works in the V8 validator:
bottom up. Unreachable, return, br, and br_table always have a type any,
and any unified with a type T is always type T. Blocks have the type which
is the "unification" of all the br's that target, with the last expression
in a block considered to be an implicit br.

—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#707 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/ALnq1Dn4HC64VujAJco1dm1TeKP7ZTg9ks5qKbSbgaJpZM4IzOWR
.

rossberg · 2016-06-13T10:27:14Z

One nit: it's not unification but computing the LUB (at least until we decide for #694).

(And super-nit: I'd personally avoid the names void and any for this, since they usually have the exact opposite meaning, i.e., any as a top type void as a bottom type, whereas here it's the other way round.)

As for the question why the second example does not type-check: it's a very simple and conventional type system. Like most type systems, it only looks at data flow, not control flow. So whether there is an unconditional branch somewhere in a block does not affect the block's type.

binji · 2016-06-13T18:34:42Z

(And super-nit: I'd personally avoid the names void and any for this, since they usually have the exact opposite meaning, i.e., any as a top type void as a bottom type, whereas here it's the other way round.)

Ha, I knew I'd mess that up. :)

MikeHolman · 2016-06-13T21:00:53Z

Sounds good to me, thanks for the clarification!

titzer · 2016-06-14T15:26:51Z

On Mon, Jun 13, 2016 at 12:27 PM, rossberg-chromium <
[email protected]> wrote:

One nit: it's not unification but computing the LUB (at least until we
decide for #694 #694).

Right, yes. There are no type variables, so it's just walking up a type
lattice.

(And super-nit: I'd personally avoid the names void and any for this,
since they usually have the exact opposite meaning, i.e., any as a top
type void as a bottom type, whereas here it's the other way round.)

As for the question why the second example does not type-check: it's a
very simple and conventional type system. Like most type systems, it only
looks at data flow, not control flow. So whether there is an unconditional
branch somewhere in a block does not affect the block's type.

—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
#707 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/ALnq1Fjdd9PiepvSUd9qUeRDhc0tNmjGks5qLTCggaJpZM4IzOWR
.

Cellule · 2016-06-22T21:54:04Z

So whether there is an unconditional branch somewhere in a block does not affect the block's type.

I have run into some issue regarding this from the testsuite.
in block.wast in function break-inner
There's the statement (set_local 0 (i32.add (get_local 0) (block (i32.ctz (br 0 (i32.const 0x4))))))
I fail to see how this is valid with the current state of things.
br 0 does not yield a value, therefore, i32.ctz doesn't have an operand.
The only way I can see this passing validation is if after seeing an unconditional branch, we ignore the rest of the block.

Same goes for function drop-break-value-heterogeneous
Where (block (br 0 (i32.const 8)) (br 0 (f64.const 8)) (br 0 (f32.const 8))) cannot be valid if we check all instructions.
Because there is a conflict in the type the br 0s yields.

ghost · 2016-06-23T00:17:33Z

@Cellule Consider (i32.ctz (unreachable)) which is basically the same type case, and this is currently accepted as valid.

ghost · 2016-06-23T00:29:55Z

@Cellule oh, and the second example, if the types conflict then the result is the same as for nop, the void type, which is only a validation error if the value is consumed.

rossberg · 2016-06-23T06:59:41Z

Branches, like all unconditional control flow transfer, are simply polymorphic, i.e., they can have any type, in order to e.g. allow exiting from branches whose siblings do yield some value. Same treatment as in all other typed (expression) languages. I suppose we should have an FAQ for that, as this question keeps coming up. ;)

The operands to branches are implicitly dropped when the block targeted has type void. Thus, their types can differ in such cases -- they are all implicitly converted to void. There has been a long history of back & forth on this (#179, #180, #227, #271), but this arguably surprising freedom is removed with #694.

Cellule · 2016-06-23T19:30:19Z

Thanks,
for the different yield types I think I get it and it should be easy to implement.

On the other hand, I am a little fuzzy as to how to implement (i32.ctz (unreachable))
Do I have to check for unreachable for every operator and let it pass ?
Let's say for instance I have the following

(if 
  (i32.add 
    (i32.ctz (br 0)) 
    (f32.const 1.4)
  )
) (set_local 0 (return 0))

Since br 0 makes it unreachable,
then i32.ctz is now unreachable also (meaning to explicitly check for unreachable)
then is (i32.add (unreachable) (f32)) valid ?

Let's say it is valid or we put an i32.const instead of the float.
then is (if (unreachable) (...)) valid ?

I fail to see why any such things should be considered valid.
Is there a real use case for this ? Correct me if I'm wrong, but I doubt any tools we make would produce such patterns (aka produce dead code in the middle of an expression).
Why not just make unreachable a different type and reject the likes of i32.ctz (unreachable) since i32.ctz needs an i32 and unreachable != i32?

It feels like the equivalent of such a thing in c++ for example would be

{
  if (ctz(goto label1) + 1.4f) {
    int a = return 0;
  }
label1:
}

Which looks really silly.

ghost · 2016-06-23T23:21:53Z

@Cellule The 'unreachable' type does not propagate through typed operators. So the result type of (i32.ctz (br 0)) is i32 even if it is unreachable - an implementation might propagate the type differently for compilation, but it is i32 for validation. (i32.add (unreachable) (f32)) ~~is also valid~~ (is invalid due to the f32 argument - dead code is still validated). (if (unreachable) (...)) is valid, but if you are asking if dead code is still validated then yes it is.

This choice was probably to support top-down typing too.

Perhaps the disucussion in #654 would also be of interest in which is it proposed that the unreachable operator have the same type as the nop operator and that the result can not be consumed, and this removes this type from wasm. This is a useful simplification in an expressionless encoding.

rossberg · 2016-06-24T11:31:34Z

Do I have to check for unreachable for every operator and let it pass ?

Let's say for instance I have the following

(if
(i32.add
(i32.ctz (br 0))
(f32.const 1.4)
)
) (...)

Since br 0 makes it unreachable,
then i32.ctz is now unreachable also (meaning to explicitly check for
unreachable)
then is (i32.add (unreachable) (f32)) valid ?

No, type checking does not consider control flow. Both examples are
ill-typed. I'll add some tests for these case.

Let's say it is valid or we put an i32.const instead of the float.
then is (if (unreachable) (...)) valid ?

Yes, that's valid (though useless). I think there already is a test for
that.

I fail to see why any such things should be considered valid.
Is there a real use case for this ? Correct me if I'm wrong, but I doubt
any tools we make would produce such patterns (aka produce dead code in the
middle of an expression).
Why not just make unreachable a different type and reject the likes of i32.ctz
(unreachable) since i32.ctz needs an i32 and unreachable != i32?

There are infinitely many ways of writing dead or otherwise useless code
with any language construct. ;)

The reason control flow transfer should be polymorphic (instead of, say,
void) is that you want to be able use it in arms of branching constructs
that produce a value, e.g.:

(if (assertion)
  (i32.expression)
  (unreachable)
)

or more interestingly, (the encoding of) switch cases:

(switch (index)  ;; can be encoded with br_table
  case 0: (i32.expr)
  case 1: (i32.expr)
  case 2: (i32.expr)
  case 3: (i32.expr)
  default: (unreachable)
)

If control flow transfer was typed void then you'd be forced to insert dead
dummy values (and sometimes spurious blocks) for these cases, which should
be unnecessary.

IOW, the point of these typing rules is not to prevent dead code, but to
avoid requiring dead code.

FWIW, typing control flow transfer constructs polymorphically is completely
standard in typed expression languages. The most popular example probably
is throw, which is a polymorphic expression in all such languages I'm
aware of.

naturaltransformation · 2016-06-24T20:25:57Z

Since some of these type semantics issues have been confusing multiple people and the only "documentation" of the WASM type system is the spec implementation, would people be interested in a prose explanation of the type system?

naturaltransformation · 2016-06-24T20:26:38Z

The prose can also go into a discussion of the motivation in attempt to clarify things.

jfbastien · 2016-10-08T03:45:35Z

@rossberg-chromium was the design repo ever updated to clarify this? I don't see a commit but may have missed it.

distransient · 2016-10-10T18:21:45Z

I left open #747 on these things

MikeHolman · 2016-10-10T21:44:32Z

@jfbastien You're right, I was a bit hasty in closing this.

rossberg · 2016-10-11T09:17:21Z

@jfbastien, see the description of WebAssembly/spec#345 for the resolution we reached regarding unreachable code. The design repo is generally (and IIUC, somewhat intentionally) very vague about validation, so not sure where specific clarifications would fit in there exactly. I believe we agreed that the written spec will be the appropriate place to lay out the details. But everybody has been busy with catching up to 0xc, so not much progress on that front...

rossberg · 2017-02-07T17:53:54Z

This is old and has been superseded by more recent discussion.

MikeHolman closed this as completed Jun 13, 2016

MikeHolman reopened this Jun 22, 2016

Cellule mentioned this issue Jun 24, 2016

Wasm br/br_table yield chakra-core/ChakraCore#1197

Merged

sunfishcode added the control flow label Jul 8, 2016

sunfishcode modified the milestone: MVP Jul 8, 2016

flagxor added resolve for mvp and removed resolve for mvp labels Jul 19, 2016

MikeHolman closed this as completed Oct 8, 2016

MikeHolman reopened this Oct 10, 2016

rossberg closed this as completed Feb 7, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Type checking and unreachable blocks #707

Type checking and unreachable blocks #707

MikeHolman commented Jun 10, 2016

binji commented Jun 10, 2016

titzer commented Jun 13, 2016

rossberg commented Jun 13, 2016

binji commented Jun 13, 2016

MikeHolman commented Jun 13, 2016

titzer commented Jun 14, 2016

Cellule commented Jun 22, 2016

ghost commented Jun 23, 2016

ghost commented Jun 23, 2016

rossberg commented Jun 23, 2016 •

edited

Loading

Cellule commented Jun 23, 2016 •

edited

Loading

ghost commented Jun 23, 2016 •

edited by ghost

Loading

rossberg commented Jun 24, 2016

naturaltransformation commented Jun 24, 2016

naturaltransformation commented Jun 24, 2016

jfbastien commented Oct 8, 2016

distransient commented Oct 10, 2016

MikeHolman commented Oct 10, 2016

rossberg commented Oct 11, 2016

rossberg commented Feb 7, 2017

Type checking and unreachable blocks #707

Type checking and unreachable blocks #707

Comments

MikeHolman commented Jun 10, 2016

binji commented Jun 10, 2016

titzer commented Jun 13, 2016

rossberg commented Jun 13, 2016

binji commented Jun 13, 2016

MikeHolman commented Jun 13, 2016

titzer commented Jun 14, 2016

Cellule commented Jun 22, 2016

ghost commented Jun 23, 2016

ghost commented Jun 23, 2016

rossberg commented Jun 23, 2016 • edited Loading

Cellule commented Jun 23, 2016 • edited Loading

ghost commented Jun 23, 2016 • edited by ghost Loading

rossberg commented Jun 24, 2016

naturaltransformation commented Jun 24, 2016

naturaltransformation commented Jun 24, 2016

jfbastien commented Oct 8, 2016

distransient commented Oct 10, 2016

MikeHolman commented Oct 10, 2016

rossberg commented Oct 11, 2016

rossberg commented Feb 7, 2017

rossberg commented Jun 23, 2016 •

edited

Loading

Cellule commented Jun 23, 2016 •

edited

Loading

ghost commented Jun 23, 2016 •

edited by ghost

Loading