Numeric type reform strawperson #33

Open
domenic opened this issue Dec 5, 2014 · 10 comments
Labels
☕☕ difficulty:medium Hard to fix ⌛⌛⌛ duration:long There goes your week-end

Comments

@domenic
Member

domenic commented Dec 5, 2014

Inspired by bug 26901 and @marcoscaceres's #14, I somehow ended up writing a detailed proposal for reforming the numeric types.

WebIDL Numeric Type Reform Strawperson

Motivation

The existing WebIDL numeric types are problematic in a number of ways.

First, they create a misleading parallel with numeric types in other languages, where there is actually a proper numeric type system. This encourages people to use them in situations where they might in other languages, creating non-JavaScript-idiomatic APIs. Most JavaScript APIs will want one of a few things: any numeric value at all; any finite numeric value (possibly restricted to integers); or possibly more complicated validation or clamping logic within a given range. WebIDL does provide some facilities for the latter, but only if your range is based on powers of two; ranges like 0 to 100, -180 to +180, or similar are not supported and require prose.

Second, they have the usual WebIDL problem of using the type system for two different purposes: coercions from JavaScript values, as in the cases of parameter lists and dictionaries, and documentation, as in the case of return values and constants. For the documentation cases, the proliferation of numeric types is simply confusing; declaring a return type as an unsigned long is meaningless given that it will be exposed as a normal JavaScript number (i.e. double-precision floating point value). It would be better to use a generic number type for those purposes, and create types that emphasize the potential coercion or validation strategies for use in those scenarios.

Finally, the spec is bloated with repetitive and spread-out text for performing the different numeric type coercions. The attributes [EnforceRange] and [Clamp] modify the steps so that the types come to have drastically different behaviors when they are applied.

My proposed solution is to try to give tools that are both more aligned with spec-authoring use cases and clearer about what they are accomplishing. The complete functionality of the current types is preserved, and in fact further use cases (such as custom ranges) are enabled.

The proposal consists of removing all existing numeric types, as well as the [Clamp] and [EnforceRange] extended attributes, and replacing them with:

  • A new number type, along with an [EnforceFinite] extended attribute that can apply to it, for documentation cases and for the most generic floating-point number processing.
  • Two new parametrized types, IntEnforceWithin<x, y> and IntClampRound<x, y>, which are expected to be broadly useful for defining integer inputs that must stay within a specific range.
  • Two new parametrized types, IntMod<x, y> and UintMod<x>, which are expected to be less useful and are mainly kept to enable matching legacy semantics.
  • A to-be-determined set of typedefs to make the common cases that appear across specs more convenient.

Removal

We are trying to reform the following parts of WebIDL. So first, picture a universe without:

All existing numeric types

  • byte
  • octet
  • short
  • unsigned short
  • long
  • unsigned long
  • long long
  • unsigned long long
  • float
  • unrestricted float
  • double
  • unrestricted double

The numeric modifier extended attributes

  • [EnforceRange]
  • [Clamp]

Additions

Now let's get that functionality back, but better/faster/stronger.

number type and [EnforceFinite] extended attribute

The number type is the type that corresponds most closely to a JavaScript number. It is to be used:

  • By all places that require no coercion, e.g. return types or constants
  • For parameters, dictionary entries, etc. that do not need to enforce any particular behavior, but simply want to run ToNumber.

The [EnforceFinite] extended attribute is used to disallow NaN, +∞, or −∞.

The algorithm for converting an ES value v to a WebIDL number is:

  1. Let n be ToNumber(v).
  2. If the conversion to an IDL value is being performed in the presence of an [EnforceFinite] extended attribute, then
    1. If n is NaN, +∞, or −∞, throw a RangeError.
  3. Return n.
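
For illustration, the three steps above could be sketched in JavaScript as follows. The function name `toIdlNumber` and the `enforceFinite` option are hypothetical stand-ins for the proposed number type and [EnforceFinite] attribute; unary plus is used because it performs ToNumber.

```javascript
// Hypothetical sketch of the proposed `number` conversion.
// `toIdlNumber` and `enforceFinite` are illustrative names only.
function toIdlNumber(v, { enforceFinite = false } = {}) {
  // Step 1: ToNumber(v). Unary plus applies ToNumber to its operand.
  const n = +v;
  // Step 2: with [EnforceFinite], reject NaN, +Infinity and -Infinity.
  if (enforceFinite && !Number.isFinite(n)) {
    throw new RangeError("value must be a finite number");
  }
  // Step 3: return the number unchanged.
  return n;
}

console.log(toIdlNumber("42"));                          // 42
console.log(toIdlNumber(Infinity));                      // Infinity
console.log(toIdlNumber(1.5, { enforceFinite: true })); // 1.5
```

Calling `toIdlNumber(NaN, { enforceFinite: true })` throws a RangeError, while the same call without the option returns NaN.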

This can express the following old patterns:

double              →  [EnforceFinite] number
unrestricted double →  number

TODO: where are the existing float and unrestricted float types used? How do we express those? What are their semantics anyway?

More-useful parametrized coercion types

Several new parametrized types are introduced specifically for use in places that require coercions (parameter lists, dictionary entries, setters). They are expected to be broadly useful. Unlike the previous types, they are not restricted to a predefined set of upper and lower limits. After all, it seems arbitrary to imagine that [-2147483648, 2147483647] is a more useful range than [0, 100] or [0, 360].

IntEnforceWithin<x, y>

Similar to today's [EnforceRange]. The algorithm for converting an ES value v to a WebIDL IntEnforceWithin<x, y> is:

  1. Let n be ToNumber(v).
  2. If n is NaN, +∞, or −∞, throw a RangeError.
  3. Set n to sign(n) * floor(abs(n)).
  4. If n < x or n > y, throw a RangeError.
  5. Return n.
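
A rough JavaScript sketch of these steps, to make the behavior concrete. The function name `intEnforceWithin` is hypothetical, and normalizing −0 to +0 after truncation is my assumption (the steps use mathematical values, which have no negative zero).

```javascript
// Hypothetical sketch of IntEnforceWithin<x, y>; not a real API.
function intEnforceWithin(v, x, y) {
  // Step 1: ToNumber(v).
  let n = +v;
  // Step 2: reject NaN and the infinities.
  if (!Number.isFinite(n)) {
    throw new RangeError("value must be finite");
  }
  // Step 3: sign(n) * floor(abs(n)), i.e. truncate toward zero.
  // Adding 0 turns -0 into +0 (an assumption, see the lead-in).
  n = Math.trunc(n) + 0;
  // Step 4: enforce the inclusive range [x, y].
  if (n < x || n > y) {
    throw new RangeError(`value must be within [${x}, ${y}]`);
  }
  // Step 5: return the integer.
  return n;
}

console.log(intEnforceWithin(3.9, 0, 100));     // 3
console.log(intEnforceWithin(-3.9, -180, 180)); // -3
```

With a custom range such as `intEnforceWithin(101, 0, 100)`, the call throws a RangeError instead of wrapping or clamping.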

This can express the following old patterns:

[EnforceRange] byte               →  IntEnforceWithin<-128, 127>
[EnforceRange] octet              →  IntEnforceWithin<0, 255>
[EnforceRange] short              →  IntEnforceWithin<-32768, 32767>
[EnforceRange] unsigned short     →  IntEnforceWithin<0, 65535>
[EnforceRange] long               →  IntEnforceWithin<-2147483648, 2147483647>
[EnforceRange] unsigned long      →  IntEnforceWithin<0, 4294967295>
[EnforceRange] long long          →  IntEnforceWithin<-9007199254740991, 9007199254740991>
[EnforceRange] unsigned long long →  IntEnforceWithin<0, 9007199254740991>

IntClampRound<x, y>

Similar to today's [Clamp]. The algorithm for converting an ES value v to a WebIDL IntClampRound<x, y> is:

  1. Let n be ToNumber(v).
  2. Set n to min(max(n, x), y).
  3. Round n to the nearest integer, choosing the even integer if it lies halfway between two, and choosing +0 rather than −0.
  4. Return n.
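
Here is a literal JavaScript transcription of these steps, as a sketch only. The names `intClampRound` and `roundHalfToEven` are hypothetical. Note that the steps as listed do not special-case NaN, so this transcription returns NaN for a NaN input; a real definition would presumably need to decide that case.

```javascript
// Round to the nearest integer, ties to even, never returning -0.
function roundHalfToEven(n) {
  const lower = Math.floor(n);
  const frac = n - lower;
  let r;
  if (frac < 0.5) r = lower;
  else if (frac > 0.5) r = lower + 1;
  else r = lower % 2 === 0 ? lower : lower + 1; // exact tie: pick the even one
  return r + 0; // -0 + 0 is +0
}

// Hypothetical sketch of IntClampRound<x, y>; not a real API.
function intClampRound(v, x, y) {
  // Step 1: ToNumber(v).
  let n = +v;
  // Step 2: clamp into [x, y]; the infinities end up at a bound.
  n = Math.min(Math.max(n, x), y);
  // Steps 3 and 4: round half to even and return.
  return roundHalfToEven(n);
}

console.log(intClampRound(300, 0, 255));      // 255
console.log(intClampRound(2.5, 0, 255));      // 2
console.log(intClampRound(3.5, 0, 255));      // 4
console.log(intClampRound(-Infinity, 0, 100)); // 0
```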

This can express the following old patterns:

[Clamp] byte               →  IntClampRound<-128, 127>
[Clamp] octet              →  IntClampRound<0, 255>
[Clamp] short              →  IntClampRound<-32768, 32767>
[Clamp] unsigned short     →  IntClampRound<0, 65535>
[Clamp] long               →  IntClampRound<-2147483648, 2147483647>
[Clamp] unsigned long      →  IntClampRound<0, 4294967295>
[Clamp] long long          →  IntClampRound<-9007199254740991, 9007199254740991>
[Clamp] unsigned long long →  IntClampRound<0, 9007199254740991>

Less-useful parametrized coercion types

Several parametrized coercion types are introduced simply to be able to maintain old semantics. They should probably not be used.

IntMod<x, y>

Used for emulating today's signed types. The algorithm for converting an ES value v to a WebIDL IntMod<x, y> is:

  1. Let n be ToNumber(v).
  2. If n is NaN, +0, −0, +∞, or −∞, return +0.
  3. Set n to sign(n) * floor(abs(n)).
  4. Set n to n modulo x.
  5. If n is ≥ y, set n to n - x.
  6. Return n.
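
A JavaScript sketch of the IntMod steps, starting from n = ToNumber(v) as the other algorithms do. The name `intMod` is hypothetical. One detail worth spelling out: "modulo" here means the mathematical operation with a non-negative result, whereas the JavaScript `%` operator keeps the sign of the dividend, so the sketch adjusts for that.

```javascript
// Hypothetical sketch of IntMod<x, y>; not a real API.
function intMod(v, x, y) {
  // ToNumber(v).
  let n = +v;
  // NaN, +0, -0 and the infinities all map to +0.
  if (!Number.isFinite(n) || n === 0) return 0;
  // sign(n) * floor(abs(n)): truncate toward zero.
  n = Math.trunc(n);
  // Mathematical modulo: shift the `%` remainder into [0, x).
  n = ((n % x) + x) % x;
  // Values at or above y wrap around into the negative half.
  if (n >= y) n -= x;
  return n;
}

console.log(intMod(200, 256, 128));  // -56
console.log(intMod(-129, 256, 128)); // 127
// For ordinary numbers, IntMod<2^32, 2^31> agrees with `v | 0`.
console.log(intMod(4294967297.5, 2 ** 32, 2 ** 31) === (4294967297.5 | 0)); // true
```

This sketch is only exact while the intermediate values stay within double precision; for a modulus as large as 2^64 the addition inside the modulo step can round.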

This can express the following old patterns:

byte      →  IntMod<256, 128>
short     →  IntMod<65536, 32768>
long      →  IntMod<4294967296, 2147483648>
long long →  IntMod<18446744073709552000, 9223372036854776000>

UintMod<x>

Used for emulating today's unsigned types. The algorithm for converting an ES value v to a WebIDL UintMod<x> is:

  1. Let n be ToNumber(v).
  2. If n is NaN, +0, −0, +∞, or −∞, return +0.
  3. Set n to sign(n) * floor(abs(n)).
  4. Return n modulo x.

This can express the following old patterns:

octet              →  UintMod<256>
unsigned short     →  UintMod<65536>
unsigned long      →  UintMod<4294967296>
unsigned long long →  UintMod<18446744073709552000>

UintMod<360> might also be useful for any functions that want to process degrees.
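
To illustrate both the legacy and the degrees use, here is a JavaScript sketch of the UintMod steps, starting from n = ToNumber(v). The name `uintMod` is hypothetical, and as with any JavaScript transcription the `%` remainder has to be shifted to get a non-negative modulo.

```javascript
// Hypothetical sketch of UintMod<x>; not a real API.
function uintMod(v, x) {
  // ToNumber(v).
  let n = +v;
  // NaN, +0, -0 and the infinities all map to +0.
  if (!Number.isFinite(n) || n === 0) return 0;
  // sign(n) * floor(abs(n)): truncate toward zero.
  n = Math.trunc(n);
  // Mathematical modulo: result is always in [0, x).
  return ((n % x) + x) % x;
}

console.log(uintMod(-1, 256));  // 255
console.log(uintMod(-90, 360)); // 270
console.log(uintMod(450, 360)); // 90
// For ordinary numbers, UintMod<2^32> agrees with `v >>> 0`.
console.log(uintMod(-1, 2 ** 32) === (-1 >>> 0)); // true
```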

Making these more convenient

Typedefs

We should do an audit to find what range coercions are commonly used in web specs, and define typedefs for them. My hope is that e.g. long long is used infrequently and mistakenly, so we wouldn't need a typedef of that sort; the spec could define its own, or use the awkward long name anyway. @bzbarsky says that at a cursory glance byte is unused. Etc.

We should almost certainly define a simple integer typedef. Unsure which is most idiomatic, but it should go up to 2^53 - 1. We would also likely want one for 0–255.
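
As a quick check of why 2^53 - 1 is the natural upper bound for such an integer typedef: it is the largest integer below which every integer is exactly representable as a JavaScript number. The variable name `maxSafe` is just for illustration.

```javascript
// 2^53 - 1 is the largest "safe" integer for a JavaScript number.
const maxSafe = 2 ** 53 - 1;

console.log(maxSafe);                             // 9007199254740991
console.log(maxSafe === Number.MAX_SAFE_INTEGER); // true
// One step past 2^53, adjacent integers are no longer distinguishable.
console.log(2 ** 53 + 1 === 2 ** 53);             // true
console.log(Number.isSafeInteger(2 ** 53));       // false
```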

Names for these typedefs are undecided. They could use the existing names, although I am wary that these names give the mistaken impression of some correlation to a real type system.

Allowing power notation

Although again I hope that people aren't using the random-power-of-two ranges very often in their parameter coercions, if they are, we could allow e.g. UintMod<2^64> or UintMod<2**64> to replace UintMod<18446744073709552000>.
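
One argument for power notation is that the long decimal literals are only the shortest round-trip spellings of the double values, not the exact powers of two. The following sketch shows the difference, using BigInt for the exact values; the variable names are illustrative.

```javascript
// Powers of two as JavaScript numbers (doubles).
const two63 = 2 ** 63;
const two64 = 2 ** 64;

// How they print: shortest decimal strings that round-trip.
console.log(String(two63)); // "9223372036854776000"
console.log(String(two64)); // "18446744073709552000"

// The exact integer values, via BigInt.
console.log((2n ** 63n).toString()); // "9223372036854775808"
console.log((2n ** 64n).toString()); // "18446744073709551616"

// The rounded literal still parses back to exactly the same double.
console.log(18446744073709552000 === two64); // true
```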

Shorter, less-precise names for the more-useful ones?

E.g. instead of IntEnforceWithin<x, y> we could do IntWithin<x, y> or even Int<x, y>. Instead of IntClampRound<x, y> we could do IntClamp<x, y>. Or we could get really cryptic with e.g. Int!<x, y> vs. Int_<x, y>.

@bzbarsky
Collaborator

bzbarsky commented Dec 5, 2014

declaring a return type as an unsigned long is meaningless

Not entirely.

With my browser implementor hat on, annotating a return type with a restricted integer range allows the JIT to optimize based on that information. So I would expect that the IDL browsers actually use would end up with just such annotations no matter what.

A new number type, along with a [EnforceFinite] extended attribute

One important question is whether EnforceFinite should be opt-in or opt-out. Right now it's opt-out. You're proposing making it opt-in, but in practice as far as I can tell the common case is specifications not wanting to deal with non-finite numbers.

My hope is that e.g. long long is used infrequently and mistakenly,

The uses I see in Gecko's IDL in standard stuff are:

  1. Blob.slice, for start and end args. The idea is to allow slicing from the end by passing negative values and to allow blobs larger than 4GB, hence "long long".

  2. File.lastModified. This has to be long long to not run out in a few years and needs to be signed because the Unix epoch is just not that long ago.

  3. In typedef form, GLintptr and GLsizeiptr are all over WebGL; these represent offsets within possibly-large buffers.

  4. GLint64 in WebGL2, for timeout values.

@domenic
Member Author

domenic commented Dec 5, 2014

So I would expect that the IDL browsers actually use would end up with just such annotations no matter what.

Yeah, that's fair, but it does feel like an implementation detail ... e.g. for a self-hosted implementation it wouldn't be useful.

In any case, it seems non-normative.

One important question is whether EnforceFinite should be opt-in or opt-out. Right now it's opt-out. You're proposing making it opt-in

True. I did this basically because I wanted number to be the normal JS number type, not because of any informed thoughts about the common case. Maybe fnumber to make it convenient? finite? Or just abandon the attempt to conform to the JS type directly and let number be finite-enforced?

The uses I see in Gecko's IDL in standard stuff are:

It seems to me all of these could be replaced by some unrestricted integer type. (The one I vaguely allude to with "We should almost certainly define a simple integer typedef. Unsure which is most idiomatic, but it should go up to 2^53 - 1.") That would be a change in behavior, and so likely not compatible (or at least not worth trying to change for so little gain). But I don't see anything in your arguments that implies the correct behavior for these parameters is to mod by 2^64 then potentially subtract 2^63 after flooring.

So indeed for cases like these I'd be happy for the specs to use IntMod<18446744073709552000, 9223372036854776000> or some typedef we define for it, either in WebIDL or in the relevant specs.

@bzbarsky
Collaborator

bzbarsky commented Dec 5, 2014

e.g. for a self-hosted implementation it wouldn't be useful.

It actually would be. If the type information is available statically and can be trusted (this last point is key), then you can optimize better than you can based on dynamic type information.

In any case, it seems non-normative.

This brings us back to what the point of Web IDL is, to some extent. But yes.

It seems to me all of these could be replaced by some unrestricted integer type.

The GL ones might not be able to be, depending, because they get passed down to GL drivers that in fact do deal in machine integers.

But yes, the actual coercion involved to produce a 64-bit integer in these cases is silly right now; they should probably all be [EnforceRange] and doing it with 53-bit binary integers would probably be fine as well.

@heycam heycam closed this as completed in 3698ea6 Feb 3, 2015
@domenic
Member Author

domenic commented Feb 3, 2015

Accidentally closed, I think :)

@heycam heycam reopened this Feb 3, 2015
@heycam
Collaborator

heycam commented Feb 3, 2015

Mis-typed the issue number. :)

@littledan
Collaborator

I like the ideas expressed here.

I'm starting to look into including 64-bit ints in ES. I have an early draft here http://littledan.github.io/int64.html . I'm not sure when, if ever, this will make it into the language, but if it does, WebIDL integration is something I'm interested in figuring out. (However, I don't know of any use cases in web APIs for such big integers.)

I don't think 64-bit ints change anything about this proposal, though. long long is already a Number, and it should probably stay that way.

@domenic
Member Author

domenic commented Jul 14, 2015

(However, I don't know of any use cases in web APIs for such big integers.)

IIRC some web crypto APIs end up representing 64-bit ints as ArrayBuffers, and could have benefited from native 64-bit int support. But, then again, I remember being convinced that ArrayBuffers were a good idea. Maybe the API was more "a sequence of bytes which could also in theory be interpreted as a 64-bit integer" or something.

@annevk
Member

annevk commented Sep 27, 2016

Note that per https://www.w3.org/Bugs/Public/show_bug.cgi?id=28834 and https://lists.w3.org/Archives/Public/public-script-coord/2016JulSep/0037.html [EnforceRange] might have to be a type, otherwise you cannot use it in unions and such.

@domenic
Member Author

domenic commented Feb 27, 2019

I was reminded of this old proposal of mine. I think the OP is overambitious and naive, not to mention very long, both in proposal length and in what it makes Web IDL users type. These days I might go for something like

  • number (today's unrestricted double)
  • fnumber (today's double)
  • int, or maybe int53 (today's long long)
  • uint, or maybe uint53 (today's unsigned long long)
  • Add [Mod=x].
    • Unsure if x should be 16 or 32768. Maybe [Modp] or something for powers of 2.
    • Unsure if we should add int32 as an alias for [Modp=32] int or just tell people to use the ugly-ish extended attribute if they insist on not using int. Maybe int32 is a special-enough case that we want an alias, but we make people write out [Modp=16] int instead of giving them short.
  • Keep [EnforceRange] and [Clamp] functionality, somehow.

@annevk
Member

annevk commented Feb 28, 2019

Note that some specifications will likely typedef whatever we do: https://www.khronos.org/registry/webgl/specs/latest/1.0/#5.1.
