RFC FS-1148 - Ease conversion between Units of Measure (UoM) and undecorated numerals and simplify casting #784

Open
wants to merge 3 commits into
base: main

Conversation

roboz0r

@roboz0r roboz0r commented Aug 10, 2024

Click “Files changed” → “⋯” → “View file” for the rendered RFC.

@roboz0r roboz0r marked this pull request as ready for review August 10, 2024 23:37
- `ResizeArray`
- `Seq`

Total added methods is 195 `13 supported primitives * 3 methods * (primitive + 4 collection types)`. They are presented as 15 methods with 13 overloads each.
Contributor

This is my personal concern - it's a lot of new surface area and probably a bunch of sigdata, optdata and il.

Author

Incremental size is around 31KB, which translates to roughly 163 bytes per method. I don't really have a sense of whether that is acceptable or not. If the majority of F# users aren't making extensive use of UoM then there's no sense in making them carry around any extra weight.

The part of the feature that I sense would answer the most questions for users is removal of units from primitives. If I can call LanguagePrimitives.FloatWithMeasure, I'd expect the inverse LanguagePrimitives.FloatWithoutMeasure, although the same effect is currently available with the existing primitive conversion methods like float.

Maybe the collections are better off in a FSharp.Collections.Units or as an addition to FSharp.UMX that has an essentially identical API using overloaded methods for primitives extended to non-numeric types?
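
For reference, the asymmetry described above can be sketched with functions that exist in FSharp.Core today. The measure `m` and the value names are made up for illustration and are not part of the RFC.

```fsharp
[<Measure>] type m   // illustrative unit, not part of the RFC

// Adding a unit: a dedicated, discoverable primitive exists today.
let tagged : float<m> = LanguagePrimitives.FloatWithMeasure<m> 1.5

// Removing a unit: there is no FloatWithoutMeasure counterpart;
// the ordinary conversion function `float` drops the measure instead.
let untagged : float = float tagged
```

Both directions compile today; the difference is only in how easy the second one is to find.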

Contributor

@T-Gro I'd be happy to accept this PR for an RFC - could you review and if you approve merge? thanks

Contributor

Maybe the collections are better off in a FSharp.Collections.Units or as an addition to FSharp.UMX that has an essentially identical API using overloaded methods for primitives extended to non-numeric types?

I agree, move collections out of this RFC and keep it tight. The value of the collection functions as a whole is much smaller than that of the primitive functions, even in the limit as the number of collections tends to infinity.

@vzarytovskii
Contributor

In general, currently, I believe that in this form (method per type per action, etc) it should live in a separate library (be part of UMX)? However, I generally think that this particular functionality (add, strip measure) can and should be solved generically (i.e. we should have one type directed generic function stripMeasure<_> and one addMeasure<_> which will do the magic), though it might require some work in compiler.

@roboz0r
Author

roboz0r commented Aug 12, 2024

Based on Don's comments

(2) and (3) [generic UoM conversion] require 'T<'m> as a construct, or a new kind of constraint, or a special type-checking rule. Thus they look complex.

I took "complex" to mean something I wouldn't be capable of as a novice contributor. Do you have a sense of how complex and pervasive such an addition would be?

namespace Microsoft.FSharp.Core


type Units =
Contributor

The purpose of units of measure (including in F#) is to add safety when doing arithmetic and interfacing between systems which may have different conventions. Sometimes measures need to be removed or added when interfacing with systems that do not understand measures. This is inevitable and these are always potential points of failure. To preserve safety here we must have [<RequiresExplicitTypeArguments>] on all of these. It is then explicit in the code what measure is being added or removed. Without this, the wrong measure (e.g. cm instead of m) may get added or removed by mistake, leading to bugs (e.g. numbers wrong by a factor of 100).
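
A minimal sketch of what this would look like for one primitive. `addMeasure` is a hypothetical helper, not the RFC's API; `[<RequiresExplicitTypeArguments>]` and `LanguagePrimitives.FloatWithMeasure` are existing FSharp.Core members.

```fsharp
[<Measure>] type m
[<Measure>] type cm

// Hypothetical helper (not the RFC's API): the measure has to be
// written out at every call site.
[<RequiresExplicitTypeArguments>]
let addMeasure<[<Measure>] 'u> (x: float) : float<'u> =
    LanguagePrimitives.FloatWithMeasure<'u> x

// The unit being attached is visible in the code.
let lengthInMetres = addMeasure<m> 2.0

// Rejected by the compiler because the type argument is missing,
// even though the annotation alone could have been used to infer it:
// let lengthInCm : float<cm> = addMeasure 2.0
```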

@brianrourkeboll
Contributor

brianrourkeboll commented Jan 24, 2025

What about something like the following (constraints omitted for clarity):

[<RequireQualifiedAccess>]
module Measure =
    /// Tags a value with a unit of measure.
    val inline tag : value:'T -> '``T<'Measure>``

    /// Removes a unit of measure from a value.
    val inline untag : value:'``T<'Measure>`` -> 'T

    /// Tags a value with a new unit of measure.
    val inline retag : value:'``T<'Measure1>`` -> '``T<'Measure2>``

    /// Maps a value with a unit of measure to another value with a unit of measure.
    val inline map : mapping:('T -> 'U) -> value:'``T<'Measure1>`` -> '``U<'Measure2>``

// Tag.
let _ : int<m> = Measure.tag 1

// Untag.
let _ : int = Measure.untag 1<m>

// Retag.
let _ : int<foot> = Measure.retag 1<m>

// Map.
let _ : float<m> = 1<m> |> Measure.map float

(Source)

It would be trivial to extend to collections if we wanted (at the expense of more overloads).

It might be possible to plumb through explicit type parameters if we wanted as well, as suggested in #784 (comment) (maybe someone like @gusty would know how feasible that is).

Advantages

  • I think Measure is a more discoverable name than Units, since [<Measure>] is used when defining measure types.
  • I think a module with generic functions is nicer and more F#-y than overloaded methods (even though this approach also uses overloads under the hood and could be subject to similar overload resolution corner-cases).

(Although in general I would agree with @vzarytovskii's comment #784 (comment) that it would be nicer if this could be in the compiler somehow. Maybe I will think about it.)

@roboz0r
Author

roboz0r commented Jan 25, 2025

100% agree with @brianrourkeboll that Measure is a better name for the type/module than Units and a generic function does look more F#-y.

The only downside I see to this approach is on the tooling side. Examining the function, you'll see:
[image]

So you don't know what types support UoM without first providing an incorrect value and looking at the compiler error (which is quite clear to be fair)

No overloads match for method 'Tag'.

Known return type: 'a

Known type parameter: < string >

Available overloads:
 - static member Tag.Tag: x: byte -> byte<'Measure> // Argument 'x' doesn't match
 - static member Tag.Tag: x: decimal -> decimal<'Measure> // Argument 'x' doesn't match
 - static member Tag.Tag: x: float -> float<'Measure> // Argument 'x' doesn't match
 - static member Tag.Tag: x: float32 -> float32<'Measure> // Argument 'x' doesn't match
 - static member Tag.Tag: x: int -> int<'Measure> // Argument 'x' doesn't match
 - static member Tag.Tag: x: int16 -> int16<'Measure> // Argument 'x' doesn't match
 - static member Tag.Tag: x: int64 -> int64<'Measure> // Argument 'x' doesn't match
 - static member Tag.Tag: x: nativeint -> nativeint<'Measure> // Argument 'x' doesn't match
 - static member Tag.Tag: x: sbyte -> sbyte<'Measure> // Argument 'x' doesn't match
 - static member Tag.Tag: x: uint -> uint<'Measure> // Argument 'x' doesn't match
 - static member Tag.Tag: x: uint16 -> uint16<'Measure> // Argument 'x' doesn't match
 - static member Tag.Tag: x: uint64 -> uint64<'Measure> // Argument 'x' doesn't match
 - static member Tag.Tag: x: unativeint -> unativeint<'Measure> // Argument 'x' doesn't match

Apologies for letting this PR slip under my radar. I got busy then forgot to get back to it. I'd be happy to have another shot at the changes or look into making 'T<'Measure> a representable type in the compiler with some guidance.

@brianrourkeboll
Contributor

the compiler error (which is quite clear to be fair)

I updated my gist so that the error message in that scenario looks a bit better (it will now show Measure.Tag, etc.).

But yes, if we could smooth this out a bit in the compiler instead, perhaps similarly to the other functions that rely on "implicitly augmented" members, maybe that would be nicer.

static member inline Remove<[<Measure>]'Measure>(input: float<'Measure>):float = retype input
...
static member inline Cast<[<Measure>]'MeasureIn, [<Measure>]'MeasureOut>(input: byte<'MeasureIn>):byte<'MeasureOut> = retype input
static member inline Cast<[<Measure>]'MeasureIn, [<Measure>]'MeasureOut>(input: float<'MeasureIn>):float<'MeasureOut> = retype input
Contributor

You can use "1" as input or output unit. Doing this lets you divide the surface area by 3.

open FSharp.Data.UnitSystems.SI.UnitSymbols
let convert<[<Measure>] 'a, [<Measure>] 'b> (a: float<'a>) =
    (# "" a: float<'b> #)
1. |> convert<1, m> |> convert<m, kg> |> convert<kg, 1> |> printfn "%g"

https://sharplab.io/#v2:DYLgZgzgNALiBOBXAdlAJiA1AHwPYAcBTZAAgDEBlACwEN58A6AERphoYFVkBLGCgTwgxCAWwgMKASU48+/EQCNcwCAFgAUMEIwSAY1zIAboXgwAPAG0zAWUI0IieIQB8AXRIByGlBJXb9xxd3DwVnEgAKGhASMGBcVjMvZwBKEgBeDRIsiIBiEgAifJIomLiEkLCc5I0ARgYSbDD9IxNzGp8RMMa9A2NTMxEfAGsAcy6m3tazUZ8a8ZJ8eG5kGDBSfIBSEfygA=
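
The same "use 1 as the input or output unit" idea can be sketched without the inline IL form `(# ... #)`, which, as far as I know, is intended for FSharp.Core itself and draws a deprecation warning in user code. The version below is a hypothetical stand-in built on the existing `LanguagePrimitives.FloatWithMeasure`, with locally defined `m` and `kg` so that it is self-contained.

```fsharp
[<Measure>] type m
[<Measure>] type kg

// Hypothetical stand-in for `convert` above, written without inline IL:
// strip the incoming measure with `float`, then attach the requested one.
let convert<[<Measure>] 'a, [<Measure>] 'b> (x: float<'a>) : float<'b> =
    LanguagePrimitives.FloatWithMeasure<'b> (float x)

// One function covers add (1 -> m), cast (m -> kg) and remove (kg -> 1).
let roundTrip : float =
    1.0
    |> convert<1, m>
    |> convert<m, kg>
    |> convert<kg, 1>
```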

@T-Gro
Contributor

T-Gro commented Feb 4, 2025

Listing a few resolved+open questions I can see around:

  • Agree on module naming, Units vs Measure => Measure
  • Agree on static members vs. inlined functions w/ SRTPs underneath => Module with inlined functions.
  • Agree on the scope - scalar tag,untag,retag,map
  • Scope for collections - zero cost remapping of collections - YES/NO ?
  • Make expected size increase clear. How would it stand with @brianrourkeboll proposal if collections were involved as well?

If the last two points are resolved, I think a library approach might eventually fly - that is, if the number of added members can be kept at a reasonable level.
Solving this via a new compiler type-system feature would require bigger changes, and after all, this feature might be its sole beneficiary.

@brianrourkeboll
Contributor

brianrourkeboll commented Feb 4, 2025

How would it stand with @brianrourkeboll proposal if collections were involved as well?

I think supporting collections using that technique would require adding methods in the amount of the number of supported measurable types × the number of supported collections × 2 (tag + untag). Example.

(Side note: those could also go into separate types per collection, and then collection-specific functions could be added to the respective collection modules, e.g., Array.tag, etc.)

@T-Gro
Contributor

T-Gro commented Feb 6, 2025

(Side note: those could also go into separate types per collection, and then collection-specific functions could be added to the respective collection modules, e.g., Array.tag, etc.)

The naming should involve Measure somewhere.
(To me, Array.tag on its own, outside of a module context, does not make it apparent that it is UoM-relevant. The only common language usage of 'tags' is within union types, which could make this confusing.)

Array.addMeasure,
Measure.addToArray.

I am also thinking about a middle ground between compiler vs library implementation: Solve the collection mapping via a new optimizer feature, to make Array/Seq/List.map (Measure.tag<kg/l>) a no-op.

@smoothdeveloper
Contributor

Similar to what was discussed, but somehow dismissed in the random* functions, should we consider additions to FSharp.Core to rather be a separate assembly, and maybe not part of prelude?

Having the infrastructure done seems to be necessary, otherwise we can complain about increased footprint/binary distribution of F#, but there will always be pressure to add to the standard set of features of FSharp.Core.

@T-Gro
Contributor

T-Gro commented Feb 6, 2025

Similar to what was discussed, but somehow dismissed in the random* functions, should we consider additions to FSharp.Core to rather be a separate assembly, and maybe not part of prelude?

Having the infrastructure done seems to be necessary, otherwise we can complain about increased footprint/binary distribution of F#, but there will always be pressure to add to the standard set of features of FSharp.Core.

Any feedback and ideas about it can go to dotnet/fsharp#17496 . It is a relevant and important (especially for longer term evolution of FSharp.Core, also related to supported TFMs and binary size) concept, but orthogonal to individual functional RFCs.
If we see it as helpful, we can keep remarks at library RFCs like this one about its coupling to the compiler and rest of the library.

@vzarytovskii
Contributor

Similar to what was discussed, but somehow dismissed in the random* functions, should we consider additions to FSharp.Core to rather be a separate assembly, and maybe not part of prelude?

Having the infrastructure done seems to be necessary, otherwise we can complain about increased footprint/binary distribution of F#, but there will always be pressure to add to the standard set of features of FSharp.Core.

Fwiw, as I said it before, if we are not willing to make it a generic feature of the compiler, and want to make a bunch of functions instead, I would say it shouldn't be a part of fslib as its application is a bit niche.

@T-Gro
Contributor

T-Gro commented Feb 12, 2025

Fwiw, as I said it before, if we are not willing to make it a generic feature of the compiler, and want to make a bunch of functions instead, I would say it shouldn't be a part of fslib as its application is a bit niche.

The functions for scalar values are small, and on their own they will likely not justify a new type-system-level feature.
It is the collections functions which are making the size 4x bigger (scalar,list,array,seq).

What about adding only the few scalar functions into the library, and solving collections in the optimizer by detecting List.map Measure.tag<..> and doing a no-op retyping within the compiler?

That way, we would have a very small increase of fslib surface (4x smaller than originally proposed) and still get a no-op mechanism to transform collections.

WDYT?

@vzarytovskii , @brianrourkeboll , @roboz0r
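
To make the cost in question concrete, here is a small sketch of what tagging a list looks like with today's library. `Measure.tag` does not exist yet, so the existing `LanguagePrimitives.FloatWithMeasure` stands in for it, and `kg` and the value names are illustrative.

```fsharp
[<Measure>] type kg

let raw = [ 1.0; 2.0; 3.0 ]

// Today this walks the whole list and allocates a new one, even though
// the measure is erased and every element keeps its representation.
let tagged : float<kg> list =
    raw |> List.map (fun x -> LanguagePrimitives.FloatWithMeasure<kg> x)

// The result holds the same numbers but is a different list object.
let sameObject = obj.ReferenceEquals(raw, tagged)
```

The optimizer idea above would, in effect, let `tagged` reuse `raw` instead of building a second list.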

@vzarytovskii
Contributor

This can be the solution. I'm not a huge fan of making fslib functions intrinsics; we should probably outline how that would work in detail, so we're on the same page.

@roboz0r
Author

roboz0r commented Feb 12, 2025

detecting List.map Measure.tag<..> and doing a no-op retyping

What would be the implications for this with Array.map? I think it would always be fine for lists since it's already immutable and UoM are only applied to immutable values but with an array the user may be expecting that map always makes a copy and using both arrays in future code.
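
A small sketch of the behaviour users can rely on today, which a no-op retyping of arrays would change. The names are illustrative and `LanguagePrimitives.FloatWithMeasure` stands in for the proposed `Measure.tag`.

```fsharp
[<Measure>] type kg

let raw = [| 1.0; 2.0 |]

// Array.map returns a fresh array today.
let tagged = raw |> Array.map (fun x -> LanguagePrimitives.FloatWithMeasure<kg> x)

// Mutating the source afterwards does not affect the mapped copy...
raw.[0] <- 10.0

// ...so this still reads the original value. If `tagged` were merely
// `raw` retyped, the write above would show through here.
let first = tagged.[0]
```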

I think @brianrourkeboll's gist looks like roughly the right final API, and I'm happy to change the PR, with or without collections, to match it while adding [<RequiresExplicitTypeArguments>].

[<RequireQualifiedAccess>]
module Measure =
    /// Tags a value with a unit of measure.
    let inline tag (value : 'T) : '``T<'Measure>`` = ...

    /// Removes a unit of measure from a value.
    let inline untag (value : '``T<'Measure>``) : 'T = ...

    /// Tags a value with a new unit of measure.
    let inline retag (value : '``T<'Measure1>``) : '``T<'Measure2>`` = ...

    /// Maps a value with a unit of measure to another value with a unit of measure.
    let inline map ([<InlineIfLambda>] mapping : 'T -> 'U) (value : '``T<'Measure1>``) : '``U<'Measure2>`` = ...

If collections are excluded for now on size grounds, would this API prevent a future addition where 'T<'Measure> becomes a real type and we'd get the collection conversions without adding a function per type?

@T-Gro
Contributor

T-Gro commented Feb 12, 2025

If collections are excluded for now on size grounds, would this API prevent a future addition where 'T<'Measure> becomes a real type and we'd get the collection conversions without adding a function per type?

Not prevent it, for sure.

What would be the implications for this with Array.map? I think it would always be fine for lists since it's already immutable and UoM are only applied to immutable values but with an array the user may be expecting that map always makes a copy and using both arrays in future code.

It could still be optimized when used in pipelines, when we can guarantee that the binding for array before the call is out of scope.

@brianrourkeboll
Contributor

@T-Gro

What about adding only the few scalar functions into the library, and solving collections in the optimizer by detecting List.map Measure.tag<..> and doing a no-op retyping within the compiler?

Hmm. I think I'm mostly with @vzarytovskii on this. On the one hand, the compiler does give things from FSharp.Core special treatment all the time. On the other hand, I'm not aware of any existing examples of the compiler optimizing away whole $O(n)$ function calls like List.map.

I would also have lots of questions about that. Would we special-case exactly those functions with exactly that syntactic form of invocation? What about scenarios like these?

xs |> List.map Measure.tag // OK.
xs |> List.map (fun x -> Measure.tag x) // OK?
xs |> List.map (fun x -> let x = x in Measure.tag x) // OK?
xs |> List.map (fun x -> let x = x + 1 in Measure.tag x) // Nope.
xs |> List.map (fun x -> printfn $"{x}"; Measure.tag x) // Nope.

let myTag<[<Measure>] 'Measure> (x : int) : int<'Measure> = LanguagePrimitives.Int32WithMeasure<'Measure> x
xs |> List.map myTag<m> // OK??

let myTag2<[<Measure>] 'Measure> (x : int) : int<'Measure> = printfn $"{x}"; LanguagePrimitives.Int32WithMeasure<'Measure> x
xs |> List.map myTag2<m> // Nope.

@roboz0r

while adding [<RequiresExplicitTypeArguments>]

I don't think you'll be able to do that with the code from my gist as-is. It may be possible, but I suspect that it will be rather difficult to do, since it will require spelling out all of the types, including intermediate types, and their constraints.

Even if it is possible, I suspect it would need to be something like

type Measure =
    static member inline Tag

module Measure =
    [<RequiresExplicitTypeArguments>]
    let inline tag<'T, [<Measure>] 'Measure, '``T<'Measure>`` when (Measure or 'T) : (static member Tag :)> =

And then you'd need to explicitly specify all types at the callsite, not just the measure type:

let x = Measure.tag<int, m, int<m>> 3

The only way you could avoid that while still requiring explicit type arguments would be to use method overloads generic only in the measure type (static member Measure.Tag<[<Measure>] 'Measure> : value:int -> int<'Measure>, etc.) directly, as in your original PR. (Or, I guess, adding a 'T<'Measure> feature to the compiler.)
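
A rough sketch of that overload-based alternative, limited to two primitives. The type name `MeasureOps` is hypothetical (the thread uses `Measure`), and the bodies simply forward to the existing `LanguagePrimitives` functions.

```fsharp
[<Measure>] type m

// Hypothetical type name; one overload per supported primitive,
// each generic only in the measure.
type MeasureOps =
    [<RequiresExplicitTypeArguments>]
    static member Tag<[<Measure>] 'Measure> (value: int) : int<'Measure> =
        LanguagePrimitives.Int32WithMeasure<'Measure> value

    [<RequiresExplicitTypeArguments>]
    static member Tag<[<Measure>] 'Measure> (value: float) : float<'Measure> =
        LanguagePrimitives.FloatWithMeasure<'Measure> value

// Only the measure is spelled out; the overload is picked from the argument.
let count = MeasureOps.Tag<m>(3)
let length = MeasureOps.Tag<m>(3.0)
```

The trade-off is the one discussed earlier in the thread: the call site stays short, but every supported primitive needs its own overload.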

Separately: I have not had a chance to play around with this thought yet, but I wonder whether using static optimization conditionals (like for abs, sign, (..), etc.) would give us any benefit here.

8 participants