The 'string' operator produces suboptimal code when the object's type is known. #9153

teo-tsirpanis · 2020-05-08T11:23:49Z

This simple code snippet:

let f() = string 4

produces this IL code that involves a redundant boxing and type checking:

public static string f()
{
    object obj = 4;
    if (obj != null)
    {
        IFormattable formattable = obj as IFormattable;
        if (formattable != null)
        {
            IFormattable formattable2 = formattable;
            return formattable2.ToString(null, CultureInfo.InvariantCulture);
        }
        object obj2 = obj;
        return obj2.ToString();
    }
    return "";
}

The compiler knows that 4 is an integer and implements IFormattable. For value types and sealed classes known to implement it, emitting a direct call to ToString with the invariant culture passed (and the constrained. IL prefix) would be much better. Same with the eligible types that are known to not implement it.

For all other cases (non-sealed reference types that do not implement IFormattable), the existing behavior is acceptable (no boxings) and backwards-compatible. Performance-concerned developers can always directly call ToString().

There might be a problem with types that might stop implementing IFormattable at a later version of the library, but this is a breaking change that should require recompilation and we should not be concerned with such scenarios.

The text was updated successfully, but these errors were encountered:

charlesroddie · 2020-05-10T08:30:52Z

Why not the simpler approach: to compile string x where x has type 'a.

If 'a is IFormattable then compile to (x:>IFormattable).ToString(null, CultureInfo.InvariantCulture).
Otherwise compile to x.ToString().

If 'a:null (NRT doc) then a null check will also be needed to preserve existing behaviour.

teo-tsirpanis · 2020-05-10T08:49:54Z

@charlesroddie it might be possible that a subclass of 'a implements IFormattable while 'a itself does not. That's why I restricted the optimization you said to only value types or sealed classes: it is certain whether they implement it or not. This behavior preserves backwards compatibility, but I have no problem breaking it here. And keep in mind that types like F# records and DUs are automatically sealed. And you can always directly call ToString().

charlesroddie · 2020-05-10T09:14:15Z

If x:'b where b:>'a, and 'b implements IFormattable and 'a does not, then string x should use IFormattable and string(x:>'a) should not. I understand that with all this casting to obj you may be right about the current behaviour. But it seems like a bug to me.

This would not make string a well-behaved function, but it would move it in that direction. (If F# allowed overloading functions and had not in type constraints you could define it as a function in F# with an IFormattable overload and a not IFormattable overload.)

abelbraaksma · 2020-05-10T12:01:28Z

@charlesroddie, your proposal is pretty much what @dsyme and I researched a few months ago, see #7958 (I forgot about creating the pr, this thread reminded me again).

Though the reasons to want to change string are different: my motivation was that presently it cannot always be used in (sometimes very trivial) code. Bottom line in that issue is: get rid of "inline" so that the dreaded "cannot escape its scope" error doesn't hit us anymore.

charlesroddie · 2020-05-10T12:53:10Z

Existing code from #7958 (comment)

let inline anyToString nullStr x = 
    match box x with 
    | null -> nullStr
    | :? System.IFormattable as f -> f.ToString(null,System.Globalization.CultureInfo.InvariantCulture)
    | obj ->  obj.ToString()

let inline string (value: ^T) = 
     anyToString "" value
     // since we have static optimization conditionals for ints below, we need to special-case Enums.
     // This way we'll print their symbolic value, as opposed to their integral one (Eg., "A", rather than "1")
     when ^T struct = anyToString "" value
     ...
     when ^T : int32      = (# "" value : int32      #).ToString("g",CultureInfo.InvariantCulture)
     ...

I don't understand the compiler-specific syntax. But the intention seems to be to use the code from lowest down case as the most specific case. So string 4 should give 4.ToString("g",CultureInfo.InvariantCulture) But this issue implies that anyToString is used instead.

Your suggestion @abelbraaksma removes the static attempts completely. That would entrench the problem in this issue, wouldn't it? I think using anyToString at any time is an inefficiency so we should eliminate use of this function if possible. I don't know if something like this is possible:

let inline string (value: ^T) = 
    value.ToString()
    when ^T struct = anyToString "" value
    when ^T : null = match (# "" value : ^T      #) with | null -> "" | NonNull v -> v.ToString()
    when ^T :> IFormattable = (# "" value : IFormattable      #).ToString(CultureInfo.InvariantCulture)

abelbraaksma · 2020-05-10T13:08:37Z

@charlesroddie, the compiler-specific syntax is used to create specific compile paths for certain statically known types, and emit different IL for those. It's a bit more powerful than plain SRTP (and personally, I'd love it to be available to the general public, but that's another issue).

The special-casing is, however, redundant. Each path creates a cast to System.IFormattable (not visible here, that's part of the "g" modifier for ToString). Ergo, the boxing is inevitable, since on the "unknown" path we already specifically check for System.IFormattable anyway, and ToString does it for the known paths.

The proposed simpler code removes this redundant compile time checking, and removes inline, leading to a function string that can be used in all circumstances, without loss of performance (though it should be noted that ToString is very slow, and losing one boxing or cast wouldn't really make a measurable difference).

abelbraaksma · 2020-05-10T13:11:24Z

Quoting myself (#7958 (comment)):

The overload ToString(IFormatProvider), i.e. without the "g", calls Number.FormatDouble(...) with null internally for the format string. This then defaults to "g".

abelbraaksma · 2020-05-10T13:21:15Z

I think using anyToString at any time is an inefficiency so we should eliminate use of this function if possible. I don't know if something like this is possible:

@charlesroddie I don't think it's inefficient, the call should be eliminated by the JIT, and certainly in my proposed code.

Your suggested code makes sense (if the syntax allows it), but it doesn't eliminate any boxing, and it still requires inline. Also, the statically unknown type test should probably be added, including a runtime test for System.IFormattable. Ultimately, all these checks are redundant, and simpler, and without inline is better.

The proposed simplified code would become (#7958 (comment)):

let anyToString nullStr x = 
    match box x with 
    | null -> nullStr
    | :? System.IFormattable as f -> f.ToString(null,System.Globalization.CultureInfo.InvariantCulture)
    | obj ->  obj.ToString()

let string value = 
    anyToString "" value

abelbraaksma · 2020-06-23T20:02:26Z

I've researched this a bit further for an up and coming PR. It turns out that the original code contained "dead code", anything after the struct test is never reached. These are static type tests done during compilation, but since when ^T struct = anyToString "" value comes first, the special paths for int etc are redundant.

This means that the current code is already equal to let string x = anyToString "" value after compiling, which, since it is inlined, is just the same as anyToString.

This made we wonder about the double null-check as originally reported here by @teo-tsirpanis. It turns out that writing this:

match box x with
| null -> do something
| :? ISomething as foo -> foo.CallInterface()

already leads to the double null-check:

The first | null -> ... is compiled into if x != null then...
Then the type-test is done
Then another null-check is done, a type-test is like as in C#
Then the call on the interface is done

Since this means that you don't need to check for null when you do a match-with-type-test, I wondered if we could see anything if we put the null check at the bottom instead, like this:

let inline anyToString nullStr x = 
    match box x with 
    | :? System.IFormattable as f -> f.ToString(null,System.Globalization.CultureInfo.InvariantCulture)
    | null -> nullStr
    | obj ->  obj.ToString()

let inline string value = 
    anyToString "" value

This leads to the following decompiled C# code:

public string newString1()
{
	object obj = MixString.get(1);   // this is from my test-setup
	object obj2 = obj;
	IFormattable formattable = obj2 as IFormattable;
	if (formattable != null)   // only one null-test now
	{
		IFormattable formattable2 = formattable;
		return formattable2.ToString(null, CultureInfo.InvariantCulture);
	}
	if (obj2 != null)   // and another one if it is not IFormattable
	{
		object obj3 = obj2;
		return obj3.ToString();
	}
	return "";
}

I was wondering if this would lead to any measurable performance gain or loss. I set the max-relative-error extra tight, but still there's no measurable difference, the timings are all inside the error range, and different runs give different results.

First run:

Second run:

Since the inlining of string leads to it not being usable inside virtual methods with generics, and since the timings show now significant downside of removing inline here, I propose to make it a normal generic function.

Example of what's currently impossible:

type Du<'T> = 
    | Foo of 'T 
    override this.ToString() =   // FS0670 error shown here
        match this with
        | Foo x -> string x    // caused by this

teo-tsirpanis · 2020-06-23T21:15:36Z

Since the inlining of string leads to it not being usable inside virtual methods with generics, and since the timings show now significant downside of removing inline here, I propose to make it a normal generic function.

@abelbraaksma I suppose you are talking about the string operator on generics (maybe even those not constrained to implement IFormattable) and plain objects. In other cases where the object's type is known, I believe we should directly call the appropriate overload of ToString depending on whether the type implements IFormattable. There is no reason to insert an FSharp.Core function call when it's not needed. And the thing we definitely must avoid is boxing a value type in a core library function like string.

Other than that, you proposal seems good. Still I haven't understood what is the difference between improved and noinline. Could you show us the three functions for comparison? And did I correctly understand that we are going to use improved?

abelbraaksma · 2020-06-23T22:41:22Z

In other cases where the object's type is known, I believe we should directly call the appropriate overload of ToString depending on whether the type implements IFormattable.

It's a static type param in the original, so the object is always known. And the code to do what you suggest was there, but was dead voice, never executed. However, it blocks several important use cases by raising FS0670 in generic contexts.

With the current state of the compiler, it is not possible to have both.

And the thing we definitely must avoid is boxing a value type in a core library function like string.

I originally tried working on a solution that avoids this, but there's currently no way. And boxing is not a problem here, as calling the ToString function on any type is already a virtual call anyway. While the JIT has knowledge of this and will use 'call' semantics when it can, the test for IFormattable, or the cast, kind of changes that, since it's not generic, so it's going to be virtual again.

Also, ToString on any build in value type is very expensive (even after the optimizations in Core 3.0), and lead to hundreds of muops, that make the few extra muops for boxing dwarf.

If we add all these things together, and check the disassembly, we see there's no measurable difference. The timings support that too.

Since usually functionality trumps performance, and, apart from a slighter larger disassembly and (sometimes) larger IL, there isn't a performance drop, I would go for the non inlined version, so that it can be used in all contexts.

It's possible that at some point we'll introduce special casing for inlined generics that don't expose them (escape scope), (the same problem exists for piping, for instance with refs), and then we could reconsider this.

teo-tsirpanis · 2020-06-23T23:20:31Z

I see. The behavior I describe apparently needs compiler support for intrinsics (i.e. statically determining at the compiler's side which code to emit) which I assume does not exist (at least on the string operator).

It might be quite a big and complex feature but we must add it one day if we want to be serious about F#'s performance. Converting an object to a string is not a big deal itself, but the compiler could (and should) aggressively optimize code (an example that came to my mind is completely inlining the iter functions to for loops, avoiding closure allocations).

abelbraaksma · 2020-06-23T23:46:06Z

The behavior I describe apparently needs compiler support for intrinsics (i.e. statically determining at the compiler's side which code to emit) which I assume does not exist (at least on the string operator).

Yes and no. Yes, because using the when ^T: type syntax, which is available only in the compiler and FSharp.Core, in combination with the (# ...#) syntax, you do exactly what you describe.
No, because we can only use that syntax with static type resolution, that is, with inline functions. And string does not behave as we want while it is inline because of the described limitation with FS0670.

I agree it would be nice to have both. Meaning, like in the runtime and the BCL, where it is possible to code an exact intrinsic, regardless whether the function is inlined or not.

Btw, sometimes we have a little of both. For instance, isNull is not written in this syntax, because by not doing that, we can make better use of optimizations. Conversely, the function not was written using this syntax, but, even though it leads to the exact same IL code as if you were using normal syntax (like with function true -> false | false -> true), the optimization phase misses things. As a result, writing not x is fine and leads to a br.false IL code. And isNull x leads to a br.true 0x0. Writing not(not x)) will be erased, as will multiple isNull on the same var. But due to limitations of the optimizer, not(isNull x) leads to bad code, as described in #9433, which @cartermp is now trying to fix.

There will always be a tension field between the best way to code, optimize and jit something. It takes a lot of research for each of these functions. But ultimately, step by step, things will improve (and imo, the optimizations are already quite darn good, from what I've seen in my recent research for the String.xxx optimization PR's).

an example that came to my mind is completely inlining the iter functions to for loops, avoiding closure allocations

This is usually true, for instance String.iter does exactly that. But there are also certainly cases where it doesn't happen.

cartermp added Area-Compiler labels May 9, 2020

This was referenced Jun 20, 2020

Certain tests are never run, they are marked as "Content" in FSharp.Core.UnitTests #9515

Closed

Re enable tests for operators: OperatorsModule1.fs and OperatorsModule2.fs #9516

Merged

teo-tsirpanis mentioned this issue Jun 23, 2020

Redundant local variables sometimes apear #9547

Closed

This was referenced Jun 24, 2020

Optimize 'string', remove boxing where possible, prevent FS0670, simplify generic code (RFC-1089) #9549

Merged

Allow Operators.string to be used in more places (prevent FS0670); optimize generated IL code fsharp/fslang-suggestions#890

Closed

TIHan closed this as completed in #9549 Aug 22, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

The 'string' operator produces suboptimal code when the object's type is known. #9153

The 'string' operator produces suboptimal code when the object's type is known. #9153

teo-tsirpanis commented May 8, 2020

charlesroddie commented May 10, 2020 •

edited

Loading

teo-tsirpanis commented May 10, 2020

charlesroddie commented May 10, 2020 •

edited

Loading

abelbraaksma commented May 10, 2020 •

edited

Loading

charlesroddie commented May 10, 2020

abelbraaksma commented May 10, 2020

abelbraaksma commented May 10, 2020 •

edited

Loading

abelbraaksma commented May 10, 2020 •

edited

Loading

abelbraaksma commented Jun 23, 2020 •

edited

Loading

teo-tsirpanis commented Jun 23, 2020 •

edited

Loading

abelbraaksma commented Jun 23, 2020 •

edited

Loading

teo-tsirpanis commented Jun 23, 2020

abelbraaksma commented Jun 23, 2020

The 'string' operator produces suboptimal code when the object's type is known. #9153

The 'string' operator produces suboptimal code when the object's type is known. #9153

Comments

teo-tsirpanis commented May 8, 2020

charlesroddie commented May 10, 2020 • edited Loading

teo-tsirpanis commented May 10, 2020

charlesroddie commented May 10, 2020 • edited Loading

abelbraaksma commented May 10, 2020 • edited Loading

charlesroddie commented May 10, 2020

abelbraaksma commented May 10, 2020

abelbraaksma commented May 10, 2020 • edited Loading

abelbraaksma commented May 10, 2020 • edited Loading

abelbraaksma commented Jun 23, 2020 • edited Loading

teo-tsirpanis commented Jun 23, 2020 • edited Loading

abelbraaksma commented Jun 23, 2020 • edited Loading

teo-tsirpanis commented Jun 23, 2020

abelbraaksma commented Jun 23, 2020

charlesroddie commented May 10, 2020 •

edited

Loading

charlesroddie commented May 10, 2020 •

edited

Loading

abelbraaksma commented May 10, 2020 •

edited

Loading

abelbraaksma commented May 10, 2020 •

edited

Loading

abelbraaksma commented May 10, 2020 •

edited

Loading

abelbraaksma commented Jun 23, 2020 •

edited

Loading

teo-tsirpanis commented Jun 23, 2020 •

edited

Loading

abelbraaksma commented Jun 23, 2020 •

edited

Loading