Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IDE0305 recommends collection expression that results in worse IL #70656

Closed
stephentoub opened this issue Nov 1, 2023 · 18 comments · Fixed by #71195
Closed

IDE0305 recommends collection expression that results in worse IL #70656

stephentoub opened this issue Nov 1, 2023 · 18 comments · Fixed by #71195
Assignees
Labels
Area-Compilers Code Gen Quality Room for improvement in the quality of the compiler's generated code Feature - Collection Expressions
Milestone

Comments

@stephentoub
Copy link
Member

stephentoub commented Nov 1, 2023

Version Used:
Version 17.9.0 Preview 1.0 [34231.268.main]

Steps to Reproduce:

static int[] GetArray(List<int> list) => list.ToArray(); 

IDE0305 recommends rewriting this to:

static int[] GetArray(List<int> list) => [.. list];

which would be fine if that still compiled down to a list.ToArray() call, but instead it compiles down to the equivalent of:

int[] array = new int[list.Count];
int num = 0;
foreach (int item in list)
{
    array[num] = item;
    num++;
}

which is significantly less efficient, especially for longer lists where the entire copy would instead be done as a vectorized memcpy invocation.

It's also worse if the list was actually empty: List<T>.ToArray() will return an empty array singleton if the count is 0, but the above code will end up always allocating a new empty array 😦

cc: @cston, @CyrusNajmabadi

@dotnet-issue-labeler dotnet-issue-labeler bot added Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead labels Nov 1, 2023
@CyrusNajmabadi
Copy link
Member

Compiler is allowed to, and should use the best emit here. For bcl types, that should be ToArray. Tagging @RikkiGibson

@RikkiGibson
Copy link
Contributor

RikkiGibson commented Nov 1, 2023

I agree that we should fix by improving emit here.

It sounds like this would be a special case for a collection-expr of a single spread expression? Are there any other possible spread types where we should be looking for a ToArray method? e.g. Span<int> span = ...; int[] arr = [..span];?

Also, suppose we have a similar case ImmutableArray<int> ia = [..list]. Should we use ToArray then AsImmutableArray? Should we look for an AsImmutableArray method?

@RikkiGibson RikkiGibson self-assigned this Nov 1, 2023
@RikkiGibson RikkiGibson added this to the 17.9 milestone Nov 1, 2023
@CyrusNajmabadi
Copy link
Member

We should special case as many reasonable cases as possible. But we can do it as bug fixes imo. Generally, if it's a trivial check (checking a few types and a few elements), then we should do it. Ideally in a way that is easy to maintain and enhance in the future! :-)

@RikkiGibson RikkiGibson added Area-Compilers Code Gen Quality Room for improvement in the quality of the compiler's generated code Feature - Collection Expressions and removed Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead labels Nov 1, 2023
@RikkiGibson
Copy link
Contributor

RikkiGibson commented Nov 15, 2023

I was hoping to take this optimization and generalize it somewhat: instead of just improving the situation for collection-exprs like [..span], improve it for more complex scenarios like [..span1, ..span2, .., spanN] or [..span1, elem1, ..span2, elem2] and so on.

I thought we could do this with 2 changes:

  1. Emit a length check on the input collections, when a collection-expr consists only of spreads. When length is 0 use Array.Empty.
  2. Use CopyTo where applicable to copy the input collection into the appropriate section of the resulting array.

I tried to benchmark to prove out the viability of this. However, I was surprised by the results, especially that the ToArray case doesn't seem to be allocating. I think there might be something wrong with my benchmark. Results and source below:

edit: out of date.
// * Summary *

BenchmarkDotNet v0.13.10, Windows 11 (10.0.22621.2715/22H2/2022Update/SunValley2)
AMD Ryzen 9 3900X, 1 CPU, 24 logical and 12 physical cores
.NET SDK 8.0.100
  [Host]     : .NET 8.0.0 (8.0.23.53103), X64 RyuJIT AVX2
  DefaultJob : .NET 8.0.0 (8.0.23.53103), X64 RyuJIT AVX2
Method N Mean Error StdDev Gen0 Allocated
CopyTo 0 0.2060 ns 0.0151 ns 0.0126 ns - -
ToArray 0 0.0436 ns 0.0198 ns 0.0185 ns - -
CopyTo 10 8.6380 ns 0.2261 ns 0.5956 ns 0.0076 64 B
ToArray 10 0.0505 ns 0.0143 ns 0.0134 ns - -
CopyTo 10000 698.3151 ns 6.7082 ns 15.1415 ns 4.7617 40024 B
ToArray 10000 0.0836 ns 0.0498 ns 0.0466 ns - -
using System;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

namespace MyBenchmarks
{
    [MemoryDiagnoser]
    public class CopyToBenchmarks
    {
        [Params(0, 10, 10000)]
        public int N;
        private readonly int[] data;


        public CopyToBenchmarks()
        {
            data = new int[N];
            var random = new Random();
            for (var i = 0; i < data.Length; i++)
            {
                data[i] = random.Next();
            }
        }

        [Benchmark]
        public int[] CopyTo()
        {
            int[] dest;
            if (N == 0)
            {
                dest = Array.Empty<int>();
            }
            else
            {
                dest = new int[N];
                data.CopyTo(dest.AsSpan());
            }
            return dest;
        }

        [Benchmark]
        public int[] ToArray()
        {
            int[] dest = data.AsSpan().ToArray();
            return dest;
        }
    }

    public class Program
    {
        public static void Main(string[] args)
        {
            var summary = BenchmarkRunner.Run<CopyToBenchmarks>();
        }
    }
}

Any insight on what could be happening here would be appreciated. I also tried comparing JIT ASM for ToArray versus CopyTo and...yeah, I found they looked totally different and didn't know how to meaningfully compare them.

@CyrusNajmabadi
Copy link
Member

@RikkiGibson are you sure the N is set before the constructor runs? If not, you will always have empty arrays.

@RikkiGibson
Copy link
Contributor

That was it :) Including new results and benchmark below.

BenchmarkDotNet v0.13.10, Windows 11 (10.0.22621.2715/22H2/2022Update/SunValley2)
AMD Ryzen 9 3900X, 1 CPU, 24 logical and 12 physical cores
.NET SDK 8.0.100
  [Host]     : .NET 8.0.0 (8.0.23.53103), X64 RyuJIT AVX2
  DefaultJob : .NET 8.0.0 (8.0.23.53103), X64 RyuJIT AVX2
Method N Mean Error StdDev Median Gen0 Allocated
CopyTo 0 0.0367 ns 0.0339 ns 0.0318 ns 0.0207 ns - -
LinearAssignTo 0 0.0192 ns 0.0374 ns 0.0350 ns 0.0000 ns - -
ToArray 0 0.1481 ns 0.0136 ns 0.0127 ns 0.1480 ns - -
CopyTo 10 8.6745 ns 0.2380 ns 0.6906 ns 8.4643 ns 0.0076 64 B
LinearAssignTo 10 11.5731 ns 0.2830 ns 0.8254 ns 11.2553 ns 0.0076 64 B
ToArray 10 7.9667 ns 0.2114 ns 0.5678 ns 7.8651 ns 0.0076 64 B
CopyTo 10000 1,354.3466 ns 26.8067 ns 61.0525 ns 1,332.0464 ns 4.7607 40024 B
LinearAssignTo 10000 5,704.2089 ns 110.8084 ns 144.0822 ns 5,760.8376 ns 4.7607 40024 B
ToArray 10000 1,364.2671 ns 25.2483 ns 52.7025 ns 1,350.2026 ns 4.7607 40024 B
using System;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

namespace MyBenchmarks
{
    [MemoryDiagnoser]
    public class CopyToBenchmarks
    {
        [Params(0, 10, 10000)]
        public int N;
        private int[] data = null!;

        [GlobalSetup]
        public void Setup()
        {
            data = new int[N];
            var random = new Random();
            for (var i = 0; i < data.Length; i++)
            {
                data[i] = random.Next();
            }
        }

        [Benchmark]
        public int[] CopyTo()
        {
            int[] dest;
            if (N == 0)
            {
                dest = Array.Empty<int>();
            }
            else
            {
                dest = new int[N];
                data.CopyTo(dest.AsSpan());
            }
            return dest;
        }

        [Benchmark]
        public int[] LinearAssignTo()
        {
            int[] dest;
            if (N == 0)
            {
                dest = Array.Empty<int>();
            }
            else
            {
                dest = new int[N];
                for (var i = 0; i < N; i++)
                    dest[i] = data[i];
            }
            return dest;
        }

        [Benchmark]
        public int[] ToArray()
        {
            int[] dest = data.AsSpan().ToArray();
            return dest;
        }
    }

    public class Program
    {
        public static void Main(string[] args)
        {
            var summary = BenchmarkRunner.Run<CopyToBenchmarks>();
        }
    }
}

The perf difference seems pretty close between CopyTo and ToArray. If it seems reasonable to everybody I might just focus my implementation efforts on using CopyTo for arrays, spans and lists (using CollectionsMarshal to copy to the span referencing the list's backing array.)

@CyrusNajmabadi
Copy link
Member

The perf difference seems pretty close between CopyTo and ToArray. If it seems reasonable to everybody I might just focus my implementation efforts on using CopyTo for arrays, spans and lists (using CollectionsMarshal to copy to the span referencing the list's backing array.)

That is reasonable to me.

@stephentoub
Copy link
Member Author

stephentoub commented Nov 16, 2023

The perf difference seems pretty close between CopyTo and ToArray. If it seems reasonable to everybody I might just focus my implementation efforts on using CopyTo for arrays, spans and lists (using CollectionsMarshal to copy to the span referencing the list's backing array.)

That is reasonable to me.

From a throughput perspective, yup, the difference is minimal. From a code size perspective, it'd be nice if ToArray could be used for [..list], but that could also be a follow-up improvement. (Although there is one benefit to just always doing your cited approach... it'll use Array.Empty for length 0, which .NET Core's List<T>.ToArray also does, but .NET Framework's doesn't.)

@OJacot-Descombes

This comment was marked as off-topic.

@CyrusNajmabadi
Copy link
Member

It would be nice if converting ToList() (and similar) would have a different IDE issue number

They do. The number for new[] { a, b, c,} is not the same. For thie exact reason :)

@Balkoth

This comment was marked as off-topic.

@CyrusNajmabadi

This comment was marked as off-topic.

@Balkoth

This comment was marked as off-topic.

@Sergio0694
Copy link
Contributor

Just noticed this (sharplab):

public static IList<int> M() => [1, 2, 8, 4];
List<int> list = new List<int>();
CollectionsMarshal.SetCount(list, 4);
Span<int> span = CollectionsMarshal.AsSpan(list);
int num = 0;
span[num] = 1;
num++;
span[num] = 2;
num++;
span[num] = 8;
num++;
span[num] = 4;
num++;
return list;

It seems there's a fair amount of extra IL just to keep track of that index that's incremented every time, which also makes it not a constant (which might be less useful for the JIT). Is there any specific reason why Roslyn is doing this rather than just emitting:

List<int> list = new List<int>();
CollectionsMarshal.SetCount(list, 4);
Span<int> span = CollectionsMarshal.AsSpan(list);
span[0] = 1;
span[1] = 2;
span[2] = 8;
span[3] = 4;
return list;

The latter might even allow the JIT to vectorize some of these assignments, in theory (also the IL seems smaller too?).

@CyrusNajmabadi
Copy link
Member

Is there any specific reason why Roslyn is doing this

Yes. It was the simplest way to do code Gen, and we have not focused on optimizations.

@cston
Copy link
Member

cston commented Dec 8, 2023

Thanks @Sergio0694, yes we should use constant indices for this case. I've created #71183 from your comment.

@MichalPetryka
Copy link

MichalPetryka commented Dec 8, 2023

For big, constant data ReadOnlySpan.CopyTo with an RVA static should probably be used instead too.

@RikkiGibson
Copy link
Contributor

Also wanted to show a comparison of AddRange, CopyTo, and ToList for List<T>, as candidate strategies for a collection-expr containing spreads.

Benchmark source
[MemoryDiagnoser]
public class CopyToListBenchmarks
{
    [Params(0, 10, 10000)]
    public int N;
    private int[] data = null!;

    [GlobalSetup]
    public void Setup()
    {
        data = new int[N];
        var random = new Random();
        for (var i = 0; i < data.Length; i++)
        {
            data[i] = random.Next();
        }
    }

    [Benchmark]
    public List<int> ToList()
    {
        List<int> dest = data.ToList();
        return dest;
    }

    [Benchmark]
    public List<int> AddRange()
    {
        List<int> dest = new(capacity: N);
        dest.AddRange(data);
        return dest;
    }

    [Benchmark]
    public List<int> CopyTo()
    {
        List<int> dest = new();
        CollectionsMarshal.SetCount(dest, N);
        data.CopyTo(CollectionsMarshal.AsSpan(dest));
        return dest;
    }

    [Benchmark]
    public List<int> LinearAdd()
    {
        List<int> dest = new(capacity: N);
        for (var i = 0; i < N; i++)
            dest.Add(data[i]);
        return dest;
    }

    [Benchmark]
    public List<int> LinearSpanAssign()
    {
        List<int> dest = new();
        CollectionsMarshal.SetCount(dest, N);
        var span = CollectionsMarshal.AsSpan(dest);
        for (var i = 0; i < N; i++)
            span[i] = data[i];
        return dest;
    }
}
Method N Mean Error StdDev Median Allocated
ToList 0 18.96 ns 0.354 ns 0.314 ns - 32 B
AddRange 0 14.0333 ns 0.3155 ns 0.2952 ns 13.9249 ns 32 B
CopyTo 0 7.7082 ns 0.2069 ns 0.4795 ns 7.7568 ns 32 B
LinearAdd 0 3.8187 ns 0.1294 ns 0.3733 ns 3.6680 ns 32 B
LinearSpanAssign 0 4.5969 ns 0.1354 ns 0.1330 ns 4.5992 ns 32 B
ToList 10 27.60 ns 0.294 ns 0.229 ns - 96 B
AddRange 10 24.0674 ns 0.5112 ns 0.4782 ns 23.9799 ns 96 B
CopyTo 10 13.6428 ns 0.3174 ns 0.3117 ns 13.6788 ns 96 B
LinearAdd 10 25.1704 ns 0.5362 ns 0.5507 ns 25.1363 ns 96 B
LinearSpanAssign 10 19.8500 ns 0.3951 ns 0.3695 ns 19.6693 ns 96 B
ToList 10000 1,385.12 ns 25.068 ns 22.222 ns - 40056 B
AddRange 10000 1,460.66 ns 28.996 ns 68.913 ns 1,430.7083 ns 40056 B
CopyTo 10000 1,440.81 ns 28.855 ns 72.393 ns 1,401.1523 ns 40056 B
LinearAdd 10000 18,523.55 ns 366.359 ns 990.474 ns 18,228.4912 ns 40056 B
LinearSpanAssign 10000 6,976.51 ns 137.266 ns 121.683 ns 6,930.2616 ns 40056 B

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Compilers Code Gen Quality Room for improvement in the quality of the compiler's generated code Feature - Collection Expressions
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants