-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
System.Linq Orderby performance optimization #46414
Conversation
Tagging subscribers to this area: @eiriktsarpalis Issue DetailsWhen performing OrderBy an int[] map is used to track the order of the items in the buffer. In the cases when there isn't a secondary sort however the items in the buffer can be sorted in place using ArraySortHelper which takes a Span of keys and a separate Span of items. In GetEnumerator and ToList this removes the need for the int[] map allocation and in ToArray it reuses the buffer array instead of allocating a new one.
|
Do you have numbers for the case where there is a secondary sort? e.g. a small array with a trivial primary and secondary sort? In that case we're paying for the virtual dispatch and reflection checks but not getting anything for them. |
@stephentoub
|
@eiriktsarpalis Is anything else needed here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some additional feedback:
- I'd be interested to see benchmark results when the elements are large-ish structs.
- There were build failures but I couldn't see them due to the age of the build. Would it be possible to rebase your changes over the latest
main
?
return false; | ||
} | ||
Type elementType = typeof(TElement); | ||
return elementType.IsValueType || (elementType.IsClass && !elementType.IsArray); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This expression could be optimized away by the JIT if you avoided assigning typeof(TElement)
to an intermediate variable:
@@ -158,6 +173,16 @@ internal override CachingComparer<TElement> GetComparer(CachingComparer<TElement | |||
: new CachingComparerWithChild<TElement, TKey>(_keySelector, _comparer, _descending, childComparer); | |||
return _parent != null ? _parent.GetComparer(cmp) : cmp; | |||
} | |||
|
|||
protected override bool CanInPlaceSort() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be a good place to write a comment detailing the motivations for this optimization.
@@ -16,6 +16,8 @@ internal abstract partial class OrderedEnumerable<TElement> : IOrderedEnumerable | |||
|
|||
private int[] SortedMap(Buffer<TElement> buffer) => GetEnumerableSorter().Sort(buffer._items, buffer._count); | |||
|
|||
private void SortElements(Buffer<TElement> buffer) => GetEnumerableSorter().SortElements(buffer._items, buffer._count); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would use a name that emphasizes that this is an in-place sort, i.e. buffer
is going to get mutated.
@@ -343,6 +376,9 @@ internal override int CompareAnyKeys(int index1, int index2) | |||
protected override void QuickSort(int[] keys, int lo, int hi) => | |||
new Span<int>(keys, lo, hi - lo + 1).Sort(CompareAnyKeys); | |||
|
|||
protected override void QuickSortElements(TElement[] element, int lo, int hi) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, I would call this something like QuickSortInPlace
or something to that effect.
@@ -343,6 +376,9 @@ internal override int CompareAnyKeys(int index1, int index2) | |||
protected override void QuickSort(int[] keys, int lo, int hi) => | |||
new Span<int>(keys, lo, hi - lo + 1).Sort(CompareAnyKeys); | |||
|
|||
protected override void QuickSortElements(TElement[] element, int lo, int hi) => | |||
new Span<TKey>(_keys, lo, hi - lo + 1).Sort(new Span<TElement>(element, lo, hi-lo +1), _comparer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not seem to be taking descending sorts into account.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the catch, I tried using the overload taking Comparison<TKey> and made a function like CompareAnyKeys. This is causing a huge perf degredation, it appears that it is due to the additional method call to the Comparable<TKey> created in Sort and and the call to new new Comparison<TKey>. Since Sort doesn't support descending the only way to maintain this Perf is to restrict this to only when doing ascending which I think is a bad idea. If you don't have any further suggestion I'll abandon this PR.
Method | Job | Toolchain | N | Mean | Error | StdDev | Median | Min | Max | Gen 0 | Gen 1 | Gen 2 | Allocated |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
LongToList | Job-KMOQSK | IComparison<TKey> | 200 | 15,966.5 ns | 736.64 ns | 788.20 ns | 15,986.1 ns | 14,693.0 ns | 17,493.0 ns | 1.1421 | - | - | 5104 B |
LongToList | Job-BJFYTT | upstream | 200 | 15,546.1 ns | 773.05 ns | 859.24 ns | 15,410.0 ns | 14,292.5 ns | 17,462.7 ns | 1.3152 | - | - | 5904 B |
LongToList | Job-BCHDBG | Comparable<TKey> | 200 | 3,087.8 ns | 177.38 ns | 189.79 ns | 3,065.8 ns | 2,846.0 ns | 3,512.2 ns | 1.1559 | - | - | 5016 B |
Thanks for all your efforts to try to improve things here, @johnthcall. Given the data, let's go ahead and close this. |
When performing OrderBy an int[] map is used to track the order of the items in the buffer. In the cases when there isn't a secondary sort however the items in the buffer can be sorted in place using ArraySortHelper which takes a Span of keys and a separate Span of items. In GetEnumerator and ToList this removes the need for the int[] map allocation and in ToArray it reuses the buffer array instead of allocating a new one.
The two open issues are:
https://github.com/dotnet/performance/compare/master...johnthcall:OrderBySingleSort?expand=1