Add .NET Core 2.1 and 3.0 perf improvements #19

brantburnett · 2020-10-17T04:20:51Z

The addition of Span<T> in .NET Core 2.1 can improve performance when
moving through the array in SafeProxy by reducing the number of
arithmetic operations.
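
To illustrate the idea only (this is not the SafeProxy code, which is not reproduced in this thread), here is a minimal byte-at-a-time CRC-32 sketch. The class and method names are hypothetical. The array version recomputes `offset + i` on every access, while the span version is sliced once by the caller and then walked from index zero.

```csharp
using System;

// Hypothetical illustration; not the library's actual implementation.
public static class Crc32Sketch
{
    private static readonly uint[] Table = BuildTable();

    // Standard reflected CRC-32 table (polynomial 0xEDB88320).
    private static uint[] BuildTable()
    {
        var table = new uint[256];
        for (uint i = 0; i < 256; i++)
        {
            uint value = i;
            for (int bit = 0; bit < 8; bit++)
            {
                value = (value & 1) != 0 ? (value >> 1) ^ 0xEDB88320u : value >> 1;
            }
            table[i] = value;
        }
        return table;
    }

    // Array style: each iteration adds the offset to the loop index.
    public static uint AppendArray(uint crc, byte[] input, int offset, int length)
    {
        crc = ~crc;
        for (int i = 0; i < length; i++)
        {
            crc = Table[(crc ^ input[offset + i]) & 0xFF] ^ (crc >> 8);
        }
        return ~crc;
    }

    // Span style: the span already starts at the right place, so no offset math.
    public static uint AppendSpan(uint crc, ReadOnlySpan<byte> input)
    {
        crc = ~crc;
        foreach (byte b in input)
        {
            crc = Table[(crc ^ b) & 0xFF] ^ (crc >> 8);
        }
        return ~crc;
    }
}
```

Both methods should return the same value for the same bytes, for example `Crc32Sketch.AppendSpan(0, data.AsSpan(offset, length))` versus `Crc32Sketch.AppendArray(0, data, offset, length)`.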

.NET Core 3.0 also adds Span<byte>-based overloads to HashAlgorithm,
which can further improve performance if explicitly supported. If they
are not supported, any requests to the Span<byte> overloads are copied
to an array before processing.
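
As a rough sketch of what "explicitly supported" can mean, a HashAlgorithm subclass can override `HashCore(ReadOnlySpan<byte>)` and `TryHashFinal` so that span callers avoid the intermediate array copy. The class name `Crc32HashSketch` and the big-endian output order are assumptions made for this example, not the code from this PR.

```csharp
using System;
using System.Buffers.Binary;
using System.Security.Cryptography;

// Hypothetical example class; not the algorithm class shipped by this library.
public sealed class Crc32HashSketch : HashAlgorithm
{
    private static readonly uint[] Table = BuildTable();
    private uint _crc;

    public Crc32HashSketch()
    {
        HashSizeValue = 32; // hash size in bits
    }

    private static uint[] BuildTable()
    {
        var table = new uint[256];
        for (uint i = 0; i < 256; i++)
        {
            uint value = i;
            for (int bit = 0; bit < 8; bit++)
            {
                value = (value & 1) != 0 ? (value >> 1) ^ 0xEDB88320u : value >> 1;
            }
            table[i] = value;
        }
        return table;
    }

    public override void Initialize() => _crc = 0;

    // Array overload forwards to the span overload, so there is one code path.
    protected override void HashCore(byte[] array, int ibStart, int cbSize)
        => HashCore(new ReadOnlySpan<byte>(array, ibStart, cbSize));

    // Overriding this avoids the base class copying the span into a rented array.
    protected override void HashCore(ReadOnlySpan<byte> source)
    {
        uint crc = ~_crc;
        foreach (byte b in source)
        {
            crc = Table[(crc ^ b) & 0xFF] ^ (crc >> 8);
        }
        _crc = ~crc;
    }

    protected override byte[] HashFinal()
    {
        var result = new byte[4];
        BinaryPrimitives.WriteUInt32BigEndian(result, _crc); // byte order is an assumption
        return result;
    }

    // Writes the result directly into the caller's buffer, with no allocation.
    protected override bool TryHashFinal(Span<byte> destination, out int bytesWritten)
    {
        if (destination.Length < 4)
        {
            bytesWritten = 0;
            return false;
        }
        BinaryPrimitives.WriteUInt32BigEndian(destination, _crc);
        bytesWritten = 4;
        return true;
    }
}
```

A caller would then use `TryComputeHash(source, destination, out int bytesWritten)` with a `ReadOnlySpan<byte>` source and a `Span<byte>` destination of at least 4 bytes.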

A BenchmarkDotNet project was also added to assist with benchmarking.

Below are test results across several target frameworks comparing the
pre- and post-change performance against a 65536-byte array. These
metrics are for calls made via the array overloads, not the Span<byte>
overloads. They show an approximately 25% reduction in runtime on
.NET Core 2.1 and 3.1, and no meaningful change on .NET 4.6.1.

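| Method | Runtime       | Size  |     Mean |    Error |   StdDev | Ratio | Rank |
|------- |-------------- |------ |---------:|---------:|---------:|------:|-----:|
| Array  | .NET 4.6.1    | 65536 | 48.08 us | 0.192 us | 0.170 us |  1.00 |    1 |
| Span   | .NET 4.6.1    | 65536 | 47.87 us | 0.169 us | 0.150 us |  1.00 |    1 |
|        |               |       |          |          |          |       |      |
| Array  | .NET Core 2.1 | 65536 | 48.99 us | 0.260 us | 0.217 us |  1.00 |    2 |
| Span   | .NET Core 2.1 | 65536 | 37.02 us | 0.261 us | 0.218 us |  0.76 |    1 |
|        |               |       |          |          |          |       |      |
| Array  | .NET Core 3.1 | 65536 | 50.01 us | 0.335 us | 0.297 us |  1.00 |    2 |
| Span   | .NET Core 3.1 | 65536 | 37.04 us | 0.218 us | 0.204 us |  0.74 |    1 |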


Skyppid · 2020-11-10T08:58:52Z

Thanks for adding this as it has long been requested in #11. Kinda sad tho' that it's not really maintained anymore, I'd love to just use the nuget package instead of having to compile it myself.

force-net · 2020-11-10T09:12:16Z

Thanks for the PR. I'll do my best to find time to merge it and publish a new package.
I'm alive, but I really have problems with spare time.

Skyppid · 2020-11-10T10:07:23Z

Great to hear, thanks @force-net. No worries, I guess everyone understands that.

brantburnett · 2021-08-11T14:30:10Z

@force-net Have you had a chance to look at this yet?

force-net · 2021-08-11T18:59:08Z

@brantburnett I'm really sorry. I'm trying to take some vacation to fix issues and merge the PR. Lots of work and other stuff.

lugospod · 2021-10-06T22:23:14Z

@force-net Any news regarding this merge?

Also, it would be useful to include ReadOnlyMemory support, so we don't have to allocate memory just to pass this library a byte[].

brantburnett · 2021-10-07T11:30:35Z

> @force-net Any news regarding this merge?
>
> Also, it would be useful to include ReadOnlyMemory support, so we don't have to allocate memory just to pass this library a byte[].

This change adds ReadOnlyMemory support via the use of ReadOnlySpan. ReadOnlyMemory has a Span property.
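
A small sketch of that pattern, assuming some `ReadOnlySpan<byte>`-based entry point exists. The `Checksum` method below is a hypothetical stand-in (a trivial byte sum, not a CRC), used only to show that `ReadOnlyMemory<byte>` can be passed along through its `Span` property without allocating a new byte[].

```csharp
using System;

public static class MemorySketch
{
    // Hypothetical stand-in for any span-based checksum method.
    public static uint Checksum(ReadOnlySpan<byte> data)
    {
        uint sum = 0;
        foreach (byte b in data)
        {
            sum += b;
        }
        return sum;
    }

    // ReadOnlyMemory<byte> callers go through the Span property; no copy is made.
    public static uint ChecksumWhole(ReadOnlyMemory<byte> memory)
        => Checksum(memory.Span);

    // Slicing the memory first is also allocation-free.
    public static uint ChecksumRange(ReadOnlyMemory<byte> memory, int start, int length)
        => Checksum(memory.Slice(start, length).Span);
}
```

For example, with `ReadOnlyMemory<byte> memory = new byte[] { 1, 2, 3, 4, 5 };`, `ChecksumWhole(memory)` sums all five bytes and `ChecksumRange(memory, 1, 3)` sums only the middle three.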

lugospod · 2021-10-07T12:33:30Z

@brantburnett Thanks. I found it last night. I decided to stop waiting for the new release and just extract the code I need, because obviously NuGet loses its benefits if one has to wait so long (not blaming the team, that's just life :))

arnoldsi-vii · 2022-02-26T21:56:09Z

@force-net any updates regarding this merge?

force-net · 2022-02-27T10:30:03Z

Sorry, I still do not have time to review and merge all the changes. But I hope I'll find it.

arnoldsi-vii · 2022-03-09T22:41:43Z

@force-net thank you

neon-sunset · 2022-08-22T16:09:24Z

@brantburnett @arnoldsi-vii it appears there is https://www.nuget.org/packages/System.IO.Hashing/ now.

However, it doesn't seem to use any kind of loop unrolling in its implementation, and I haven't yet tested its performance against this library: https://github.com/dotnet/runtime/blob/main/src/libraries/System.IO.Hashing/src/System/IO/Hashing/Crc32.cs
It does, however, accept ReadOnlySpan<byte>.

brantburnett · 2022-08-22T16:56:32Z

@neon-sunset

Based on my quick review, I agree that implementation looks slower on the calculation side, though the use of ReadOnlySpan may make up for some of that. It's also possible that the modern JIT doesn't need the manual loop unrolling, not sure. It also doesn't use hardware intrinsics for CPU-specific optimizations (my next planned improvement), nor does it have a CRC32C algorithm.

However, it may make sense to try to move these optimizations to that official library rather than trying to get this library maintained again.

Note that there is other work in progress that adds some optimizations in a different spot: dotnet/runtime#61558
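
For context on the intrinsics idea mentioned in this comment, here is a minimal sketch of CRC32C using the x86 SSE4.2 `crc32` instruction via `Sse42.Crc32`, with a table-driven fallback elsewhere. The class name is hypothetical and this is not the planned implementation; a real one would likely process 8 bytes per instruction through `Sse42.X64.Crc32` and handle ARM separately.

```csharp
using System;
using System.Runtime.Intrinsics.X86;

// Hypothetical sketch; requires .NET Core 3.0 or later for the intrinsics API.
public static class Crc32CSketch
{
    private static readonly uint[] Table = BuildTable();

    // Reflected CRC-32C (Castagnoli) table, polynomial 0x82F63B78.
    private static uint[] BuildTable()
    {
        var table = new uint[256];
        for (uint i = 0; i < 256; i++)
        {
            uint value = i;
            for (int bit = 0; bit < 8; bit++)
            {
                value = (value & 1) != 0 ? (value >> 1) ^ 0x82F63B78u : value >> 1;
            }
            table[i] = value;
        }
        return table;
    }

    public static uint Compute(ReadOnlySpan<byte> input)
    {
        uint crc = 0xFFFFFFFFu;
        if (Sse42.IsSupported)
        {
            // Hardware path: one crc32 instruction per byte (kept simple on purpose).
            foreach (byte b in input)
            {
                crc = Sse42.Crc32(crc, b);
            }
        }
        else
        {
            // Software fallback for CPUs without SSE4.2.
            foreach (byte b in input)
            {
                crc = Table[(crc ^ b) & 0xFF] ^ (crc >> 8);
            }
        }
        return ~crc;
    }
}
```

Both paths should agree on every input; the standard CRC-32C check value for the ASCII string "123456789" is 0xE3069283.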

neon-sunset · 2022-08-22T17:09:05Z

@brantburnett
Unfortunately, the JIT does not do any loop unrolling or auto-vectorization as of today. It can do loop cloning for independent operations, but this doesn't seem to be applicable in our case.

As for the mentioned PR, its purpose is a little bit different: .NET 7 introduces fully cross-platform vector operations. What I mean by this is that before .NET 7 it was necessary to reference exact intrinsics like Avx2.DoStuff or AdvSimd.DoStuffButArm, which meant a lot of code duplication. However, it is now possible to write an algorithm on top of Vector<T>/VectorXXX<T> and its extensions once, and it will compile to corresponding efficient codegen that uses the vector instructions supported on the target platform (there are limitations: using Vector256 on arm64 will cause it to fall back to scalar code instead of producing unrolled Vector128 operations).

The reason I mention this is that dotnet/runtime#61558 appears to follow the same approach, ensuring that we can use a single method to compute the checksum for a primitive value, which will either use the available CRC32 intrinsics or fall back to a fast implementation for the specific platform.

However, our use case here is a little bit different so I totally agree with you on the suggestion to upstream improvements for System.IO.Hashing instead.
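
To make the "write once" point above concrete, here is a small sketch that uses the portable `System.Numerics.Vector<T>` type (not the newer Vector128 helper APIs, and not a CRC). The same source is expected to compile to whatever vector width the target platform supports, with a scalar tail loop for the remainder. The class name is hypothetical.

```csharp
using System;
using System.Numerics;

public static class VectorSketch
{
    // Sums integers using platform-sized vectors where available.
    public static int Sum(ReadOnlySpan<int> values)
    {
        int i = 0;
        int sum = 0;
        int width = Vector<int>.Count; // lane count depends on the hardware

        if (Vector.IsHardwareAccelerated && values.Length >= width)
        {
            Vector<int> accumulator = Vector<int>.Zero;
            for (; i <= values.Length - width; i += width)
            {
                accumulator += new Vector<int>(values.Slice(i, width));
            }
            // Horizontal add of the lanes.
            for (int lane = 0; lane < width; lane++)
            {
                sum += accumulator[lane];
            }
        }

        // Scalar tail (or the whole input when vectors are unavailable).
        for (; i < values.Length; i++)
        {
            sum += values[i];
        }
        return sum;
    }
}
```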

brantburnett · 2023-04-23T13:30:44Z

For Crc32 (though unfortunately not yet Crc32C) there is now an even more performant implementation using vectorization and polynomial multiplication that has been merged into System.IO.Hashing.

dotnet/runtime#83321

This will be included in the .NET 8 release of the System.IO.Hashing NuGet package (probably in preview 4). If you use the new package, the vectorization improvements are also backward compatible to .NET 7, and the ARM scalar improvements to .NET 6.

brantburnett force-pushed the span branch from 61f1f4d to 37e88ae · October 18, 2020 14:20

brantburnett mentioned this pull request Oct 19, 2020

Support Crc32C Hardware Intrinsics on .NET Core 3 #16

Open

jasper-d mentioned this pull request Mar 22, 2021

Add .NET Core 2.1 and 3.0 perf improvements jasper-d/Crc32.NET#1

Merged
