Clean up and optimize byte<->float and Rgba32 <-> Vector4 conversion #742
in BulkConvertByteToNormalizedFloat() and BulkConvertNormalizedFloatToByteClampOverflows()
Codecov Report
@@ Coverage Diff @@
## master #742 +/- ##
=========================================
Coverage ? 89.33%
=========================================
Files ? 972
Lines ? 42891
Branches ? 3038
=========================================
Hits ? 38318
Misses ? 3889
Partials ? 684
Codecov Report
@@ Coverage Diff @@
## master #742 +/- ##
=========================================
Coverage ? 89.31%
=========================================
Files ? 973
Lines ? 42979
Branches ? 3047
=========================================
Hits ? 38386
Misses ? 3911
Partials ? 682
I'll freely admit there's some stuff here I don't fully understand, but what I do understand seems sensible. Just a few questions.
}

[MethodImpl(InliningOptions.ShortMethod)]
public static float Clamp(float x, float min, float max) => Math.Min(max, Math.Max(min, x));
You might wanna benchmark this. The IComparableExtensions version should be a good bit faster.
I do, however, want to ditch the extension method in favor of clearer MathUtils.Clamp*** methods.
Good point, will compare!
Do we need the duplication?
The one in ComparableExtensions is faster, I'm removing this!
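For readers following the thread, here is a minimal sketch of the two clamp styles being compared. The class and method names (ClampSketch, ClampMinMax, ClampCompare) are hypothetical and not taken from the library; the second method only approximates what a comparison-based IComparable extension might look like.

```csharp
using System;

public static class ClampSketch
{
    // Variant from the diff hunk: clamp composed from Math.Min / Math.Max.
    public static float ClampMinMax(float x, float min, float max)
        => Math.Min(max, Math.Max(min, x));

    // Hypothetical comparison-based variant in the spirit of a generic
    // IComparable<T> extension method. This is an assumption, not the
    // library's actual code.
    public static T ClampCompare<T>(T value, T min, T max)
        where T : IComparable<T>
    {
        if (value.CompareTo(min) < 0)
        {
            return min;
        }

        if (value.CompareTo(max) > 0)
        {
            return max;
        }

        return value;
    }
}
```

Which variant is faster depends on the runtime and JIT, so it would need to be measured as suggested above rather than assumed.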
/// http://lolengine.net/blog/2011/3/20/understanding-fast-float-integer-conversions
/// http://stackoverflow.com/a/536278
/// </summary>
internal static void BulkConvertByteToNormalizedFloat(ReadOnlySpan<byte> source, Span<float> dest)
Could this perhaps be private so it cannot be called without input sanitization?
These methods are all unit tested separately (so coverage is independent of the current HW configuration), so we need them to be internal.
I can improve the input checking a bit, however.
Ah right... Makes sense then. Carry on!
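As an illustration of what the method under discussion computes, a scalar reference version with basic input checking could look like the sketch below. The names (ByteToFloatSketch, ConvertByteToNormalizedFloat) are hypothetical, and this is not the SIMD implementation from the PR; a scalar loop like this is the kind of thing a hardware-independent unit test could compare against.

```csharp
using System;

public static class ByteToFloatSketch
{
    // Scalar reference: maps each byte in [0, 255] to a float in [0, 1].
    // Hypothetical helper, not the optimized method from the PR.
    public static void ConvertByteToNormalizedFloat(ReadOnlySpan<byte> source, Span<float> dest)
    {
        // Basic input check: the destination must be able to hold every source element.
        if (dest.Length < source.Length)
        {
            throw new ArgumentException("Destination is shorter than source.", nameof(dest));
        }

        for (int i = 0; i < source.Length; i++)
        {
            dest[i] = source[i] / 255f;
        }
    }
}
```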
{
    public static bool IsAvailable { get; } =
#if NETCOREAPP2_1
        // TODO: Also available in .NET 4.7.2, we need to add a build target!
Add one if you need one.
I'm planning to do this in a separate PR, or open an up-for-grabs issue. @iamcarbon is the champion of this stuff! 😄
s *= maxBytes;
s += half;

// I'm not sure if Vector4.Clamp() is properly implemented with intrinsics.
I do recall reading somewhere it wasn't.
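The scale, round and clamp steps from the hunk above can be sketched with System.Numerics.Vector4 as follows. The class and method names are hypothetical, the constants 255 and 0.5 are assumed to correspond to maxBytes and half, and the sketch says nothing about whether Vector4.Clamp is compiled to intrinsics.

```csharp
using System;
using System.Numerics;

public static class FloatToByteSketch
{
    // Converts four normalized floats to bytes, clamping out-of-range input.
    // Hypothetical helper illustrating the steps, not the PR's implementation.
    public static void ConvertNormalizedToBytes(Vector4 v, Span<byte> dest)
    {
        if (dest.Length < 4)
        {
            throw new ArgumentException("Destination needs at least 4 bytes.", nameof(dest));
        }

        Vector4 s = v;
        s *= new Vector4(255f); // assumed value of maxBytes
        s += new Vector4(0.5f); // assumed value of half; makes the truncating cast round
        s = Vector4.Clamp(s, Vector4.Zero, new Vector4(255f));

        dest[0] = (byte)s.X;
        dest[1] = (byte)s.Y;
        dest[2] = (byte)s.Z;
        dest[3] = (byte)s.W;
    }
}
```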
public override string ToString()
{
    return $"[{this.V0},{this.V1},{this.V2},{this.V3},{this.V4},{this.V5},{this.V6},{this.V7}]";
I'm favoring the TypeName(field, field, field) format now for ToString consistency.
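A small sketch of the suggested format, using a hypothetical four-field struct (ByteQuad is made up for illustration and is not a type from the PR):

```csharp
public struct ByteQuad
{
    public byte V0;
    public byte V1;
    public byte V2;
    public byte V3;

    // TypeName(field, field, field) style, as suggested in the review comment.
    public override string ToString()
        => $"{nameof(ByteQuad)}({this.V0}, {this.V1}, {this.V2}, {this.V3})";
}
```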
@@ -6,7 +6,7 @@
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

using SixLabors.ImageSharp.Common.Tuples;
using SixLabors.ImageSharp.Tuples;
Would these come under our Primitives namespace?
Definitely a better place!
@@ -29,10 +29,12 @@ public partial class PixelOperations<TPixel>
/// <param name="count">The number of pixels to convert.</param>
internal virtual void PackFromVector4(ReadOnlySpan<Vector4> sourceVectors, Span<TPixel> destinationColors, int count)
{
    GuardSpans(sourceVectors, nameof(sourceVectors), destinationColors, nameof(destinationColors), count);
    ReadOnlySpan<Vector4> sourceVectors1 = sourceVectors;
How come these are reassigned?
Good catch! Result of undoing a refactor with R#. Will fix it.
    return ImageMaths.ModuloP2(this.value, this.m);
}

// RESULTS:
I need to make sure I add the results like this when benchmarking. Really good idea.
/// <inheritdoc />
internal override void ToVector4(ReadOnlySpan<Rgba32> sourceColors, Span<Vector4> destinationVectors, int count)
{
    Guard.MustBeSizedAtLeast(sourceColors, count, nameof(sourceColors));
    Guard.MustBeSizedAtLeast(destinationVectors, count, nameof(destinationVectors));

    if (count < 256 || !Vector.IsHardwareAccelerated)
Do we no longer need the count check?
It was an optimization for small buffers, but the new logic made it obsolete.
Okay 👍 Just wanted to make sure that you had not missed this.
@@ -40,7 +44,7 @@ public void Cleanup()
    this.source.Dispose();
}

[Benchmark(Baseline = true)]
//[Benchmark]
Should this still be commented?
The benchmark code serves as documentation, but executing it is unnecessary unless someone wants to evaluate that specific method in a future investigation.
I wish there was a better way to skip benchmarks without completely dropping their code.
@@ -38,33 +50,171 @@ public void Cleanup()
    this.destination.Dispose();
}

[Benchmark(Baseline = true)]
//[Benchmark]
Should this still be commented?
@JimBobSquarePants @dlemstra all findings were addressed. Gonna merge this as soon as the compilation is finished so we can go on with #729.
Awesome!
Prerequisites
Description
- Span<byte> -> Span<float> (and opposite) conversion methods in SimdUtils, which would be useful for Epic: ResizeProcessor performance improvements (Memory & CPU) #733.
- Rgba32.PixelOperations to consume these uniformized converters in bulk conversions to/from Vector4
- Span<byte> -> Span<float> (thus Span<Rgba32> -> Span<Vector4>) is 3x faster than the current implementation on main
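To illustrate why a Span<byte> -> Span<float> converter also covers Span<Rgba32> -> Span<Vector4>, here is a minimal sketch that reinterprets both spans with MemoryMarshal.Cast and runs a plain scalar loop. Rgba32Sketch and PixelConversionSketch are hypothetical stand-ins, and the loop is a placeholder for the optimized SimdUtils converter, not a reproduction of it.

```csharp
using System;
using System.Numerics;
using System.Runtime.InteropServices;

// Hypothetical stand-in for a 4-byte RGBA pixel type.
[StructLayout(LayoutKind.Sequential)]
public struct Rgba32Sketch
{
    public byte R;
    public byte G;
    public byte B;
    public byte A;
}

public static class PixelConversionSketch
{
    public static void ToVector4(ReadOnlySpan<Rgba32Sketch> source, Span<Vector4> dest)
    {
        if (dest.Length < source.Length)
        {
            throw new ArgumentException("Destination is shorter than source.", nameof(dest));
        }

        // Each pixel is 4 bytes and each Vector4 is 4 floats, so both spans
        // can be viewed as flat element sequences of equal length.
        ReadOnlySpan<byte> bytes = MemoryMarshal.Cast<Rgba32Sketch, byte>(source);
        Span<float> floats = MemoryMarshal.Cast<Vector4, float>(dest.Slice(0, source.Length));

        // Scalar placeholder for the bulk byte -> normalized float conversion.
        for (int i = 0; i < bytes.Length; i++)
        {
            floats[i] = bytes[i] / 255f;
        }
    }
}
```

The opposite direction would follow the same pattern, with the float -> byte conversion applied to the reinterpreted spans.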
Benchmark results
Detailed benchmark results can be found in comments in benchmark code (ToVector4_Rgba32, PackFromVector4_Rgba32).
Here are the interesting bits:
ToVector4_Rgba32
PackFromVector4_Rgba32