Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updating some readonly static data in JpegEncoderCore to take advantage of compiler functionality. #855

Merged
merged 1 commit into from
Mar 26, 2019

Conversation

tannergooding
Copy link
Contributor

Prerequisites

  • I have written a descriptive pull-request title
  • I have verified that there are no overlapping pull-requests open
  • I have verified that I am following matches the existing coding patterns and practice as demonstrated in the repository. These follow strict Stylecop rules 👮.
  • I have provided test coverage for my change (where applicable)

Description

This is a partial fix of #854 and shows how to take advantage of the underlying compiler functionality.

@CLAassistant
Copy link

CLAassistant commented Mar 21, 2019

CLA assistant check
All committers have signed the CLA.

@tannergooding
Copy link
Contributor Author

BenchmarkDotNet=v0.11.3, OS=Windows 10.0.17763.379 (1809/October2018Update/Redstone5)
Intel Core i7-7700 CPU 3.60GHz (Kaby Lake), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=3.0.100-preview4-010901
  [Host]     : .NET Core 2.1.9 (CoreCLR 4.6.27414.06, CoreFX 4.6.27415.01), 64bit RyuJIT
  DefaultJob : .NET Core 2.1.9 (CoreCLR 4.6.27414.06, CoreFX 4.6.27415.01), 64bit RyuJIT

Before:

Method TestImage Mean Error StdDev Median Ratio RatioSD
'System.Drawing Jpeg' Bmp/Car.bmp 4.817 ms 0.1838 ms 0.5419 ms 4.985 ms 1.00 0.00
'ImageSharp Jpeg' Bmp/Car.bmp 7.079 ms 0.4893 ms 1.4428 ms 6.105 ms 1.51 0.44

After:

Method TestImage Mean Error StdDev Median Ratio RatioSD
'System.Drawing Jpeg' Bmp/Car.bmp 4.857 ms 0.1933 ms 0.5698 ms 5.036 ms 1.00 0.00
'ImageSharp Jpeg' Bmp/Car.bmp 5.716 ms 0.0762 ms 0.0595 ms 5.716 ms 1.29 0.15

@codecov
Copy link

codecov bot commented Mar 21, 2019

Codecov Report

Merging #855 into master will not change coverage.
The diff coverage is 100%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #855   +/-   ##
=======================================
  Coverage   88.89%   88.89%           
=======================================
  Files        1014     1014           
  Lines       44295    44295           
  Branches     3208     3209    +1     
=======================================
  Hits        39376    39376           
  Misses       4198     4198           
  Partials      721      721
Impacted Files Coverage Δ
...c/ImageSharp/Common/Extensions/StreamExtensions.cs 88.88% <ø> (ø) ⬆️
src/ImageSharp/Formats/Jpeg/JpegEncoderCore.cs 94.71% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 47e8f2c...1e2c32d. Read the comment docs.

// This is a port of the CoreFX implementation and is MIT Licensed: https://github.com/dotnet/coreclr/blob/c4dca1072d15bdda64c754ad1ea474b1580fa554/src/System.Private.CoreLib/shared/System/IO/Stream.cs#L770
public static void Write(this Stream stream, ReadOnlySpan<byte> buffer)
{
byte[] sharedBuffer = ArrayPool<byte>.Shared.Rent(buffer.Length);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually prefer using our own memory management primitives like MemoryAllocator.AllocateManagedByteBuffer() instead of ArrayPool.Shared.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is matching a signature and implementation available in .NET Core. ArrayPool<byte>.Shared should generally be allocation-less for something like this (not necessarily for other T) since most of the framework is fairly dependent on it.

I'll run benchmarks on net472 as well though, in order to see what the overhead here is, if any, since it isn't applicable to netcoreapp2.1 (where the above benchmark was run).

Copy link
Member

@antonfirsov antonfirsov Mar 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our "allocator" is typically also an allocation free pool, but we definitely have an overhead because of nested virtual calls, and other infrastructure stuff.
Everything depends on the size of the array. It's not uncommon to have very large (> 1 MB) buffers in image processing, which works better with our allocator/pool in my experience.

However, for those buffers we should avoid invoking this extension method anyways. @JimBobSquarePants we can probably make an exception here, but we need to make sure the purpose is documented, or at least we remember it 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely and I don't recall if benchmark.net isolates each iteration or not (I know it does for separate tests), so it could be that the benchmark is "lying" for full framework and it is hiding the allocation cost (so it might not be representative of real world scenarios).

It would be possible to take a MemoryAllocator as an optional parameter or to not have SosHeaderYCbCr take advantage of this compiler optimization (the single copy on class instantiation isn't terrible, but losing out on permanent pinning is a bit unfortunate) or to only not use the optimization on full framework if there are concerns about this regressing full framework.

IIRC, the ArrayPool<byte>.Shared is optimized for up to 2MB, since we use it for things like the JSON and XML readers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a comment explaining why we use that pool (signature matching) and move on.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Also rebased onto the current HEAD.

byte[] sharedBuffer = ArrayPool<byte>.Shared.Rent(buffer.Length);
try
{
buffer.CopyTo(sharedBuffer);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me it looks like we are loosing all the benefits of the copy-free initialization at this line...

Copy link
Member

@antonfirsov antonfirsov Mar 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the benchmarks show different results ... I wonder why? I guess this method is not a hot path.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case of netcoreapp, the method is actually virtual and various stream types can override it to avoid the copy. I'll double check the numbers on full framework to see if it hurts anything (and if so, will look at what can be done).

@tannergooding
Copy link
Contributor Author

BenchmarkDotNet=v0.11.3, OS=Windows 10.0.17763.379 (1809/October2018Update/Redstone5)
Intel Core i7-7700 CPU 3.60GHz (Kaby Lake), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=3.0.100-preview4-010901
  [Host] : .NET Core 2.1.9 (CoreCLR 4.6.27414.06, CoreFX 4.6.27415.01), 64bit RyuJIT
  Clr    : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.3362.0

Before:

Method TestImage Mean Error StdDev Median Ratio RatioSD Gen 0/1k Op Gen 1/1k Op Gen 2/1k Op Allocated Memory/Op
'System.Drawing Jpeg' Bmp/Car.bmp 4.856 ms 0.1641 ms 0.4838 ms 5.064 ms 1.00 0.00 210.9375 - - 873.48 KB
'ImageSharp Jpeg' Bmp/Car.bmp 6.041 ms 0.0559 ms 0.0467 ms 6.022 ms 1.24 0.15 23.4375 - - 129.19 KB

After:

Method TestImage Mean Error StdDev Median Ratio RatioSD Gen 0/1k Op Gen 1/1k Op Gen 2/1k Op Allocated Memory/Op
'System.Drawing Jpeg' Bmp/Car.bmp 4.909 ms 0.1418 ms 0.4180 ms 5.046 ms 1.00 0.00 210.9375 - - 873.48 KB
'ImageSharp Jpeg' Bmp/Car.bmp 6.071 ms 0.1181 ms 0.1406 ms 5.995 ms 1.28 0.11 23.4375 - - 129.19 KB

Copy link
Member

@JimBobSquarePants JimBobSquarePants left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing to see a change like this can yield such performance benefits. I'll need to learn more tricks like this!

@JimBobSquarePants
Copy link
Member

Merging this in. Thanks @tannergooding

@JimBobSquarePants JimBobSquarePants merged commit 5eb0122 into SixLabors:master Mar 26, 2019
@JimBobSquarePants JimBobSquarePants added this to the 1.0.0-rc1 milestone Mar 26, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants