Accept and return ReadOnlySpan<byte> instead of IntPtr #16

Closed
tzographos opened this issue Apr 10, 2022 · 3 comments

Comments

@tzographos

Hello, first of all let me say that this is a great library, especially the multithreaded renderer.
Is there anything on the roadmap about exposing an API that works with Span<byte> instead of IntPtr?

The reason is that in most cases, after rendering a file (e.g. a PDF), we have to either change the exported file format or perform some sort of image manipulation with another library (e.g. ImageSharp). It would therefore be beneficial to avoid marshalling the memory back, which means holding twice the amount.

The only way I know to get a Span<byte> out of an IntPtr is to use unsafe code and cast the IntPtr to a void* pointer, but this would make my project require an /unsafe build, which I was really hoping to avoid.

Do you have any ideas?

Thanks,
--Theodore

@arklumpus
Owner

Hi! I'm glad you're finding the library useful!

This sounds like a good idea, I can start working on it when I get back to my computers (likely next week). I will update this issue when it's done.

@arklumpus
Owner

Hi! Version 1.5.0 now has overloads of the render methods (both for the MuPDFDocument and the MuPDFMultiThreadedPageRenderer) that return Spans.

However, you should note that:

  • If you're using ImageSharp, ImageSharp's LoadPixelData method will always create a copy of the pixel data, regardless of how you invoke it (e.g. with a byte[] or with a Span<byte>).

  • In addition to returning a Span<byte>, the new methods also return an IDisposable. This keeps track of the unmanaged memory that the Span points to. You must ensure that the IDisposable lives at least as long as the Span, preferably by disposing it once you have finished working with the image. If the IDisposable becomes unreachable while you still hold the Span, the GC may collect it and run its finaliser at any time; the memory to which the Span points will then be released while you still hold an apparently valid reference to it.

  • Arrays of Spans are not allowed, therefore the MuPDFMultiThreadedPageRenderer.Render method returns a delegate instead. To get the Span<byte> corresponding to the i-th tile, you need to invoke the delegate, passing i as a parameter:

     MuPDFMultiThreadedPageRenderer renderer = ...
     
     RoundedSize targetSize = ...
     Rectangle region = ...
     PixelFormats pixelFormat = ...
     
     MuPDFMultiThreadedPageRenderer.GetSpanItem tiles = renderer.Render(targetSize, region, out IDisposable[] disposables, pixelFormat);
     
     int tileCount = disposables.Length;
     
     for (int i = 0; i < tileCount; i++)
     {
     	// Get the i-th tile.
     	Span<byte> tilePixels = tiles(i);
     	
     	// Do something with tilePixels.
     	
     	// Release the memory where the i-th tile is stored.
     	disposables[i].Dispose();
     }

  • If you use the overload of the MuPDFDocument.Render method that returns a byte[], the pixel data will NOT be marshalled by MuPDFCore. Thus, there should not be much of a performance penalty if you use this overload instead of one with an IntPtr parameter or one that returns a Span<byte>.
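
To illustrate the lifetime rule for the tiles, here is a minimal sketch that copies every tile into a managed array and then disposes all the owners in a finally block, so the unmanaged memory is released even if processing throws. It does not depend on MuPDFCore: the GetTile delegate is a hypothetical stand-in for MuPDFMultiThreadedPageRenderer.GetSpanItem, and the Demo method uses MemoryPool<byte> owners only as a safe substitute for the IDisposable[] returned by Render. Copying is just one possible way to outlive the Span; treat this as a sketch under those assumptions rather than as the library's recommended pattern.

```csharp
using System;
using System.Buffers;

public static class TileCopyDemo
{
    // Hypothetical stand-in for MuPDFMultiThreadedPageRenderer.GetSpanItem.
    public delegate Span<byte> GetTile(int index);

    // Copies each tile to a managed array, then releases every owner.
    public static byte[][] CopyTilesAndRelease(GetTile tiles, IDisposable[] disposables)
    {
        byte[][] copies = new byte[disposables.Length][];
        try
        {
            for (int i = 0; i < disposables.Length; i++)
            {
                // The Span is only read while its owner is still alive.
                copies[i] = tiles(i).ToArray();
            }
        }
        finally
        {
            // Release the backing memory even if copying failed part-way.
            foreach (IDisposable disposable in disposables)
            {
                disposable.Dispose();
            }
        }
        return copies;
    }

    // Demonstration with pooled memory instead of real rendered tiles (assumption).
    public static byte[][] Demo()
    {
        const int tileSize = 4;
        IMemoryOwner<byte>[] owners = new IMemoryOwner<byte>[2];
        for (int i = 0; i < owners.Length; i++)
        {
            // Rent may return more than tileSize bytes, so only the first tileSize are used.
            owners[i] = MemoryPool<byte>.Shared.Rent(tileSize);
            owners[i].Memory.Span.Slice(0, tileSize).Fill((byte)(i + 1));
        }
        return CopyTilesAndRelease(index => owners[index].Memory.Span.Slice(0, tileSize), owners);
    }
}
```

With the real renderer, the delegate and the IDisposable[] would come from the Render call shown above; after CopyTilesAndRelease returns, none of the original Spans may be used again.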

All in all, if you just need to manipulate the image using ImageSharp, it is probably better to use the MuPDFCore overload that returns a byte[]; this way, you don't have to worry about memory management (since this is a "normal" byte array that will be collected by the GC as usual), and ImageSharp is going to copy the data to its own memory storage anyway.

The MuPDFMultiThreadedPageRenderer does not have a Render overload that works with byte[] arrays, but it's relatively straightforward to get an IntPtr from a byte[] without leaving safe code:

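int bufferSize = ...

byte[] buffer = new byte[bufferSize];

// Pin the array so that the GC cannot move it while the pointer is in use.
GCHandle bufferHandle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
try
{
    IntPtr bufferPointer = bufferHandle.AddrOfPinnedObject();

    // Do stuff with bufferPointer.
}
finally
{
    // Always unpin the array, even if an exception was thrown.
    bufferHandle.Free();
}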

As long as you don't wait too long before freeing the GCHandle, the side effects on the GC shouldn't be significant.
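
For reference, here is a self-contained sketch of the same pinning approach that compiles without /unsafe. The FillBuffer method is a hypothetical stand-in for any native-style routine that writes pixels through a raw pointer (it is not a MuPDFCore method); it is only there to show that bytes written through the IntPtr end up in the managed array.

```csharp
using System;
using System.Runtime.InteropServices;

public static class PinnedBufferDemo
{
    // Hypothetical stand-in for a renderer that writes to a raw pointer.
    private static void FillBuffer(IntPtr destination, int length, byte value)
    {
        for (int i = 0; i < length; i++)
        {
            Marshal.WriteByte(destination, i, value);
        }
    }

    public static byte[] RenderIntoManagedArray(int bufferSize)
    {
        byte[] buffer = new byte[bufferSize];

        // Pin the array so that its address stays valid while it is being written to.
        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            FillBuffer(handle.AddrOfPinnedObject(), buffer.Length, 0xFF);
        }
        finally
        {
            // Unpin as soon as possible to keep the impact on the GC small.
            handle.Free();
        }

        // The array now holds the written bytes and is managed by the GC as usual.
        return buffer;
    }
}
```

Keeping the pinned region inside a single method like this makes it harder to forget the Free call or to hold the handle longer than needed.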

In any case, if you need better performance, you may need to look into some other graphics library that lets you work directly with pixel data stored in unmanaged memory (perhaps something like SkiaSharp?)...

@tzographos
Author

This is great! Thanks for the detailed explanation. I will give it a spin right away!
