Add pixels method to ImageDecoder #861
Streaming image decoding is indeed one of the most desirable missing items. But the proposed interface appears flawed in several ways:

That doesn't mean the interface can't work. But in its current state, I'd rather start with specialized implementations to prove that this can satisfy the usability criteria of performance and memory requirements before adding it as a trait method.
I'd also be interested in better understanding what use cases would benefit from streaming decoding over calling
It seems like generic streaming decoding (and encoding) is a really hard problem because of the different ways different formats store pixels. I would guess the best solution would be some sort of

@HeroicKatora I do agree that my proposal isn't perfect, but it should work as a starting point for a discussion. For the

@fintelia for PNGs, having a pixel iterator would be much more efficient than calling

I will try to create some benchmarks for low-memory decoding and encoding with something that looks like this and see if I can manage to get any satisfactory results.
@fintelia Oh yes, definitely. That reminds me,
@HeroicKatora I have created a simple benchmark here. It tests an iterator like the one described, using the incremental PNG loading from #860. It simply sums up the r, g and b components of each pixel, using either

The streaming iterators are slightly slower, however they use far less memory (0.5 MB vs 680 MB). I think this shows that something along the lines of what I proposed could be used to process some images in a streaming fashion with minimal memory usage.
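For reference, a minimal sketch of what the in-memory side of such a comparison might look like with the `image` crate's public API (the file path is a placeholder, and `to_rgb8` is the current spelling; at the time of this issue the method was `to_rgb()`):

```rust
use image::ImageResult;

// Decode the whole file into memory first, then sum the r, g and b
// components of every pixel. This is the high-memory baseline that a
// streaming pixel iterator would be compared against.
fn sum_components(path: &str) -> ImageResult<u64> {
    let img = image::open(path)?.to_rgb8();
    let sum = img
        .pixels()
        .map(|p| p.0[0] as u64 + p.0[1] as u64 + p.0[2] as u64)
        .sum();
    Ok(sum)
}
```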
@birktj PNG is the best example you could choose for your case here, because decoding is highly non-trivial. I expect all of the timing here to be dominated by the deflate time, so the computation overhead for actually summing pixels vanishes to some degree within that. On the one hand, that could be interpreted as a plus; on the other hand, this is specific to this use case of streamed processing.

The test implementation looks very nice though, and the memory win is definitely intriguing, and as expected. I'd prefer not creating a

Edit: Don't know if I can do it this week, definitely next.
Another approach would be to add a required constructor to the `Pixel` trait that converts from bytes in some color type to that pixel type. Then `DecoderPixels` could avoid allocating boxes, and the user would have an easier time because they wouldn't have to match over possible pixel formats:

```rust
use std::io::Read;
use std::marker::PhantomData;

use image::ColorType;

trait Pixel: Sized {
    fn convert_from_bytes(slice: &[u8], colortype: ColorType) -> Self;
}

struct DecoderPixels<R, P: Pixel> {
    reader: R,
    colortype: ColorType,
    // Scratch buffer sized to hold the bytes of a single pixel.
    buffer: Vec<u8>,
    _pixel: PhantomData<P>,
}

impl<R: Read, P: Pixel> Iterator for DecoderPixels<R, P> {
    type Item = P;

    fn next(&mut self) -> Option<Self::Item> {
        // Read exactly one pixel's worth of bytes, then convert it.
        self.reader.read_exact(&mut self.buffer).ok()?;
        Some(P::convert_from_bytes(&self.buffer, self.colortype))
    }
}
```
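As a hedged illustration of how that constructor might be implemented: the `Rgb8` type below is made up for this sketch, and `ColorType::Rgb8` follows the current image-crate naming (older versions spelled it `ColorType::RGB(8)`).

```rust
// A hypothetical 8-bit RGB pixel type, just to show the shape of the impl.
struct Rgb8([u8; 3]);

impl Pixel for Rgb8 {
    fn convert_from_bytes(slice: &[u8], colortype: ColorType) -> Self {
        // A real implementation would dispatch on `colortype`; this sketch
        // assumes the decoder already produces 8-bit RGB data.
        assert_eq!(colortype, ColorType::Rgb8);
        Rgb8([slice[0], slice[1], slice[2]])
    }
}
```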
I have been looking a bit into incremental image processing and I think maybe some sort of

I guess this is sort of what you are thinking of when you say "combine view with decoder". I envision some sort of

```rust
trait ChunkedImage {
    type Pixel: Pixel;
    fn width(&self) -> u32;
    fn height(&self) -> u32;
    fn map_chunks<F: Fn(View) -> Image>(self, f: F) -> MapChunkedImage<Self, F>;
    fn pixels() ...
    fn resize() ...
    ...
}
```
In my case, I'm interested in converting images on the fly into a different format (for a private project I'm working on; I can't elaborate). Currently my only option is to decode the image into memory and then process it line by line, dropping the image when I'm done. This not only uses a lot of unnecessary memory, but is also very CPU-cache unfriendly, and so is slower than it could be. I could load the image in sub-rectangles as you describe, but that doesn't sound like it would improve performance any over loading once and processing, though I may try it.

One other use I can think of that would not benefit from the sub-image idea is speeding up thumbnail generation: scaling the image as it is decoded should be a performance and memory-use win.

I do think providing the decoded image data in chunks as @birktj suggests (in concept anyway) is the way to go for performance reasons. This is how some C++ code I created in the past operated, and it worked well. This is all optimization anyway, so if some formats (e.g. planar types) do not support incremental decoding, I'd rather just fall back to the line-by-line method than not have this functionality at all.
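For concreteness, a rough sketch of that current full-decode-then-walk-rows approach, assuming a reasonably recent `image` version; the path and the "conversion" step are placeholders:

```rust
use std::io::Write;

fn convert_on_the_fly(path: &str, mut out: impl Write) -> image::ImageResult<()> {
    // Decode everything up front; the whole image sits in memory from here on.
    let img = image::open(path)?.to_rgb8();
    // Then walk it row by row and re-encode into the target format.
    for row in img.rows() {
        for pixel in row {
            out.write_all(&pixel.0)?; // stand-in for the real conversion step
        }
    }
    Ok(())
}
```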
@ambihelical It sounds to me like your use case is far more specialized than the generic interface that

This boils down to: do the basic work first, and integrate the functionality to offer this amount of control into as many format-specific crates as possible. You could use the specialized versions first, which would provide feedback on the chosen interface, and in the worst case you'd have the wrapper yourself. The first crate here is probably

It's now up to determining the common denominator for other formats as well and building an abstraction. My biggest potential doubts with the current version, and the largest difference compared to
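As an example of the kind of format-specific streaming that already exists, the standalone `png` crate can hand back one decoded row at a time. A minimal sketch, assuming a 0.17-era `png` API and a placeholder file path:

```rust
use std::fs::File;

fn sum_rows(path: &str) -> Result<u64, Box<dyn std::error::Error>> {
    let decoder = png::Decoder::new(File::open(path)?);
    let mut reader = decoder.read_info()?;

    let mut sum = 0u64;
    // Rows are decoded on demand, so only one row of the image is
    // resident at any time.
    while let Some(row) = reader.next_row()? {
        sum += row.data().iter().map(|&b| b as u64).sum::<u64>();
    }
    Ok(sum)
}
```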
When working with the `ImageDecoder` trait you only get a raw byte buffer or a raw byte stream. If you want to work with pixels you either have to use `ImageBuffer` and friends or manually convert the byte buffer/stream to pixels. However, sometimes the image may not fit in memory but you would still like to work with pixels. For this use case I propose adding a `pixels` method to the `ImageDecoder` trait; it may look something like this:
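(The original snippet did not survive here. Purely as a hypothetical illustration of the shape being discussed, building on the `DecoderPixels` iterator sketched earlier in the thread, it might look roughly like this; the trait name is made up:)

```rust
// Hypothetical sketch only, not the original proposal from this issue:
// a `pixels` method that hands back a lazy iterator over decoded pixels
// instead of one fully decoded byte buffer.
trait ImageDecoderPixels<R: Read> {
    fn pixels<P: Pixel>(self) -> DecoderPixels<R, P>;
}
```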