-
Notifications
You must be signed in to change notification settings - Fork 692
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Alternative PNG alpha-channel extraction #600
Conversation
All of this is really exciting, hopefully you can get the tests to go green and figure out why we're slow on JRuby, but otherwise I'm very interested in this approach. For the MRI memory variation, this may be due to mark-and-sweep GC running "whenever it gets around to it". So although there may be leak, seeing memory spikes in Ruby are not uncommon whenever there a bunch of short-lived objects that haven't been GCed yet. |
@cheba Which JRuby did you run on? I haven't even run my own optimizations there yet, but I'd like to be consistent. Either 1.7.8 or 1.7.9 would be OK. Earlier than that and we run into JRuby bugs affecting Prawn's font stuff. |
@sandal jruby 1.7.8 (1.9.3p392) 2013-11-14 0ce429e on Java HotSpot(TM) 64-Bit Server VM 1.6.0_65-b14-462-11M4609 [darwin-x86_64] |
JRuby 1.7.8 seem to be broken. JRuby 1.7.9 is much faster. On the other hand, 1.7.9 is inconsistent between runs on memory side. Sometimes it's 230MB and sometimes it's over 400MB. It's hard to get any meaningful improvement (or otherwise) ratios. |
Still, we are seeing less memory usage overall on JRuby 1.7.9 with this patch than without it. I would be OK with requiring JRuby 1.7.9+ since it's a bug fix release anyway. Please work on figuring out the spec failures, and revising test coverage where necessary. This seems more promising than my branch at the moment. |
I think you're still seeing GC related memory fluctuations: we do care about peak utilization but it may be hard to get a clear sense of how much of this is actual memory used vs. garbage not being collected. Anyway, I feel like this definitely is an improvement over what we have now, so revise the tests as needed and let's see if we can try to get this merged before 0.14.0 next month. |
Are you measuring jruby memory usage using activity monitor or similar? I'm no JVM expert, but I seem to recall it can allocate itself a large heap
|
@yob is correct. You can measure JRuby memory usage most accurately using a tool like jvisualvm, which will also allow you to take snapshots to look at individual object allocations. |
@yob Yes, I just watch Activity Monitor. Simply to have some consistent numbers between implementations. Though, it might be useful to have some better numbers to measure the changes. Can anybody share a short HOWTO on how to measure actually used memory on JVM? It certainly looks like what you've described. Memory usage doesn't fluctuate much on JRuby. |
According to PNG spec filtering happens on a per-byte basis and uses corresponding bytes from other pixels. This means that we don't have to actually unfilter all the data. Instead it's possible to split pixel data into planes. A few points to note: * Filters are reused from original image * Resulting opaque PNG uses the same filtered data * Resulting mask PNG image uses the same filteres data * Both resulting images are valid PNG images * All the unfiltering masth is skipped which results in great boosts Here are some benchmarks: RBX ========================================================== Original -------- user system total real opaque: 2.724633 0.541629 3.266262 ( 1.698649) transparent: 340.380027 9.747552 350.127579 (326.535695) Mem: 225-227 MB Modified -------- user system total real opaque: 2.675154 0.526945 3.202099 ( 1.706802) transparent: 77.626282 0.724561 78.350843 ( 72.843238) Mem: 95-97 MB MRI ========================================================= Original -------- user system total real opaque: 1.900000 0.190000 2.090000 ( 2.206105) transparent: 339.740000 6.330000 346.070000 (349.369761) Mem: 110-130 -> 150-180 MB (Leak?) Notes: Mem fluctuates widely. Median steadily rises. Mem leak? Modified -------- user system total real opaque: 1.780000 0.170000 1.950000 ( 1.964678) transparent: 50.380000 0.350000 50.730000 ( 51.129041) Mem: 36-38 MB JRuby ========================================================= Original -------- user system total real opaque: 5.390000 0.270000 5.660000 ( 2.817000) transparent: 564.830000 5.690000 570.520000 (474.866000) Mem: 287.4 -> 298, 293 -> 312.2 MB (Leak?) Modified -------- user system total real opaque: 5.190000 0.210000 5.400000 ( 2.497000) transparent: 3559.290000 10.180000 3569.470000 (3614.186000) Mem: 197.9 -> 211, 185.2 -> 187.2 MB (Leak?) user system total real opaque: 5.440000 0.240000 5.680000 ( 2.985000) transparent: 3553.670000 12.020000 3565.690000 (3668.508000) Mem: 224, 223.1 MB Notes: Mem steadily rises, no fluctuation. Sometimes it drops.
@sandal I consider this pull request to be completed. Please review. |
I merged it! In doing so, I squashed everything down into a single commit to make reverting easier if we experience problems. Two points:
|
|
Yes, we want to be able to have all contributors make commits to each other's feature branches. This makes it so that simple changes can be directly applied, speeding up feedback loops. I think it's still safe to rewrite history as needed as long as you only affect the commits on your branch and not master or stable. Also, we can always rebase before we merge, as I did with this branch. I'm going to merge this into stable now, and it will be included in the 0.13.1 release, whenever that happens. Thanks! |
Got it. Will use topic branches. Thanks for merging. |
According to PNG spec filtering happens on a per-byte basis and uses corresponding bytes from other pixels. This means that we don't have to actually unfilter all the data. Instead it's possible to split pixel data into planes and reuse filters.
A few points to note:
Benchmarks
Benchmark is done by creating a document and adding a PNG image, repeated 100 times.
Mem is recorder by watching Activity Monitor. Values are not exact and provided just to illustrate approximate scale.
RBX
Original
Mem: 225-227 MB
Modified
Mem: 95-97 MB
Change
Execution time: 4.47x faster
Memory: 57% less
MRI
Original
Mem: 110-130 -> 150-180 MB (Leak?)
Notes: Mem fluctuates widely. Median steadily rises. Mem leak?
Modified
Mem: 36-38 MB
Change
Execution time: 6.82x faster
Memory: 62% - 79% less
JRuby 1.7.8
Original
Mem: 287.4 -> 298, 293 -> 312.2 MB (Leak?)
Modified
Mem: 197.9 -> 211, 185.2 -> 187.2 MB (Leak?)
Mem: 224, 223.1 MB
Notes: Mem steadily rises, no fluctuation. Sometimes it drops.
Change
Execution time: 6.25x slower
Memory: 25%—34% less
JRuby 1.7.9
Original
Mem: 298MB
Modified
Mem: 234.1 MB
Change
Execution time: 22.87x faster
Memory: 21% less
PNG-related specs are broken. That is expected because "unfiltered" data is different with this implementation.
Also note that this code is extremely slow on JRuby. I'm looking into it.