White lines through images on certain PDFs #3351
They seem related, yes. I can reproduce this using Windows 7 x64, Firefox Nightly 30.0a1 (HWA on, 64 bit) and the latest PDF.js development version. I heard from @yurydelendik that it looks better on OS X for some reason, but the main problem seems to be that the math for adding the image chunks together is a bit off.
This is one of the more common issues I experience with this viewer. I can probably add more example PDFs if you need them.
Thank you. We also have #3331, but adding more example PDFs here is always welcome.
Wow, thanks for the good test cases! I haven't checked all of them, but I could reproduce the ones I did check. Regarding the black background: this is the first of those issues that I can actually reproduce on Windows. It might be best to just add that file, with your configuration details (OS name and version, browser name and version, PDF.js version), to the existing issue.
Updated Dropbox links
Here's another example, but it includes a solution. To fix it, I took the original image, resized it, and pasted it into the document that I made the PDF with. I was using MS Word; just changing the size in Word didn't help, but resizing the original image and then pasting it again fixed it. Below is an example of two different images resized: http://www.pdf-archive.com/2015/03/07/test2200-2/test2200.pdf I made the PDF with MS Word and PDFCreator v2.0.2. This problem doesn't happen in Google Chrome, but it happens in Firefox and in the Evince app on Linux.
Has there been any progress on this issue? Could someone point me in the right direction regarding where PDF.js begins to attach each object of the PDF onto the canvas? I assume the issue is caused by incorrectly stitching multiple images together. In the first sample PDF provided, the main image is actually composed of 7 smaller/sliced images.
I think the easiest way to debug this is to use
I have looked into this and I think I have found the problem. I am not 100% sure how drawing works when a width is more precise than whole pixels can represent, but I have an idea.

It appears that when calling `.drawImage` on our canvas object, we do so with the exact height and width of the object that we wish to draw (image, rectangle, etc.). By doing this, without taking the pixel width into account, we leave some pixels only partially colored. When stitching multiple parallel images together, this leaves a visible gap between them. For example: given a pixel width of 0.5 and an image of width 1.02, we color 2 full pixels and leave the 3rd pixel partially transparent. The next image to be stitched then starts partway through that 3rd pixel, so it is never fully covered and shows up as a gap.

Using this sample file (CompWiz_Page1.pdf), which draws many rectangles on the canvas to create a gradient effect, I have come up with the following solution. In `Canvas.constructPath()`, we create the rectangles that we wish to attach to our canvas. To account for the pixel size, I made the following modifications (see comments).

BEFORE, `constructPath`:

```js
function CanvasGraphics_constructPath(ops, args) {
  var ctx = this.ctx;
  var current = this.current;
  var x = current.x, y = current.y;
  for (var i = 0, j = 0, ii = ops.length; i < ii; i++) {
    switch (ops[i] | 0) {
      case OPS.rectangle:
        x = args[j++];
        y = args[j++];
        var width = args[j++];
        var height = args[j++];
        if (width === 0) {
          width = this.getSinglePixelWidth();
        }
        if (height === 0) {
          height = this.getSinglePixelWidth();
        }
        var xw = x + width;
        var yh = y + height;
```

AFTER, `constructPath`:

```js
function CanvasGraphics_constructPath(ops, args) {
  var ctx = this.ctx;
  var current = this.current;
  var x = current.x, y = current.y;
  var origX, origY, pixelSize = this.getSinglePixelWidth();
  for (var i = 0, j = 0, ii = ops.length; i < ii; i++) {
    switch (ops[i] | 0) {
      case OPS.rectangle:
        origX = args[j];
        origY = args[j + 1];
        // Round down to the nearest multiple of our pixel width.
        x = Math.floor(args[j++] / pixelSize) * pixelSize;
        y = Math.floor(args[j++] / pixelSize) * pixelSize;
        var width = args[j++];
        var height = args[j++];
        if (width === 0) {
          width = this.getSinglePixelWidth();
        }
        if (height === 0) {
          height = this.getSinglePixelWidth();
        }
        // Use the non-rounded original x/y values to round up to the
        // nearest multiple of the pixel width.
        var xw = Math.ceil((origX + width) / pixelSize) * pixelSize;
        var yh = Math.ceil((origY + height) / pixelSize) * pixelSize;
```

Doing this ensures that each pixel is fully colored and there is no transparency between stitched images. I was able to fix the line issue that Doc2 (the first example file attached in the OP comment) has with its image using the same logic, by modifying `Canvas.paintJpegXObject()` and `Canvas.paintInlineImageXObject()`. The following lines were added before writing the images to the canvas:

```js
var pixelSize = this.getSinglePixelWidth();
h = Math.ceil(h / pixelSize) * pixelSize;
w = Math.ceil(w / pixelSize) * pixelSize;
```

If this is a viable solution, I think it would need to be implemented before each drawing interaction with our canvas. If that is the case, I'd be happy to work on its implementation.
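The snapping proposed in that comment can be sketched in isolation as a standalone helper (hypothetical name `snapRect`; `pixelSize` stands in for `getSinglePixelWidth()`, the width of one device pixel in user-space units):

```javascript
// Snap a rectangle to the device-pixel grid so it fully covers every
// pixel it touches: floor the origin to a pixel boundary and ceil the
// far edge, so no partially covered column is left between two
// adjacent, stitched rectangles or image tiles.
function snapRect(x, y, width, height, pixelSize) {
  var snappedX = Math.floor(x / pixelSize) * pixelSize;
  var snappedY = Math.floor(y / pixelSize) * pixelSize;
  var xw = Math.ceil((x + width) / pixelSize) * pixelSize;
  var yh = Math.ceil((y + height) / pixelSize) * pixelSize;
  return {
    x: snappedX,
    y: snappedY,
    width: xw - snappedX,
    height: yh - snappedY,
  };
}

// The example from the comment: pixel width 0.5, image width 1.02.
// Unsnapped, the third pixel is only 4% covered; snapped, the rectangle
// ends exactly on the pixel boundary at 1.5.
console.log(snapRect(0, 0, 1.02, 1.02, 0.5));
// → { x: 0, y: 0, width: 1.5, height: 1.5 }
```

Note that this always grows the rectangle, never shrinks it, which is why adjacent tiles end up sharing a boundary pixel instead of leaving it half-transparent.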
Please note that that particular method may be "expensive" (performance wise), so using it everywhere would probably require very careful benchmarking; please refer to e.g. the discussion starting in #4615 (comment) and the results in #4615 (comment).
If you're interested in benchmarking this, you can use our benchmarking tool as described on https://github.com/mozilla/pdf.js/wiki/Benchmarking-your-changes.
Interesting, I just got caught up on the topic; I was unaware that this solution had already been considered. Is it possible to see at what number of calls to .getSinglePixelWidth() we start to see a performance regression? I am not familiar with how #pdfbug=Stepper works, but if PDF.js has access to the load operations and can count the number of calls to Restore/Transform/ShowText (the methods resetting our cache), we can determine the number of times we would need to call .getSinglePixelWidth(). If that value falls under our performance-regression threshold, we can proceed to load the document with accurate pixel measurements. Something like this would allow higher-quality document display in standard cases while avoiding the performance regression seen in extreme cases.
I think the stepper should provide you with that. It shows you when and in which order it performs a drawing operation, for example restore/transform, since they are native PDF commands.
That sounds really difficult, since it seems like it'd be highly dependent on e.g. the complexity and the ordering of the rendering OPS of particular PDF files; not to mention dependent on general computer/device performance.

There would be overhead/complexity related to counting the operations as well, especially considering that only some ShowText commands reset the cache.

That also seems immediately problematic, since it would introduce inconsistent rendering behaviour for different kinds of PDF files. (Some files rendering correctly, and others not, doesn't seem like great UX.) It might happen that it's deemed OK to make use of it everywhere regardless. In the past, smaller regressions have been accepted in return for significant rendering improvements (in multiple files); but without actual benchmarking results[1], especially with larger and more complex PDF files, it's mostly speculation at this point :-)

I didn't mean to deter you from looking into this with #3351 (comment); sorry if it came off that way!

Finally, one piece of advice (based on experience): before you put a lot of time and effort into doing multiple benchmark runs, make sure that you run the entire test-suite first to check that there aren't any obvious regressions from your changes.[2]

[1] One small piece of advice regarding https://github.com/mozilla/pdf.js/wiki/Benchmarking-your-changes. The example contains

[2] There's nothing worse than having to re-do entire benchmark runs just because of some simple error in code you wrote yourself. Don't ask me how I know this ;-)
Thank you for the thorough reply. I have started to experiment with the benchmark testing. Are there specific files which are noteworthy? Or is it best to run the benchmark on the whole list of test PDFs?
I think the PDF file in #2813 may be a good one for your benchmark, since it's a complicated PDF with a lot of drawing operations. Other than the PDF files in this issue, I think taking a subset of, let's say, 20 random PDF files from the existing suite should be sufficient. Perhaps @Snuffleupagus has more ideas though.
Here are the results when testing 50 rounds on the above document:
What exactly is it that I am looking for from these results? Is there a rule of thumb as to what range of values from the baseline are acceptable?
Usually we look at the t-test column
I ran the benchmark 2 more times and the results are above. It seems strange to me that they differ so much from one another. Is this normal? Here is the resulting JSON from the last run (renamed to .txt because .json uploading is not supported).
Yes, benchmarking browsers can be very fickle. One issue with the current benchmark setup is that it basically just re-runs rendering a page many times in the same browser page, which is not a realistic real-world scenario (usually we'll just render a page once, so first-time paint is the most important). What we really should do is restart the browser on each round, or run each round under a different domain so the caches are clean. This is all beyond the scope of this bug, but if you're interested in working on it, I'd be happy to help. Just to be clear on your above results, did you run it so
That is a great point about the testing. After this is solved, I may dive into that to see what can be done. As for the testing: for each test, I regenerated both the baseline and the current results.
There are different issues here:
The underlying problems are slightly different. For images, we must draw them at integer coordinates and with integer dimensions; according to MDN, this should help to improve performance, and in the PDF context it should remove the thin lines we have between two tiles. For paths, the situation is slightly different: after applying the current transformation we must round and add 0.5. The extra transparency added around lines can lead to some wrong color rendering with very close thin lines; see bug 1657838.
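The two rounding rules described in that comment can be sketched as standalone helpers (hypothetical names; the actual patch applies the current transform to the corner points and dimensions inside canvas.js):

```javascript
// Images: after applying the current transform, force the corner and
// the dimensions to integers, so no transparent seam is left between
// adjacent tiles.
function snapImageRect(x, y, w, h) {
  var rx = Math.round(x);
  var ry = Math.round(y);
  return {
    x: rx,
    y: ry,
    w: Math.round(x + w) - rx,
    h: Math.round(y + h) - ry,
  };
}

// Paths: round to the pixel grid and add 0.5, so a one-pixel-wide
// stroke is centered on a pixel row/column and rendered crisply
// instead of being antialiased across two pixels.
function snapStrokeCoord(v) {
  return Math.round(v) + 0.5;
}

console.log(snapImageRect(10.3, 20.7, 99.4, 49.9));
// → { x: 10, y: 21, w: 100, h: 50 }
console.log(snapStrokeCoord(10.3));
// → 10.5
```

Rounding the far edge (rather than the width itself) keeps two abutting tiles sharing exactly the same integer boundary, which is the property that removes the white seam.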
…la#3351)
- it aims to fix https://bugzilla.mozilla.org/show_bug.cgi?id=1264608;
- it's only a partial fix for mozilla#3351;
- some tiled images have some spurious white lines between the tiles.

When the current transform is applied, the corners of an image can have non-integer coordinates, leading to some extra transparency being added to handle that. So with this patch the current transform is applied to the point and to the dimensions, in order to end up with only integer values.
Use integer coordinates when drawing images (bug 1264608, issue #3351)
Another way to reproduce this bug, but with rectangles again instead of images: the Word doc renders the background of the pasted text as a solid color, and the resulting PDF looks correct in Adobe and Chrome, but in Firefox some of the background color bleeds through between each line of the code.
On PDFs where the creation software splits the image into smaller chunks, PDF.js produces a white line between them, while Adobe Reader renders them correctly.
Example pdf:
https://www.dropbox.com/s/gt1b2y9bgf06pam/Doc2.pdf
Edit: this may be related to issue #3331, except with images instead of filled rectangles.