IM-7.0.9-24, 25 and 26: convert PDF to PDF converts only part of PDF file [RESOLVED]

Post by wiemag »

System: Arch Linux
kernel 5.5.5-arch1-1 #1 SMP PREEMPT Thu, 20 Feb 2020 18:23:09 +0000 x86_64 GNU/Linux
ImageMagick versions checked:

Code:

imagemagick-7.0.9.23-1-x86_64.pkg.tar.zst - works
imagemagick-7.0.9.24-1-x86_64.pkg.tar.zst - bug
imagemagick-7.0.9.25-1-x86_64.pkg.tar.zst - bug
imagemagick-7.0.9.26-1-x86_64.pkg.tar.zst - bug

Ghostscript installed: ghostscript 9.50-2

I use 'convert' to convert non-monochrome scanned PDFs into monochrome ones, using the Group4 compression method.
The bug first appeared in package imagemagick-7.0.9.24-1-x86_64. Later versions did not fix it.
The last working version is imagemagick-7.0.9.23-1-x86_64.

The typical command I use for A4 pages is:

Code:

$ convert -density 288 -threshold 78% -monochrome INPUT.pdf -compress Group4 OUTPUT.pdf

For A3 and A2 pages, I tend to increase the density to get a higher resolution and keep all the details visible.

The result:
- the OUTPUT file is created
- the OUTPUT file is usually rendered correctly only in the upper part; the rest of the page remains blank/white
- the rendered part is full width, and the height of the rendered top part depends on the value of the 'density' parameter.
The higher the density, the shorter the rendered stripe at the top. A density of 600 renders less than about 10% of the page height, while a density of 288 converts about 50% of the page.

I have not checked the actual "density" of the INPUT.pdf, nor its orientation/rotation.
Last edited by wiemag on 2020-03-14T16:43:50-07:00, edited 1 time in total.

Post by fmw42 »

Try:

Code:

convert -density 288 INPUT.pdf -threshold 78% -monochrome -compress Group4 OUTPUT.pdf

If that does not work, then post a link to one of your input PDF files that fails.

Also, which version of Ghostscript are you using? ImageMagick offloads the reading of PDF files to Ghostscript.

Is your PDF in RGB or CMYK, and does it have transparency?

Post by wiemag »

Changing the order as you suggested does not have any effect. The results are exactly the same.

I ran some experiments to look for correlations:
- Page orientation does not have any effect -- both vertical and horizontal pages are converted the same way
- The number of pages in a PDF does not have any effect (all pages are converted the same way)
- Page size (A4, A3, ...) does have an effect. For example, A4 pages are converted correctly with density=288, but at a higher density (density=360) the bug affects the render. An A3 page is cut to half its height with density=288, but is converted in full with density=144 (not usable because of the lost resolution, though).
- Converted pages can be re-converted, and if the density in the second command is higher, the re-converted image is shortened further.
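
The density experiments above could be scripted roughly as follows. This is only a sketch: the function name `density_sweep`, the `MAGICK` override variable and the output naming scheme are made up for illustration, while the conversion options are the ones already used in this thread.

```shell
# Hypothetical helper: density_sweep INPUT.pdf DENSITY...
# Converts one PDF at several densities so the outputs can be compared.
# Set MAGICK=convert to use the legacy command name, or MAGICK=echo for a dry run.
density_sweep() (
    input=$1
    shift
    for d in "$@"; do
        # Output name is derived from the input, e.g. INPUT_288.pdf (assumed scheme)
        "${MAGICK:-magick}" -density "$d" "$input" -threshold 78% -monochrome \
            -compress Group4 "${input%.pdf}_${d}.pdf"
    done
)
```

For example, `density_sweep a4_vert_orig.pdf 144 288 360 432` would write one output file per density, which can then be opened side by side to see where the rendered stripe ends.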

I checked RGB vs. CMYK with 'pdfimages -list'. Most of the scans are grey, but RGB scans are converted the same way. I do not have any CMYK scans.

Ghostscript version 9.50-2

Finally, some examples:
These PDFs are uploaded to Google Drive. They are displayed differently in Firefox (73.0.1, 64-bit) from how they are displayed in Evince (3.34.2-1).
Evince displays the bottom of the page as blank/white, while Firefox displays the same areas as irregular white and black stripes.

a4_vert_orig.pdf https://drive.google.com/file/d/1IS2-23 ... sp=sharing
a4_vert_432.pdf https://drive.google.com/file/d/1X7-Ntz ... sp=sharing
a4_vert_288.pdf https://drive.google.com/file/d/1zxbmne ... sp=sharing
a4_vert_360.pdf https://drive.google.com/file/d/175ScL9 ... sp=sharing
a4_vert_360_360 (re-converted with density=360) https://drive.google.com/file/d/1jWrTkH ... sp=sharing
a4_vert_360_432.pdf (re-converted, density=432) https://drive.google.com/file/d/1htNbXJ ... sp=sharing

a3_2_horiz_orig.pdf https://drive.google.com/file/d/1xlVzqu ... sp=sharing
a3_2_horiz_dens504.pdf https://drive.google.com/file/d/1RvxN6y ... sp=sharing

Post by snibgo »

I can't replicate your bad results with IM 7.0.8-64 on Windows 8.1, with Ghostscript v9.19, on a 12 GB laptop, viewing the output PDF with Adobe Acrobat Reader, e.g.:

Code:

magick -density 432 a4-vert_orig.pdf -threshold 78% -monochrome -compress Group4 out.pdf

At density 432, the image is 3570x5046, which shouldn't be a problem. How much free memory does your computer have? Higher density needs more memory, of course.
snibgo's IM pages: im.snibgo.com
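
As a rough check of the 3570x5046 figure: PDF page sizes are given in points (1/72 inch), so the rendered size in pixels is approximately points x density / 72. The sketch below assumes a page of 595 x 841 points, which is what the numbers above imply at density 432; the actual page size of the sample file is not stated in the thread, and `pts_to_px` is a hypothetical helper name. ImageMagick and Ghostscript may round slightly differently.

```shell
# Hypothetical helper: pts_to_px POINTS DENSITY
# Converts a length in PDF points (1/72 inch) to pixels at the given density,
# rounded to the nearest integer using integer arithmetic.
pts_to_px() {
    echo $(( ($1 * $2 + 36) / 72 ))
}

# Assumed page size: 595 x 841 points (inferred, not measured)
echo "$(pts_to_px 595 432)x$(pts_to_px 841 432)"   # prints 3570x5046
echo "$(pts_to_px 595 288)x$(pts_to_px 841 288)"   # prints 2380x3364
```

Since both dimensions scale with the density, the pixel count (and the memory needed) grows with the square of the density.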

Post by wiemag »

I had the same results on my two laptops:
- Lenovo y530 Legion, 16GB ram
- Asus c550, 4 GB ram.
Both laptops run on Arch Linux kernel 5.5.5. Both use gs 9.50.
I cannot try this conversion on Windows, because I have removed Windows from the Asus and never had one on the Lenovo.

I have now downgraded imagemagick to 7.0.9.23 on my Lenovo, and the results are correct.
Versions .24, .25, and .26 (the latest available in the official Arch Linux repository) do not work, or I should rather say they reveal/exhibit/show/have/manifest(?) the bug. (I'm not a native speaker, no idea what words go with "the bug"). :)
Might it be something to report on the Arch Linux bug forum, rather than here?
It's not Ghostscript, because gs 9.50 does work with IM version .23.

Post by fmw42 »

I can confirm the issue using IM 7.0.9.25 Q16 Mac OSX with Ghostscript 9.50.

Code:

magick -density 432 a4-vert_orig.pdf -threshold 78% -monochrome -compress Group4 out.pdf

produces an image where the top half is correct and the bottom half is fully black.

However, if I save to PNG, the full output is correct. That suggests the problem is in the PDF writer.
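
The PNG-versus-PDF check described above can be sketched like this. The function name `compare_writers`, the `MAGICK` override and the output names `out.png` / `out.pdf` are illustrative assumptions; the read and threshold options are the ones from the command above.

```shell
# Hypothetical helper: compare_writers INPUT.pdf [DENSITY]
# Runs the same read and threshold steps twice, changing only the output format.
# Set MAGICK=echo for a dry run that only prints the command lines.
compare_writers() (
    input=$1
    density=${2:-432}
    # PNG output: exercises the PDF reader (Ghostscript) and the image operations only.
    "${MAGICK:-magick}" -density "$density" "$input" -threshold 78% -monochrome out.png
    # PDF output: additionally exercises the PDF writer with Group4 compression.
    "${MAGICK:-magick}" -density "$density" "$input" -threshold 78% -monochrome \
        -compress Group4 out.pdf
)
```

If the PNG looks complete and the PDF does not, the reading side is probably fine and the fault is more likely on the writing side. Note that a multi-page input is normally written as several numbered PNG files rather than a single `out.png`.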

Post by magick »

Thanks for the problem report. We can reproduce it and will have a patch to fix it in the Git master branch @ https://github.com/ImageMagick/ImageMagick later this week. The patch will be available in the beta releases of ImageMagick @ https://www.imagemagick.org/download/beta/ later this week.

Post by wiemag »

Sorry for not reporting back right away. The problem was fixed by the 7.0.9.27 update.
I use 7.0.10, and all's well.
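
To summarise the version range from this thread in a form that can be used in a script: 7.0.9-23 worked, 7.0.9-24 through 7.0.9-26 showed the bug, and 7.0.9-27 fixed it. The sketch below is based only on those reports; `is_affected` is a hypothetical helper, not part of ImageMagick, and it does not try to handle unusual version strings.

```shell
# Hypothetical helper: is_affected VERSION
# Returns 0 (true) if VERSION is in the range reported as broken in this thread
# (7.0.9-24 through 7.0.9-26). Accepts "7.0.9-25" or the Arch-style "7.0.9.25-1".
is_affected() (
    # Split on both '.' and '-'; any trailing package release goes into _
    IFS='.-' read -r major minor patch build _ <<EOF
$1
EOF
    [ "$major" = 7 ] && [ "$minor" = 0 ] && [ "$patch" = 9 ] &&
        [ "${build:-0}" -ge 24 ] && [ "${build:-0}" -le 26 ]
)

if is_affected "7.0.9-25"; then
    echo "7.0.9-25: in the affected range, upgrade to 7.0.9-27 or later"
fi
```

The version string of the installed build is shown on the first line of `magick -version` output.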
