resolution of PDF files forced to 72x72 even when -density is used

Post by atariZen » 2018-03-04T08:35:50-07:00

I have a 600 dpi PBM file, and I simply want to wrap it in a PDF with no changes. I would expect this to work, but it gets downsampled to 72x72:

Code:

$ convert source.pbm target.pdf
$ identify -verbose target.pdf | grep -i reso
  Resolution: 72x72

So I also tried to brute-force it to maintain the 600 dpi:

Code:

$ convert -units pixelsperinch source.pbm -density 600 target.pdf
$ identify -verbose target.pdf | grep -i reso
  Resolution: 72x72

I also tried replacing "-density" with "-resample" in the above attempt, with the same result. I then wondered whether the output was actually correct and the "identify" command was simply misreporting when it comes to PDF containers. So I tried extracting the images:

Code:

$ pdfimages -all target.pdf img
$ identify -verbose img-000.jpg | grep -i reso
<no output>

The first bizarre finding is that the extracted images are JPG, even though the input images were PBM and no changes should have been made to them. The next astonishment is that these JPG images report no resolution (yet they display just fine). However, the GUI tool that displays the JPG lists 300 dpi in its properties.

Post by snibgo » 2018-03-04T09:41:04-07:00

When raster images are wrapped inside a PDF, there are multiple resolutions: that of each raster image, and that of the overall PDF. "identify" reports just the overall PDF resolution. pdfimages reports the resolution of each raster image.

You should use "-units", e.g.:

Code:

f:\web\im>%IM%convert toes.png -density 600 -units pixelsperinch t.pdf

f:\web\im>pdfimages -list t.pdf
page   num  type   width height color comp bpc  enc interp  object ID x-ppi y-ppi size ratio
--------------------------------------------------------------------------------------------
   1     0 image     267   233  rgb     3   8  image  no         8  0   600   601  156K  86%

Why is y-ppi 601? I don't know.

"-compress" selects the compression method, if any, used for the embedded image.

Some (old) versions of pdfimages only create JPG outputs.
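For checking several files, the ppi columns can be pulled out of the "pdfimages -list" output with a short awk filter. This is only a rough sketch: the function name list_ppi is made up here, and it assumes the column layout shown in the listing above (width and height in fields 4 and 5, x-ppi and y-ppi in fields 13 and 14), which may differ between pdfimages versions.

```shell
# Print "WIDTHxHEIGHT px at XxY ppi" for each image row of `pdfimages -list`.
# Rows are read from stdin, so saved output works as well as a live pipe.
# ASSUMPTION: field positions match the listing quoted in this thread.
list_ppi() {
    # Skip the two header lines (column names and the dashed rule).
    awk 'NR > 2 { printf "%sx%s px at %sx%s ppi\n", $4, $5, $13, $14 }'
}

# Sample listing copied from this thread.
list_ppi <<'EOF'
page   num  type   width height color comp bpc  enc interp  object ID x-ppi y-ppi size ratio
--------------------------------------------------------------------------------------------
   1     0 image     267   233  rgb     3   8  image  no         8  0   600   601  156K  86%
EOF
# prints: 267x233 px at 600x601 ppi
```

With a real file the same filter would be used as "pdfimages -list t.pdf | list_ppi".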
snibgo's IM pages: im.snibgo.com

Post by atariZen » 2018-03-05T11:51:07-07:00

Thanks snibgo, you've cleared some things up. I'm happy to hear about "pdfimages -list"; that's quite useful.

I now have a working solution that's verifiable. But I will mention some annoyances to warn others, perhaps to also serve as a note to developers:

* ImageMagick alters the resolution when it's not told to do so. This fails the rule of least astonishment. E.g.

Code:

$ convert source_600dpi.pbm target_72dpi.pdf
$ pdfimages -list target_72dpi.pdf
page   num  type   width height color comp bpc  enc interp  object ID x-ppi y-ppi size ratio
--------------------------------------------------------------------------------------------
   1     0 image    5100  6601  gray    1   8  image  no         8  0    72    72  690K 2.1%

That's not good. I didn't tell it to downsample 600dpi to 72dpi. But I'm glad I can at least force convert to do the right thing using -density.

* The "identify -verbose" command often gives no resolution for raster images, which must have a resolution. And for PDFs, you say it gives the "overall" resolution, but when the PDF contains nothing but a single raster image, I expect the overall resolution to match that of the embedded image. Since it's always showing 72dpi, I suspect the PDF carries a separate resolution property for rendering/display. No big deal, but a PDF is perhaps the one case where it would actually be sensible for the identify command to omit resolution, and instead it's giving something that mismatches the objects inside.

* Regarding pdfimages, my version supports the "-all" parameter, which extracts images without conversion. When I convert a pbm to a pdf, ImageMagick apparently converts the pbm to a png file before wrapping it in a PDF even if I supply "-compress none". Unless perhaps there is some inherent problem with embedding pbm files in a PDF, this is unexpected.

Anyway, I can live with these things. Thanks for the help.

Post by muccigrosso » 2018-03-05T12:40:13-07:00

atariZen wrote:
2018-03-05T11:51:07-07:00
* The "identify -verbose" command often gives no resolution for raster images, which must have a resolution. And for PDFs, you say it gives the "overall" resolution, but when the PDF contains nothing but a single raster image, I expect the overall resolution to match that of the embedded image. Since it's always showing 72dpi, I suspect the PDF carries a separate resolution property for rendering/display. No big deal, but a PDF is perhaps the one case where it would actually be sensible for the identify command to omit resolution, and instead it's giving something that mismatches the objects inside.
Yeah, PDFs are a pain. But it does make sense that they have their own resolution. Imagine taking a small 1" square high-res image and putting it into a letter-sized PDF where it occupies the whole page. What's the resolution of the image now? It's not the same as it would be if you extracted the image from the PDF. Or consider the opposite case, in which you take a large low-res image and squeeze it down so that it fits onto a PDF page.
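To put rough numbers on the first case, here is a back-of-the-envelope sketch. The helper name effective_ppi is made up, the 600 ppi figure for the "high-res" image is an assumption, and 8.5 inches is the US letter width.

```shell
# Effective resolution of an image once it is placed on a page:
# pixels across, divided by the width it occupies on the page in inches.
effective_ppi() {
    # $1 = image width in pixels, $2 = placed width in inches
    awk -v px="$1" -v inches="$2" 'BEGIN { printf "%.1f\n", px / inches }'
}

# A 1-inch-square image at 600 ppi is 600 pixels wide (assumed example values).
effective_ppi 600 1      # shown at its native size: prints 600.0
effective_ppi 600 8.5    # stretched across a letter-width page: prints 70.6
```

The pixels are identical in both cases; only the area they are spread over changes, which is why the ppi figure depends on the placement and not just on the image.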

In any case, I always use pdfimages to get images out of PDFs. You can extract most of them in their native formats, though not JBIG2, which is pesky because in my experience it gets used a lot for true bilevel (1-bit) images. A fax-quality TIFF comes close, but is maybe double the size.

Post by snibgo » 2018-03-06T01:55:45-07:00

atariZen wrote: ImageMagick alters the resolution when it's not told to do so. [...]
$ convert source_600dpi.pbm target_72dpi.pdf
The PBM format has no resolution metadata. Including "600dpi" in the name doesn't make it so. So the image resolution hasn't been changed from 600; it was unset, and has merely been set to 72 DPI.
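A tiny hand-made file makes this visible. The sketch below writes a 2x2 plain-text (P1) PBM to a temporary file; the variable name pbm is arbitrary. The header holds only the magic number, the width and the height, so there is nowhere for a density to be stored.

```shell
# Create a temporary file to hold the example image.
pbm=$(mktemp)

# Plain PBM: magic number "P1", then width and height, then one digit per pixel.
printf 'P1\n2 2\n0 1\n1 0\n' > "$pbm"

# Show the complete file: header plus pixel data, and no density field.
cat "$pbm"
```

If a scan is known to be 600 dpi, that has to be stated on the command line, as with "-density 600 -units pixelsperinch" earlier in the thread.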
atariZen wrote: I didn't tell it to downsample 600dpi to 72dpi.
"Downsample" implies that pixels have been re-sampled. When this happens, the number of pixels changes. Merely setting a different density does not re-sample the image, so the pixel count stays the same. The "-resample" operation changes both the density and the number of pixels.
atariZen wrote: ... when the PDF is nothing other than a single raster image and nothing else, I expect the overall resolution to match that of the embedded image.
That's a common expectation, but mistaken.
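On the density versus resample point, the difference can be seen with plain arithmetic on the 5100-pixel width from the listing earlier in the thread. This is only an illustration, not ImageMagick's code: the helper names are made up, and rounding to the nearest pixel is an assumption.

```shell
# Setting a density: the pixel count is untouched, only the physical size changes.
size_in_inches() {
    # $1 = pixels, $2 = density in pixels per inch
    awk -v px="$1" -v ppi="$2" 'BEGIN { printf "%.2f\n", px / ppi }'
}

# Resampling: the pixel count is rescaled so the physical size stays the same.
# ASSUMPTION: result is rounded to the nearest whole pixel.
resampled_pixels() {
    # $1 = pixels, $2 = source density, $3 = target density
    awk -v px="$1" -v src="$2" -v dst="$3" 'BEGIN { printf "%d\n", px * dst / src + 0.5 }'
}

size_in_inches 5100 600        # 5100 px tagged as 600 ppi: prints 8.50
size_in_inches 5100 72         # same 5100 px tagged as 72 ppi: prints 70.83
resampled_pixels 5100 600 72   # resampled from 600 to 72 ppi: prints 612
```

The listing still shows 5100 pixels across, so the image was given a different density tag rather than being resampled.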
