I have a PDF file which contains pages of images. I am writing an application, using the MagickCore API, that will read the PDF and write a new PDF file with modified images.
By default, ImageMagick reads a PDF file at a density of "72x72". If I read a PDF containing higher-resolution images (e.g., 150x150), the images are downsampled, and the result is noticeably lower in quality. Many others have observed this, and it came up recently in a discussion (viewtopic.php?f=1&t=16541). One solution is to specify the density before reading the file. For example:
exception = AcquireExceptionInfo();
image_info = CloneImageInfo((ImageInfo *) NULL);
/* argv[1] is the input file; argv[0] is the program name */
(void) CopyMagickString(image_info->filename, argv[1], MaxTextExtent);
image_info->density = AcquireString("150x150"); // YUCK!!!!!
images = ReadImage(image_info, exception);
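To make the downsampling concrete, here is a minimal sketch of the arithmetic, assuming the page is rasterized at the requested density and that one PDF point is 1/72 inch. The helper name rendered_pixels is made up for illustration and is not part of MagickCore.

```c
/* Hypothetical helper (not a MagickCore call): approximate pixel extent of
   a page dimension given in PDF points when rendered at density_dpi.
   Assumes 1 point = 1/72 inch. */
double rendered_pixels(double points, double density_dpi)
{
  return points * density_dpi / 72.0;
}
```

For a page 918.24 points wide (the MediaBox in the example further down), this gives about 918 pixels at the default density of 72 and 1913 pixels at 150, so an embedded image 1913 samples wide loses more than half of its samples per row at the default setting.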
I would prefer not to hard-code the density, because I don't know what it will be for an arbitrary PDF file. I do know that the resolution of an image can be computed from the information in the PDF alone: Adobe Acrobat can read a PDF file, determine the size of the page and of the images it contains, and display those images at full resolution. Further, I do not want to ask users of my program to specify the "density". They won't know what that means, they don't care to know, and will probably get it wrong every time. They want images to be full resolution.
Looking at the ImageMagick API, I do not see an obvious way to compute the resolution. Does anyone know how to do this with the MagickCore API?
If no one has a solution, I do see one, extremely yucky alternative: parse the PDF file and compute it myself. I would prefer not to do this, since ImageMagick already knows the structure of PDF files. However, it is possible using the following steps:
1) Read the PDF file as text. Find a /Page object, say the first one, and get its /Contents reference. E.g.,
/Type /Page
/Parent 2 0 R
/Resources <<
/XObject << /Im1 22 0 R >>
/ProcSet 20 0 R >>
/MediaBox [0 0 918.24 683.04]
/CropBox [0 0 918.24 683.04]
/Contents 18 0 R
/Thumb 25 0 R
=> The page contents are in object 18.
2) Assume that the content on the page is just a single image object. Get the scaling applied to that image in the content stream. E.g.,
18 0 obj
<<
/Length 19 0 R
>>
stream
q
918.24 0 0 683.04 0 0 cm
/Im1 Do
Q
endstream
endobj
=> Image Im1 is scaled to 918.24 x 683.04 user space units.
3) Get the image size in samples. E.g.,
/Type /XObject
/Subtype /Image
/Name /Im1
/Filter [ /RunLengthDecode ]
/Width 1913
/Height 1423
/ColorSpace 24 0 R
/BitsPerComponent 8
/Length 23 0 R
=> The image is 1913 x 1423 samples.
4) Compute the resolution:
1913 / 918.24 * 72 = 150
1423 / 683.04 * 72 = 150
(Assumption: the default user space unit is 1/72 inch.)
(see http://www.adobe.com/devnet/acrobat/pdf ... 0_2008.pdf )
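Steps 2 to 4 could be sketched roughly as follows. This is only an illustration under the same assumptions as above: the page holds a single unrotated image, and the content stream contains a plain "sx 0 0 sy tx ty cm" line. The function names parse_cm_scale, image_dpi, and format_density are made up for this sketch and are not MagickCore calls. Locating the right objects in the file (steps 1 and 3) is not shown.

```c
#include <stdio.h>
#include <string.h>

/* Default PDF user space: 1 unit = 1/72 inch (assumption from step 4). */
#define PDF_UNITS_PER_INCH 72.0

/* Hypothetical helper: parse a content-stream line of the form
   "sx 0 0 sy tx ty cm" and return the x and y scale factors.
   Returns 1 on success, 0 if the line does not match or the matrix
   is rotated, skewed, or has non-positive scale. */
int parse_cm_scale(const char *line, double *sx, double *sy)
{
  double a, b, c, d, e, f;
  char op[3];

  if (sscanf(line, "%lf %lf %lf %lf %lf %lf %2s",
             &a, &b, &c, &d, &e, &f, op) != 7)
    return 0;
  if (strcmp(op, "cm") != 0)
    return 0;
  if (b != 0.0 || c != 0.0 || a <= 0.0 || d <= 0.0)
    return 0;
  *sx = a;
  *sy = d;
  return 1;
}

/* Resolution along one axis: image samples divided by the size the image
   occupies on the page (in user space units), times 72 units per inch. */
double image_dpi(double samples, double user_units)
{
  return samples / user_units * PDF_UNITS_PER_INCH;
}

/* Format a density string such as "150x150", rounded to whole DPI. */
int format_density(char *buf, size_t size, double xdpi, double ydpi)
{
  return snprintf(buf, size, "%.0fx%.0f", xdpi, ydpi);
}
```

With the example values, parse_cm_scale("918.24 0 0 683.04 0 0 cm", ...) yields 918.24 and 683.04, and image_dpi(1913, 918.24) and image_dpi(1423, 683.04) both come out at 150, so format_density produces "150x150". That string could then be assigned to image_info->density before calling ReadImage, in place of the hard-coded value.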