Strange differences in file size when converting TIF to PDF

Post by laggflor »

Hello!
I don't know if this is a bug, so I want to ask the developers here.
I am seeing strange behaviour with ImageMagick's convert.exe:

I tried three different methods to convert from TIF to PDF.
The source file is a scanned color TIF image at 300 dpi.

1. convert.exe in.TIF out1.PDF
2. convert.exe in.TIF PS:- | convert.exe - out2.PDF
3. convert.exe in.TIF out3.PS && convert.exe out3.PS out3.PDF && del out3.PS
The file sizes I get differ:

1. 3167 KB (100 %)
2. 1106 KB ( 35 %)
3.  857 KB ( 27 %)
The following facts are interesting:
1. The third method generates output that is 3.7 times smaller. The quality is lower than in the first test, but I think the loss is small compared to the space saved.
2. The second and third tests take much longer than the first.
3. Why do the second and third tests differ? The only difference is that I'm using a pipe instead of a temporary file.

Any ideas why this happens?

Thanks
Florian Lagg
http://www.lagg.at

btw: I searched the forum, but it's not easy because "PDF" is not a searchable term here. Apologies if this is a duplicate post; if so, please point me to the existing thread. Thx.

Post by laggflor »

I have to add something here:

If the source file is a full-color TIFF at 300 dpi, the output from methods 2 and 3 is almost as good as from method 1.
If the source file is a TIFF coming from a fax machine (I don't know the exact settings), the output of methods 2 and 3 is unreadable.

I will now implement both methods in my application and let the user choose which one to apply depending on the image properties.

Anyway: does anyone know why
* the files from the direct TIFF --> PDF conversion are so big? (Bigger than those from commercial products I know.)
* there is a difference at all? I thought ImageMagick used TIFF->PS->PDF internally (because Ghostscript is needed for this).


Any ideas are welcome.
Have a nice day.
Greetings Florian http://www.lagg.at

Post by anthony »

Ghostscript is only used by IM for reading. For writing PS and PDF, IM generates the file format directly.

The problem is that PostScript (and PDF) is a free-form computer language: a full language with control loops, functions and other constructs. This is why IM uses Ghostscript to read it.

For writing, however, IM only has to emit appropriate code, which is simple, so it can do that directly without needing Ghostscript.


As for the 3.7 times size reduction: try setting an appropriate -density on the PostScript read (the option goes before the input filename).
PostScript, being a vector language, does not have a 'resolution' as such, though any internal rasters in it (the image) can have an 'ideal' resolution or density. That is the reason for the quality loss. IM reads PostScript at 72 dpi by default, so you are reading a 300 dpi image at 72 dpi, which gives a large size reduction.
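
To put rough numbers on the density point, here is a minimal Python sketch. The page size is an assumption for illustration (an A4 page scanned at 300 dpi, about 2480 x 3508 pixels), and both helper names are hypothetical. It computes how many pixels survive a 72 dpi read, then builds the argument list for re-reading the intermediate file with -density 300. Note that it counts pixels, not bytes; the final file size also depends on how the PDF is compressed.

```python
def rendered_pixels(width_px, height_px, source_dpi, render_dpi):
    """Pixel size of a page whose raster was made at source_dpi
    once the page is rasterized again at render_dpi."""
    scale = render_dpi / source_dpi
    return round(width_px * scale), round(height_px * scale)


# Assumption for illustration: A4 page scanned at 300 dpi.
SRC_W, SRC_H = 2480, 3508

# Reading the PostScript at the 72 dpi default.
w72, h72 = rendered_pixels(SRC_W, SRC_H, source_dpi=300, render_dpi=72)
kept = (w72 * h72) / (SRC_W * SRC_H)
print(f"72 dpi read: {w72} x {h72} pixels, {kept:.1%} of the original pixel count")


def convert_ps_to_pdf_args(ps_path, pdf_path, density=300):
    """Argument list for the PS -> PDF step with an explicit read density.
    -density comes before the input file so that it applies to the read."""
    return ["convert", "-density", str(density), ps_path, pdf_path]


# Print the command only; nothing is executed here.
print(" ".join(convert_ps_to_pdf_args("out3.PS", "out3.PDF")))
```

With a matching density the second step should keep roughly the original pixel count, so the output can be expected to grow back toward the size of the direct conversion.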

What I would like to see is some special PostScript delegate command that can extract rasters from a PostScript file WITHOUT pixel loss due to PostScript resolution scaling.

Can anyone help?
Anthony Thyssen -- Webmaster for ImageMagick Example Pages
https://imagemagick.org/Usage/
