Group4 and Fax encoding issues in pdf

Posted: 2006-03-18T08:25:06-07:00
by Ivar Snaaijer
See also http://studio.imagemagick.org/discussio ... php?t=5744

I changed pdf.c in the latest 6.2.5 source code to reflect the changes mentioned in that post (there are two occurrences of the '<< >>' string).
Then I compiled it on my SuSE 9.3 box. When I run convert like this:

Code: Select all

ivar@home:~/projects/imagemagic/ImageMagick-6.2.5> utilities/convert -compress fax images/sprite.jpg tst.pdf

I get a 4478-byte PDF file that contains a CCITT-encoded monochrome image.

When I change the latest 6.2.6 source in the same way, I get a color image in the PDF (run-length encoded). When I add the -monochrome option, I get a monochrome image that is also run-length encoded.

Another change between 6.2.5 and 6.2.6 is at lines 652-659, which check the incoming image format and default to run-length encoding if the image is not monochrome enough. I commented them out and compiled again.

This breaks the PDF generator. I tried -compress Group4 and -compress fax, and both files give errors: the Group4 version does not even contain a stream; the fax version does, but still generates an error.

As you can see in my other thread, the 6.2.6-1 version produced a good Group4-encoded image inside the PDF; only the '<< >>' was wrong. I'll download the 6.2.6-1 source and check.

Compiled 'out of the box', the PDF (generated with -compress fax) obviously did not work. I removed the '<< >>' from the PDF and all is well. The file size is 4537 bytes.

I changed the source (removed '<< >>' from pdf.c) and the PDF opens fine. When I use -compress Group4 I get a color image that is run-length encoded, and when I add -monochrome I get a run-length encoded image.
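
For anyone who wants to repeat the check of which encoding ended up in the PDF, here is a rough C sketch that scans the raw file bytes for the filter name. The function names (sketch_contains, sketch_pdf_filter) are made up for this example and are not part of ImageMagick. It assumes the filter names appear as plain text in the file, as /CCITTFaxDecode does in the pdf.c output, and /RunLengthDecode is the standard PDF name for run-length encoding.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if needle occurs anywhere in the first len bytes of data.
   PDF files contain binary streams that may hold NUL bytes, so strstr()
   is not enough; compare with memcmp() at every offset instead. */
int sketch_contains(const unsigned char *data, size_t len, const char *needle)
{
  size_t n = strlen(needle);
  size_t i;

  if (n == 0 || n > len)
    return 0;
  for (i = 0; i + n <= len; i++)
    if (memcmp(data + i, needle, n) == 0)
      return 1;
  return 0;
}

/* Classify a PDF buffer by the first filter name found:
   "ccitt", "runlength" or "other" (hypothetical labels). */
const char *sketch_pdf_filter(const unsigned char *data, size_t len)
{
  if (sketch_contains(data, len, "/CCITTFaxDecode"))
    return "ccitt";
  if (sketch_contains(data, len, "/RunLengthDecode"))
    return "runlength";
  return "other";
}
```

Read the whole PDF into a buffer (fread on a file opened in binary mode) and pass it to sketch_pdf_filter. This only tells you which filter the file declares, not whether the stream itself is valid.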

Posted: 2006-03-18T13:13:49-07:00
by magick
The Group4-compression bug is scheduled to be fixed in the next point release. It should be available in the next 10 days.

Posted: 2006-03-18T14:20:41-07:00
by Ivar Snaaijer
Great news!

I am digging through the code to find a possible solution, and I came across something odd.

To tell the PDF reader whether it is G3 or G4 CCITT encoding you need to set the K parameter to -1 (any negative value means G4) or 0 (G3, one-dimensional). It even allows a mixed mode: K > 0 means a one-dimensionally encoded line may be followed by at most K-1 two-dimensionally encoded lines.
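
As a small standalone illustration of that mapping, here is a C sketch that picks K from the requested compression and formats the /DecodeParms line. The enum and function names (SketchCompression, sketch_ccitt_k, sketch_decode_parms) are invented for this example and are not ImageMagick's; only the /DecodeParms layout is taken from pdf.c.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the two compression types discussed here;
   these are not ImageMagick's own enum values. */
typedef enum { SKETCH_FAX, SKETCH_GROUP4 } SketchCompression;

/* K as used by the CCITTFaxDecode filter:
   K < 0  pure two-dimensional encoding (Group 4)
   K = 0  pure one-dimensional encoding (Group 3, 1-D)
   K > 0  mixed one- and two-dimensional encoding (Group 3, 2-D) */
int sketch_ccitt_k(SketchCompression compression)
{
  return (compression == SKETCH_GROUP4) ? -1 : 0;
}

/* Format the /DecodeParms line into buf; returns snprintf's result. */
int sketch_decode_parms(char *buf, size_t size,
  SketchCompression compression, unsigned long columns, unsigned long rows)
{
  return snprintf(buf, size,
    "/DecodeParms [ << /K %d /Columns %lu /Rows %lu >> ]\n",
    sketch_ccitt_k(compression), columns, rows);
}
```

The point of the sketch is that K follows the compression chosen at run time instead of a compile-time #define.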

This value, stored in CCITTParam in pdf.c, seems to be hardcoded (it is a #define). Because of the way the encoder is programmed, it looks like it cannot create G4 encoding unless you #undef UseTIFF.

(taken from somewhere around line 1215)

Code: Select all

          case FaxCompression:
          case Group4Compression:
          {
            if (LocaleCompare(CCITTParam,"0") == 0)
              {
                (void) HuffmanEncodeImage(image_info,image);
                break;
              }
            (void) Huffman2DEncodeImage(image_info,image);
            break;
          }
This should probably become something like

Code: Select all

          case FaxCompression:
          {
            (void) HuffmanEncodeImage(image_info,image);
            break;
          }
          case Group4Compression:
          {
            (void) Huffman2DEncodeImage(image_info,image);
            break;
          }
and the part where the filter string is created (somewhere in the neighborhood of lines 1171 and 1626) should change to:

Code: Select all

      case FaxCompression:
      case Group4Compression:
      {
        (void) CopyMagickString(buffer,"/Filter [ /CCITTFaxDecode ]\n",
          MaxTextExtent);
        (void) WriteBlobString(image,buffer);
        (void) FormatMagickString(buffer,MaxTextExtent,
          "/DecodeParms [ << /K %s /Columns %ld /Rows %ld >> ]\n",
          (compression == FaxCompression) ? "0" : "-1",
          image->columns,image->rows);
        break;
      }
I hope I get around to testing it.