Image signature changes if saved to a file then read back

bon-bon
Posts: 6
Joined: 2012-02-08T06:24:49-07:00

Image signature changes if saved to a file then read back

Post by bon-bon »

Hi,

Code: Select all

img.excerpt!(x, y, width, height)
img.write(f)
new_img = Magick::Image.read(f).first
img.signature != new_img.signature    # why?

How can I get the value of new_img.signature from img without saving it and reading it back?
anthony
Posts: 8883
Joined: 2004-05-31T19:27:03-07:00
Location: Brisbane, Australia

Re: Image signature changes if saved to a file then read back

Post by anthony »

What image file format?

PNG saves a 'timestamp'; see the -set date:modify option in
http://www.imagemagick.org/Usage/formats/#png_write

JPG and GIF are likely to always change, as they are lossy or have a color table that can be re-ordered without affecting the result.

Also see IM examples, Comparing, Finding duplicate images, which starts with minor file differences
http://www.imagemagick.org/Usage/compare/#doubles

It goes into greater and greater depth in its determination of how identical images are.
Anthony Thyssen -- Webmaster for ImageMagick Example Pages
https://imagemagick.org/Usage/
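
The remark about re-ordered color tables can be illustrated without ImageMagick at all. What follows is only a sketch under stated assumptions: `Indexed`, `decode_rgb`, `stored_digest` and `decoded_digest` are made-up names, the pixel data is invented, and a plain SHA-256 over raw bytes stands in for whatever a file comparison would hash. It shows two palette images whose stored bytes differ even though they decode to the same picture.

```ruby
require 'digest'

# Hypothetical indexed image: a palette of RGB triples plus one palette
# index per pixel. This is not an ImageMagick or RMagick type.
Indexed = Struct.new(:palette, :indexes)

BLACK = [0, 0, 0].freeze
WHITE = [255, 255, 255].freeze

# Decode to plain RGB bytes, i.e. what a viewer would actually display.
def decode_rgb(image)
  image.indexes.flat_map { |i| image.palette[i] }.pack('C*')
end

# Digest of the stored form: palette bytes followed by index bytes.
def stored_digest(image)
  Digest::SHA256.hexdigest((image.palette.flatten + image.indexes).pack('C*'))
end

# Digest of the decoded pixels only.
def decoded_digest(image)
  Digest::SHA256.hexdigest(decode_rgb(image))
end

a = Indexed.new([BLACK, WHITE], [0, 1, 1, 0])
b = Indexed.new([WHITE, BLACK], [1, 0, 0, 1]) # palette swapped, indexes remapped

puts stored_digest(a) == stored_digest(b)   # => false
puts decoded_digest(a) == decoded_digest(b) # => true
```

So a byte-level comparison of such files can report a difference where a pixel-level comparison reports none.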

Re: Image signature changes if saved to a file then read back

Post by bon-bon »

The file format is TIFF

Code: Select all

img.excerpt!(x, y, width, height)
puts img.inspect
img.write(f)
new_img = Magick::Image.read(f).first
puts new_img.inspect
img.signature !=  new_img.signature    # why?
./samples/0211.tif TIFF 104x16=>4x10 4x10+0+0 DirectClass 8-bit 325b
tmp/0211.tif TIFF 4x10 4x10+0+0 PseudoClass 2c 8-bit 325b

They differ?!

Re: Image signature changes if saved to a file then read back

Post by bon-bon »

I made a mistake. The signature changes across a write/read only if I do image processing first.
Here is the test script with a sample image.

Code: Select all

require 'rubygems'
require 'rmagick'

puts "processing file %s"%[file = ARGV.shift]
puts "  and you've choosen %sto process thresholds"%[(process = /process/i =~ ARGV.shift ) ? "" : "NOT "]

img       = Magick::ImageList.new.read(file).first
img       = img.black_threshold(200).white_threshold(20) if process

bb        = img.bounding_box
excerpt   = img.excerpt(bb.x, bb.y, bb.width, bb.height)
signature = excerpt.signature

# saving to a file and reading back
f = "excerpt-" + file
img.write(f)
new_img = Magick::Image.read(f).first

puts "  RESULT: img.signature %s new_img.signature"%( img.signature == new_img.signature ? "==" : "!=" )
Save that script as test_signature.rb and the sample image http://www.mediafire.com/i/?s9adm0ubx5tmxj2 as test_signature.tif, cd there and run:

Code: Select all

ruby test_signature.rb test_signature.tif
then

Code: Select all

ruby test_signature.rb test_signature.tif process
The result depends on whether the image was processed or not. Why?
How can I avoid writing and reading the file but still get the second signature value?
Drarakel
Posts: 547
Joined: 2010-04-07T12:36:59-07:00

Re: Image signature changes if saved to a file then read back

Post by Drarakel »

Some suggestions: Perhaps you should say at the very start which programming language/API you're using (Perl? Ruby? Something else?) and what you do in those scripts. (It now seems that you were cropping the image in the first script, and black-/white-thresholding the image in the second script in order to do some 'image processing'.) Or, if possible, try to find a regular command at the command line that shows the problem. And what do you want to achieve in the end?
We don't have a crystal ball that tells us these things.

After analyzing your posts, I think you're worried about the different (verbose) information that IM returns when writing a file and when reading it back. An example at the command line:

Code: Select all

convert -verbose signature_test.tif -black-threshold 200 -white-threshold 20 signature_test2.tif
signature_test.tif TIFF 104x16 104x16+0+0 8-bit TrueColor DirectClass 5.32KB 0.000u 0:00.031
signature_test.tif=>signature_test2.tif TIFF 104x16 104x16+0+0 8-bit Bilevel DirectClass 621B 0.016u 0:00.014

Code: Select all

identify signature_test2.tif
signature_test2.tif TIFF 104x16 104x16+0+0 1-bit Bilevel DirectClass 621B 0.000u 0:00.016

But even if these lines were identical: this information doesn't necessarily show the exact file properties in every situation. It's rather an internal representation of the image, I would say.
Again: What do you want to achieve with these 'signatures'?

Re: Image signature changes if saved to a file then read back

Post by bon-bon »

Yes, I should have stated clearly at the beginning of my post that I use RMagick 2.6.0, which is a Ruby API to ImageMagick.

I have a dozen B/W images that I treat as symbols: the digits 0..9, dot and comma.
I need to process a lot of "raw images" made up of those symbols. As you can see in test_signature.tif, the raw images contain numbers, which I want to OCR into text.

I compare regions of the raw images to the symbol images by comparing their signatures. It works well, but I need to write the images and read them back to get proper signatures. Besides being confusing, that slows down processing.

I found that an image and the same image after being written to a file and read back get different signatures only when I apply thresholds to the image. I guess the signature somehow depends on the internal image representation, which for some unknown reason changes after the image has been written to a file. So I'm looking for a way to speed up image processing.

I had the script output:

Code: Select all

>ruby signature_test.rb signature_test.tif
processing file signature_test.tif
  and you've choosen NOT to process thresholds
  RESULT: img.signature == new_img.signature

>ruby signature_test.rb signature_test.tif process
processing file signature_test.tif
  and you've choosen to process thresholds
  RESULT: img.signature != new_img.signature
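
The matching scheme described in this post boils down to an exact-match lookup from signature to symbol. A minimal sketch in plain Ruby, assuming bitmaps are available as arrays of 0/1 rows: `bitmap_signature`, `GLYPHS`, `SYMBOL_BY_SIGNATURE` and `recognize` are hypothetical names, the glyph data is invented, and SHA-256 over one fixed serialization stands in for `img.signature`. The point is that the lookup only works if both sides are serialized the same way before hashing.

```ruby
require 'digest'

# Hypothetical helper: digest of a bitmap under one fixed serialization
# (width, height, then one byte per pixel), so equal bitmaps always match.
def bitmap_signature(rows)
  header = [rows.first.size, rows.size].pack('N2')
  body = rows.flatten.pack('C*')
  Digest::SHA256.hexdigest(header + body)
end

# Invented 3x3 reference glyphs; real ones would come from the symbol images.
GLYPHS = {
  '.' => [[0, 0, 0], [0, 0, 0], [0, 1, 0]],
  ',' => [[0, 0, 0], [0, 1, 0], [1, 0, 0]],
  '1' => [[0, 1, 0], [1, 1, 0], [0, 1, 0]]
}.freeze

# Build the signature => symbol table once, up front.
SYMBOL_BY_SIGNATURE =
  GLYPHS.map { |symbol, rows| [bitmap_signature(rows), symbol] }.to_h.freeze

# Look up one cropped region; nil means there is no exact match.
def recognize(rows)
  SYMBOL_BY_SIGNATURE[bitmap_signature(rows)]
end

puts recognize([[0, 1, 0], [1, 1, 0], [0, 1, 0]]).inspect # => "1"
puts recognize([[1, 1, 1], [1, 1, 1], [1, 1, 1]]).inspect # => nil
```

Exact matching like this is fast but brittle: a single differing pixel, or a different serialization of the same pixels, yields no match.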

Re: Image signature changes if saved to a file then read back

Post by Drarakel »

I'm still not sure what you mean by "signatures" here. As long as no other RMagick users show up, we can't know what RMagick stores as "image.signature".
So far you have only written about return lines like this:
signature_test2.tif TIFF 104x16 104x16+0+0 8-bit Bilevel DirectClass 621B
But that tells almost nothing about the content of the image. (I guess there is a certain possibility that the file properties are the same as those of your symbol images. But you will need luck for that. And the content can still differ greatly, of course.)
Or did you mean the 'hash' signatures?

Perhaps you could use some workarounds with your method. For example: if you want to avoid some differing results, you could try to 'force' some properties. By adding "-type TrueColor -depth 8" (at the command line; I don't know what the equivalent is in RMagick), ImageMagick won't use PseudoClass/Palette or bit depth reductions for the output files. And then IM perhaps won't show different results after writing and reading back.

Perhaps you should further describe the processing/comparing of your files.
And did you read the chapter about comparing in IM Examples (see Anthony's link)?

Re: Image signature changes if saved to a file then read back

Post by bon-bon »

I assumed that "signature" is an ImageMagick term rather than an RMagick one. Here is the description of signature from the RMagick docs:
img.signature -> string

Computes a message digest from an image pixel stream with an implementation of the NIST SHA-256 Message Digest algorithm. This signature uniquely identifies the image and is convenient for determining if an image has been modified or whether two images are identical.

ImageMagick adds the computed signature to the image's properties.


I added the signatures of both images to the script output.

Code: Select all

>ruby signature_test.rb signature_test.tif process
processing file signature_test.tif
  and you've choosen to process thresholds
  RESULT: img.signature != new_img.signature
  img.signature     = e9bc545d337ad96e7a149528e37bc5a2383fb9d2ac030dfac6ee96be4baa70ba
  new_img.signature = 1fd61a6b86710183cd38f6e979963a9dd319862dfb542aac890db04f38d742b3
Yes, I've read Anthony's link. It covers the IM image signatures I was writing about:

Code: Select all

>identify -quiet -format "%#" signature_test.tif
134dce294ff4e02507af8a986e1040417df6bb7cb722ea292b6d6d0a17bc653f
I guess the signature depends on the color depth or palette.

Re: Image signature changes if saved to a file then read back

Post by Drarakel »

Ah, ok. So, it's the 'hash' signature.

Did you try to specify a fixed TIFF output format (with the RMagick equivalent of "-type" and "-depth")?
I think you should be able to get identical strings (at least in such a test script :)).

Code: Select all

convert signature_test.tif -black-threshold 78% -format "%#" -write info:- signature_test2.tif
e9bc545d337ad96e7a149528e37bc5a2383fb9d2ac030dfac6ee96be4baa70ba

Code: Select all

identify -format "%#" signature_test2.tif
1fd61a6b86710183cd38f6e979963a9dd319862dfb542aac890db04f38d742b3

But now:

Code: Select all

convert signature_test.tif -black-threshold 78% -type TrueColor -depth 8 -format "%#" -write info:- signature_test2.tif
e9bc545d337ad96e7a149528e37bc5a2383fb9d2ac030dfac6ee96be4baa70ba

Code: Select all

identify -format "%#" signature_test2.tif
e9bc545d337ad96e7a149528e37bc5a2383fb9d2ac030dfac6ee96be4baa70ba
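
One way to picture why forcing a fixed type and depth helps: a digest over stored samples changes when the storage changes, while a digest taken after expanding everything to one canonical form does not. The sketch below is plain Ruby and is not ImageMagick's signature code. `Stored`, `canonical_bytes`, `raw_digest` and `canonical_digest` are hypothetical names, and the 1-bit and 8-bit layouts are simplified assumptions.

```ruby
require 'digest'

# Hypothetical description of stored image data: bits per sample,
# number of pixels, and the packed sample bytes.
Stored = Struct.new(:depth, :pixel_count, :data)

# Expand any supported storage to one canonical form:
# one byte per pixel, 0 for black and 255 for white.
def canonical_bytes(stored)
  case stored.depth
  when 8
    stored.data
  when 1
    # Eight pixels per byte, most significant bit first; drop padding bits.
    bits = stored.data.unpack1('B*')[0, stored.pixel_count]
    bits.each_char.map { |bit| bit == '1' ? 255 : 0 }.pack('C*')
  else
    raise ArgumentError, "unsupported depth: #{stored.depth}"
  end
end

# Digest of the samples exactly as stored.
def raw_digest(stored)
  Digest::SHA256.hexdigest(stored.data)
end

# Digest of the samples after expansion to the canonical form.
def canonical_digest(stored)
  Digest::SHA256.hexdigest(canonical_bytes(stored))
end

pixels = [1, 0, 0, 1, 0, 1, 1, 0, 1, 1] # invented black/white data
in_memory = Stored.new(8, pixels.size, pixels.map { |p| p * 255 }.pack('C*'))
on_disk = Stored.new(1, pixels.size, [pixels.join].pack('B*'))

puts raw_digest(in_memory) == raw_digest(on_disk)             # => false
puts canonical_digest(in_memory) == canonical_digest(on_disk) # => true
```

In the commands above, "-type TrueColor -depth 8" plays the role of the canonical form: both the written file and the in-memory image end up in the same representation before the signature is computed.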

Re: Image signature changes if saved to a file then read back

Post by anthony »

bon-bon wrote: The result depends on whether the image was processed or not. Why?
ImageMagick has a flag known as 'taint' which becomes true if any pixels or metadata were modified.
If the image is not modified, IM will simply copy the file. That is part of its 'delegate' handling for external filter programs.

See Delegates
http://www.imagemagick.org/Usage/files/#delegates
Specifically...
http://www.imagemagick.org/Usage/files/#delegate_direct

You can force an image to become tainted by using -taint.

Re: Image signature changes if saved to a file then read back

Post by bon-bon »

I applied img.image_type = Magick::TrueColorType. It works! Amazing!
Drarakel, thank you so much for your help!
Anthony, thank you for your guidance!
Keep well, guys )

Code: Select all

require 'rubygems'
require 'rmagick'

puts "processing file %s"%[file = ARGV.shift]
puts "  and you've choosen %sto process thresholds"%[(process = /process/i =~ ARGV.shift ) ? "" : "NOT "]

img       = Magick::ImageList.new.read(file).first
img       = img.black_threshold(200).white_threshold(20) if process
img.image_type = Magick::TrueColorType

# saving to a file and reading back
f = "excerpt-" + file
img.write(f)
new_img = Magick::Image.read(f).first

puts "  RESULT: img.signature %s new_img.signature"%( img.signature == new_img.signature ? "==" : "!=" )
puts "  img.signature     = %s"%img.signature
puts "  new_img.signature = %s"%new_img.signature

Code: Select all

>ruby signature_test.rb signature_test.tif process
processing file signature_test.tif
  and you've choosen to process thresholds
  RESULT: img.signature == new_img.signature
  img.signature     = e9bc545d337ad96e7a149528e37bc5a2383fb9d2ac030dfac6ee96be4baa70ba
  new_img.signature = e9bc545d337ad96e7a149528e37bc5a2383fb9d2ac030dfac6ee96be4baa70ba

>ruby signature_test.rb signature_test.tif
processing file signature_test.tif
  and you've choosen NOT to process thresholds
  RESULT: img.signature == new_img.signature
  img.signature     = 134dce294ff4e02507af8a986e1040417df6bb7cb722ea292b6d6d0a17bc653f
  new_img.signature = 134dce294ff4e02507af8a986e1040417df6bb7cb722ea292b6d6d0a17bc653f