How phashcompare works?

Posted: 2015-05-08T06:00:49-07:00
by Hyperion
Hello, I ask this question because I've seen something strange when using the script. I compared some images (video frames) taken from 2 different video files, one in 1080p HD and the other downscaled and of really bad quality. Comparing the 2 matching frames at the same second, I get 15 as the comparison value. But when I compare 2 frames taken from 2 different shots (same scene but a change of camera, so same actors, same colors in the scene, etc.) I get 20, and if I compare the same 2 different frames with both taken from the HD video, I get 13! So how does the script compare the phash values? And which value can be considered "a good match"?

Re: How phashcompare works?

Posted: 2015-05-08T10:00:38-07:00
by fmw42
I have only done a limited set of tests, so I make no claim about how well it works. See http://www.fmwconcepts.com/misc_tests/p ... index.html

The idea is that the hash should be insensitive to a number of attacks on the same image and sensitive only to genuinely different images. Larger values imply different images, at least that was the idea, but the images really need to be different. The phash should not be too sensitive to changes in scale or rotation, nor to other distortions, noise, or brightness/contrast changes.

Re: How phashcompare works?

Posted: 2015-05-08T10:17:07-07:00
by Hyperion
Thank you for the answer. Do you think that calculating a simple Hamming distance between the 2 hashes would give the same comparison accuracy? I ask because the comparison script is a little slow, or at least too slow to compare a lot of hashes, as in a database.

Re: How phashcompare works?

Posted: 2015-05-08T10:20:57-07:00
by fmw42
No, you cannot use the Hamming distance. The phash creates float values that have to be compared using a sum of squared differences. phashconvert converts the floats to a string of digits. phashcompare converts the string of digits back into the floats and then computes the sum squared difference measure.
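As a rough illustration of why the two measures disagree, here is a minimal Python sketch. It is not the code of the actual scripts. The `encode` function is a made-up fixed-width encoding (4 digits per float, value times 100) used only for the demonstration; the real digit format written by phashconvert may differ. The sample values are invented as well.

```python
def sum_squared_difference(a, b):
    """Sum of squared differences between two equal-length float lists."""
    if len(a) != len(b):
        raise ValueError("hash vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def digit_hamming(s, t):
    """Number of positions at which two equal-length digit strings differ."""
    if len(s) != len(t):
        raise ValueError("digit strings must have the same length")
    return sum(c1 != c2 for c1, c2 in zip(s, t))


def encode(values):
    """HYPOTHETICAL encoding: each non-negative float as 4 digits (value * 100).

    This is an assumption for illustration, not the phashconvert format.
    """
    return "".join(f"{round(v * 100):04d}" for v in values)


# Two invented float vectors that are numerically almost identical.
a = [9.99, 2.50]
b = [10.00, 2.50]

ssd = sum_squared_difference(a, b)             # tiny: the floats nearly match
hamming = digit_hamming(encode(a), encode(b))  # large: "0999" vs "1000"

print("sum squared difference:", ssd)
print("digit Hamming distance:", hamming)
```

With this toy encoding, 9.99 and 10.00 differ in every one of their four digits even though the floats differ by only 0.01, so a digit-wise Hamming distance would report a large mismatch where the sum squared difference reports a near match.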

Use

identify -verbose -moments image

to see the actual float values computed that go into the phash computation.

http://www.imagemagick.org/script/identify.php