
Turning performance of signature hashing

Posted: 2019-05-05T10:21:22-07:00
by mbrijun
Hi,

I am trying to create signatures of around 70,000 photos, mostly JPEG and Nikon NEF. The version is "ImageMagick-7.0.8-44-Q16-x64-dll" on Windows 10. The configuration is out-of-the-box, with no tweaks to the XML configuration files. The CPU is an i7-2600 @ 3.40GHz with 8GB RAM.

It takes in the region of 2 seconds for "magick identify -format %#" to return the signature for each 6-8MP file, when run from the command line. I have tried wrapping the whole process into a Python script and using multi-threading/multi-processing, with mixed results. The best I can get is around 0.5 seconds per image on average.

Please help me understand whether there is anything I can do to improve the performance. Q16 is required as I am dealing with NEF files, so a downgrade to Q8 is not an option.

Thank you.

Re: Turning performance of signature hashing

Posted: 2019-05-05T10:46:51-07:00
by snibgo
mbrijun wrote: It takes in the region of 2 seconds for "magick identify -format %#" to return the signature for each 6-8MP file ...
I suspect you have CPU power and memory that is unused, and that using this will increase performance.

How many threads are used? The Task Manager (command-line "taskmgr") will tell you. How many threads do you have available? How long does it take when you insert "-limit thread 1" as the first part of the magick command? I suspect this isn't much longer than 2 seconds, which would mean that multi-tasking should help.

When you use multi-threading/multi-processing for 0.5 sec/image, is your CPU saturated? Does "-limit thread 1" help?
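One way to run this comparison is to time the same identify call with and without the thread limit. The following is only a minimal Python sketch, assuming "magick" is on the PATH. The helper names build_identify_cmd and time_command are made up for illustration, and putting "-limit thread 1" directly after "identify" is an assumption about where the option needs to go.

```python
import subprocess
import time


def build_identify_cmd(path, thread_limit=None, magick="magick"):
    """Return the argument list that prints an image signature via identify."""
    cmd = [magick, "identify"]
    if thread_limit is not None:
        # Assumption: the option is accepted right after the sub-command.
        cmd += ["-limit", "thread", str(thread_limit)]
    cmd += ["-format", "%#", str(path)]
    return cmd


def time_command(cmd, runner=subprocess.run):
    """Run cmd once and return (elapsed_seconds, stdout_text)."""
    start = time.perf_counter()
    result = runner(cmd, capture_output=True, text=True, check=True)
    elapsed = time.perf_counter() - start
    return elapsed, result.stdout.strip()
```

Calling time_command(build_identify_cmd("some.jpg")) and time_command(build_identify_cmd("some.jpg", thread_limit=1)) on the same file gives the two timings to compare. A single run is noisy, so repeating each a few times is advisable.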

Re: Turning performance of signature hashing

Posted: 2019-05-05T12:47:27-07:00
by mbrijun
snibgo, thanks for your prompt reply. I did some testing from the command line, with and without "-limit thread 1". The Windows Task Manager reports the thread count as 2 in *both* cases, with an average CPU usage of 12%. The directory with images was placed on an SSD drive to rule out disk I/O.

https://1drv.ms/u/s!AvJkQjHuwG2jhsV6xML9r6pL2yse5Q

Re: Turning performance of signature hashing

Posted: 2019-05-05T13:05:24-07:00
by fmw42
You will probably get better performance if you limit threads to 1 and run two processes at the same time, each on a different image. So break your list of images into two and loop over each list in its own script, with both scripts running at once.
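The split-the-list approach could look roughly like this in Python. It is only a sketch under some assumptions: "magick" is on the PATH, the "-limit thread 1" placement after "identify" is a guess, and the function names (split_evenly, hash_list, hash_in_processes) are invented for this example. Instead of two separate scripts it uses one script that starts two worker processes.

```python
import subprocess
from concurrent.futures import ProcessPoolExecutor


def split_evenly(paths, parts=2):
    """Deal the paths out round-robin into `parts` lists."""
    paths = list(paths)
    return [paths[i::parts] for i in range(parts)]


def hash_list(paths, magick="magick"):
    """Worker: hash each path in turn, limiting ImageMagick to one thread."""
    signatures = {}
    for path in paths:
        # Assumption: -limit is accepted right after the identify sub-command.
        cmd = [magick, "identify", "-limit", "thread", "1",
               "-format", "%#", str(path)]
        done = subprocess.run(cmd, capture_output=True, text=True, check=True)
        signatures[path] = done.stdout.strip()
    return signatures


def hash_in_processes(paths, parts=2):
    """Hash all paths using one worker process per sub-list."""
    merged = {}
    with ProcessPoolExecutor(max_workers=parts) as pool:
        for result in pool.map(hash_list, split_evenly(paths, parts)):
            merged.update(result)
    return merged
```

On Windows the call to hash_in_processes has to sit under an `if __name__ == "__main__":` guard, because worker processes are started by re-importing the script.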

Re: Turning performance of signature hashing

Posted: 2019-05-06T01:38:19-07:00
by mbrijun
Did some more experimenting this morning. The "-limit thread 1" does not appear to have any impact at all. The use of multi-threading in my code scales the performance in a linear fashion, as long as the number of threads in the code is less than the threads available on the CPU. Any additional threads in the code decrease the performance.
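For anyone wanting to reproduce this, here is a hedged sketch of a thread pool whose size is capped at the number of CPU threads. It is not my exact script. The names pick_worker_count, signature and hash_all are illustrative, and it assumes "magick" is on the PATH. Plain threads are enough here because each worker just waits on an external magick process.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def pick_worker_count(requested=None, available=None):
    """Clamp the requested worker count to between 1 and the CPU thread count."""
    if available is None:
        available = os.cpu_count() or 1
    if requested is None:
        return available
    return max(1, min(requested, available))


def signature(path, magick="magick"):
    """Return the signature that identify prints for one file."""
    cmd = [magick, "identify", "-format", "%#", str(path)]
    done = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return done.stdout.strip()


def hash_all(paths, workers=None, hasher=signature):
    """Map hasher over paths with a capped thread pool; returns {path: signature}."""
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=pick_worker_count(workers)) as pool:
        # pool.map yields results in input order, so zip pairs them correctly.
        return dict(zip(paths, pool.map(hasher, paths)))
```

The hasher parameter exists only so the pool logic can be tried without ImageMagick installed, for example hash_all(["a", "b"], hasher=str.upper).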

The move from IM 7.0.8-44 to 6.9.10-44 has made the same code run 4 times (!) faster.

Re: Turning performance of signature hashing

Posted: 2019-05-06T07:16:04-07:00
by snibgo
I suspect your v7 is HDRI (floating-point) but your v6 is integer, and that this makes more difference than v6 or v7 as such.

Re: Turning performance of signature hashing

Posted: 2019-05-06T07:51:40-07:00
by mbrijun
This is the name of the installer file: "ImageMagick-7.0.8-44-Q16-x64-dll.exe". This is the "recommended" download under the Windows section. I can see that the HDRI-enabled versions have "HDRI" in the installer file name.

Re: Turning performance of signature hashing

Posted: 2019-05-08T00:03:35-07:00
by mbrijun
Maybe this belongs in the "bugs" section? I can confirm that the 4x speed drop is associated only with the change in version and not with HDRI.

Re: Turning performance of signature hashing

Posted: 2019-05-09T09:58:42-07:00
by dlemstra
What is the output of convert -version and magick -version?

Re: Turning performance of signature hashing

Posted: 2019-05-10T05:10:19-07:00
by magick
The image signature was redesigned to be Q-level and byte-order invariant, meaning the signature is the same whether the ImageMagick build is Q8, Q16, or Q32, HDRI-enabled or not, and whether the architecture is LSB-first or MSB-first. This is not always the case for IMv6, and IMv7 was designed to address deficiencies we found in IMv6. To return the same signature across all these Q-levels and both kinds of endianness, we had to switch from integer to normalized double computation, and that is likely the source of the performance hit. Floating-point computations are almost always slower than integer ones.

The signature algorithm is single threaded. It runs in one thread and cannot take advantage of multiple cores on the host.
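To illustrate the idea only (this is a toy, not ImageMagick's actual code), the sketch below hashes a handful of samples two ways. The normalized variant scales each sample to the range 0..1 and packs it as a double in a fixed byte order before hashing. The raw variant hashes the integer samples directly. The function names, the sample values and the use of SHA-256 are all assumptions made for this example.

```python
import hashlib
import struct


def signature_normalized(samples, quantum_max):
    """Hash samples scaled to [0, 1] and packed as little-endian doubles."""
    digest = hashlib.sha256()
    for sample in samples:
        # A fixed "<d" layout keeps the bytes independent of host byte order.
        digest.update(struct.pack("<d", sample / quantum_max))
    return digest.hexdigest()


def signature_raw(samples, bytes_per_sample):
    """Hash the raw integer samples, for contrast."""
    digest = hashlib.sha256()
    for sample in samples:
        digest.update(sample.to_bytes(bytes_per_sample, "little"))
    return digest.hexdigest()


# The same four samples at 8-bit depth and scaled up to 16-bit depth.
pixels_q8 = [0, 17, 128, 255]
pixels_q16 = [value * 257 for value in pixels_q8]
```

Here signature_normalized(pixels_q8, 255) and signature_normalized(pixels_q16, 65535) agree, because v/255 and (v*257)/65535 are the same fraction and so round to the same double, while signature_raw(pixels_q8, 1) and signature_raw(pixels_q16, 2) differ. The per-sample division and packing also hints at why the floating-point route costs more than hashing integers directly.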

Re: Turning performance of signature hashing

Posted: 2019-05-11T08:15:37-07:00
by mbrijun
@dlemstra - the output of the version information is this:

The "faster" version (IM6):

Code:

>identify -version
Version: ImageMagick 6.9.10-44 Q16 x64 2019-05-04 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2015 ImageMagick Studio LLC
License: http://www.imagemagick.org/script/license.php
Visual C++: 180040629
Features: Cipher DPC Modules OpenMP(2.0)
Delegates (built-in): bzlib cairo flif freetype gslib heic jng jp2 jpeg lcms lqr lzma openexr pangocairo png ps raw rsvg tiff webp xml zlib

The "slower" version (IM7):

Code:

>magick -version
Version: ImageMagick 7.0.8-44 Q16 x64 2019-05-04 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2018 ImageMagick Studio LLC
License: http://www.imagemagick.org/script/license.php
Visual C++: 180040629
Features: Cipher DPC Modules OpenMP(2.0)
Delegates (built-in): bzlib cairo flif freetype gslib heic jng jp2 jpeg lcms lqr lzma openexr pangocairo png ps raw rsvg tiff webp xml zlib

@magick - I have tried both Q8 and Q16 versions of IM7 on the same image, and they seem to produce different signatures.

Output for Q16:

Code:

>magick -version
Version: ImageMagick 7.0.8-44 Q16 x64 2019-05-04 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2018 ImageMagick Studio LLC
License: http://www.imagemagick.org/script/license.php
Visual C++: 180040629
Features: Cipher DPC Modules OpenMP(2.0)
Delegates (built-in): bzlib cairo flif freetype gslib heic jng jp2 jpeg lcms lqr lzma openexr pangocairo png ps raw rsvg tiff webp xml zlib

>magick identify -format %# u:\test\test.jpg
f75a205b5ebd0e209aa53c2357cf7d3e06957b459ceaf44bee183b24b4db869f

Output for Q8:

Code:

>magick -version
Version: ImageMagick 7.0.8-44 Q8 x64 2019-05-04 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2018 ImageMagick Studio LLC
License: http://www.imagemagick.org/script/license.php
Visual C++: 180040629
Features: Cipher DPC Modules OpenMP(2.0)
Delegates (built-in): bzlib cairo flif freetype gslib heic jng jp2 jpeg lcms lqr lzma openexr pangocairo png ps raw rsvg tiff webp xml zlib

>magick identify -format %# u:\test\test.jpg
6320e9c76391e054704c67a405ba177486dfcc854aafad302fc7a638fe6c6bbe

Re: Turning performance of signature hashing

Posted: 2019-05-11T10:34:52-07:00
by magick
Thanks for the problem report. We can reproduce it and will have a patch to fix it in GIT master branch @ https://github.com/ImageMagick/ImageMagick later today. The patch will be available in the beta releases of ImageMagick @ https://www.imagemagick.org/download/beta/ by sometime tomorrow.

Re: Turning performance of signature hashing

Posted: 2019-05-12T08:49:32-07:00
by mbrijun
@magick - thank you for looking into this. I have checked the latest available beta just now, and it seems to still generate different signatures between Q8 and Q16.

Code:

>magick -version
Version: ImageMagick 7.0.8-45 Q16 x64 2019-05-11 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2018 ImageMagick Studio LLC
License: http://www.imagemagick.org/script/license.php
Visual C++: 180040629
Features: Cipher DPC Modules OpenMP(2.0)
Delegates (built-in): bzlib cairo flif freetype gslib heic jng jp2 jpeg lcms lqr lzma openexr pangocairo png ps raw rsvg tiff webp xml zlib

Re: Turning performance of signature hashing

Posted: 2019-05-12T17:04:52-07:00
by magick
Post a link to your image. We need to download it and reproduce the problem. We have unit tests against Q8, Q16, and Q16HDRI and they all return the same image signature for each image we tested.

Re: Turning performance of signature hashing

Posted: 2019-05-12T23:40:30-07:00
by mbrijun
@magick - please say "hi" to Pistachio, who greets tourists coming to Kalia Beach, a Dead Sea resort. I hope this will help unify the signature across multiple flavours of IM.

https://1drv.ms/u/s!AvJkQjHuwG2jhrF7URqE3uuH9Ik39g