[We'll look to improve algo, time permitting -Magick] [workaround: -evaluate Log] count colors takes 13 times longer

Post any defects you find in the released or beta versions of the ImageMagick software here. Include the ImageMagick version, OS, and any command-line required to reproduce the problem. Got a patch for a bug? Post it here.
Post Reply
Dabrosny
Posts: 111
Joined: 2013-10-02T10:49:39-07:00
Authentication code: 6789
Location: New York, US

[We'll look to improve algo, time permitting -Magick] [workaround: -evaluate Log] count colors takes 13 times longer

Post by Dabrosny »

In some cases it takes an extremely long time (over 3 minutes) to calculate the number of unique colors for a very small (512x768) hdri image, while in other cases of just a slightly different version of the image, it takes "just" 15 seconds.

In this case it takes "just" 15 seconds:
time magick www.pauldebevec.com_Research_HDR_memorial.hdr -evaluate Multiply 0.5 -format '%k\n' -identify null:
265903
real 0m15.526s
user 0m0.092s
sys 0m0.061s

But in this case it takes over 3 minutes:
time magick www.pauldebevec.com_Research_HDR_memorial.hdr -evaluate Multiply 0.0625 -format '%k\n' -identify null:
265903
real 3m16.052s
user 0m0.046s
sys 0m0.108s

You can obtain this famous hdri file as from www.pauldebevec.com/Research/HDR/memorial.hdr

I am running the Windows binary under Windows 7. I happen to be invoking it from the cygwin command line.

magick -version
Version: ImageMagick 7.0.6-2 Q16 x64 2017-07-22 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2015 ImageMagick Studio LLC
License: http://www.imagemagick.org/script/license.php
Visual C++: 180040629
Features: Cipher DPC HDRI Modules OpenMP
Delegates (built-in): bzlib cairo flif freetype jng jp2 jpeg lcms lqr openexr pangocairo png ps rsvg tiff webp xml zlib
Last edited by Dabrosny on 2017-08-24T06:36:59-07:00, edited 3 times in total.
-Dabrosny [Using IM7.0.6 or higher, Q16 HDRI x64 native executable, command line, often invoked from cygwin bash/sh (as of Aug. 2017)]
snibgo
Posts: 12159
Joined: 2010-01-23T23:01:33-07:00
Authentication code: 1151
Location: England, UK

Re: count unique colors (%k) takes 13 times as long on similar image

Post by snibgo »

For me, on Windows 8.1, IM v7.0.1-0 Q32 HDRI (built with Cygwin tools) takes about 6 seconds. On v6.9.3-7 takes about 3 seconds.
snibgo's IM pages: im.snibgo.com
User avatar
magick
Site Admin
Posts: 11064
Joined: 2003-05-31T11:32:55-07:00

Re: count unique colors (%k) takes 13 times as long on similar image

Post by magick »

The algorithm is slow when there are a huge number of unique colors as it iterates over the existing color list to determine if the next pixel color is unique. We should sort the pixels and use a bilinear search to reduce the search time. We'll look into improving the algorithm as time permits.
Dabrosny
Posts: 111
Joined: 2013-10-02T10:49:39-07:00
Authentication code: 6789
Location: New York, US

Re: count unique colors (%k) takes 13 times as long on similar image

Post by Dabrosny »

Thanks, @magick. Why would scaling the same small image with the same number of colors (265903) make it take 13 times as long? I'd be happy to scale it to where it is faster if I knew what scale that was :-) And @snibgo sees no such problem... could I be on the edge of some memory limit?
-Dabrosny [Using IM7.0.6 or higher, Q16 HDRI x64 native executable, command line, often invoked from cygwin bash/sh (as of Aug. 2017)]
snibgo
Posts: 12159
Joined: 2010-01-23T23:01:33-07:00
Authentication code: 1151
Location: England, UK

Re: count unique colors (%k) takes 13 times as long on similar image

Post by snibgo »

Sorry, I hadn't realised your point was that multiplying by 0.0625 make the count much slower.

Yes, that's true on the versions I tested. On v6.9.3-7, multiplying by 1, it takes 3 seconds. Multiply by 0.0625 it takes 56 seconds.

The image has a wide range, from 0 to 274*QuantumRange, with a mean of 0.3*QuantumRange. So a small number of pixels are whiter than white, higher than 100%, but they are much higher. But multiplying by 0.0625 squishes most values into a small range. (Thus into a tiny number of nodes in the ColorCube tree.)

"-auto-gamma" or "-contrast-stretch 0x0.01" both expand the squished values, but give a count 264311, which is less than the expected 265903.

Okay, here's a method. After multiplying by 0.0625 or whatever, find the mean. Calculate (0.5 / mean). This gives a factor which, when you multiply the image by this, gives a mean of 0.5. For example:

Code: Select all

magick memorial.hdr -format %k\n -evaluate multiply 0.0625 -evaluate Multiply %[fx:0.5/mean] -identify null:
This takes 4 seconds (on v7.0.1-0, Q32 HDRI) to give the expected result 265903.

[EDITED. Sorry.]
snibgo's IM pages: im.snibgo.com
Dabrosny
Posts: 111
Joined: 2013-10-02T10:49:39-07:00
Authentication code: 6789
Location: New York, US

Re: count unique colors (%k) takes 13 times as long on similar image

Post by Dabrosny »

Thanks, @snibgo, great workaround! I wonder if this idea could be incorporated into the way IM counts the colors to keep the time from exploding (without the user having to know how to do this workaround), at least until a hash table or bilinear search (as @magick mentions) could be implemented at some point.

I'm surprised that the number of colors changes with -auto-gamma (or -auto-level) when we're using floating point. Is it truly counting the number of distinct floating point values or is it quantizing them?
-Dabrosny [Using IM7.0.6 or higher, Q16 HDRI x64 native executable, command line, often invoked from cygwin bash/sh (as of Aug. 2017)]
snibgo
Posts: 12159
Joined: 2010-01-23T23:01:33-07:00
Authentication code: 1151
Location: England, UK

Re: count unique colors (%k) takes 13 times as long on similar image

Post by snibgo »

Dabrosny wrote:I'm surprised that the number of colors changes with -auto-gamma ...
I haven't investigated the cause. I suspect we get quantization of the very high values. Even Q32 HDRI has only finite precision.


IM's colour-counting uses a 3-D binary tree of cubes, starting at the root cube that compasses the range 0 to 100%, chopping the cube in the three dimensions making 2*2*2=8 nodes at each level, up to 8 levels, but sparsely populated. If there is more than one unique colour in a node, they are listed, and that's where linear searching is needed.

This works efficiently for images that have only 8 bits/channel/pixel.

It also works reasonably well for images that have values spread evenly in the range 0 to 100%.

But when, as in your case, almost all values are in the narrow range 0 to 6.25%, there will be a small number of nodes each with a long list, and we need to search one long list at every pixel.

Two solutions occur to me:

(1) Allow more than 8 levels (yours would need 14).

(2) Have each list sorted, say on intensity.

Meanwhile, I can't think of any pre-processing that would cover all circumstances. Your image has only positive values, with most values near zero and a very long thin tail of high values, so "multiply to make the mean 50%" is a sensible solution.

I think this would also work on any positive-only image, such as integral images that have values evenly spread between 25% and 33000 * QuantumRange, with a mean of 7000*QuantumRange. (I've tested this, and it works, reducing the time from 6.67 seconds to 0.58 seconds.)

It wouldn't work on images that also have negative values.

However, a more general process could be written that took any arbitrary HDRI image and manipulated it to make (say) 90% of values fall in the range 0 to 100%.
snibgo's IM pages: im.snibgo.com
Dabrosny
Posts: 111
Joined: 2013-10-02T10:49:39-07:00
Authentication code: 6789
Location: New York, US

Re: count unique colors (%k) takes 13 times as long on similar image

Post by Dabrosny »

Thanks, @snibgo,

I've also seen some images where 0.5/mean workaround doesn't do the trick.
Without a reliable way of counting colors in even a small hdri image in IM, I've found that a command like this works quite well and quickly in cygwin (and it's easy to put this into a shell script):

magick memorial.hdr -evaluate Multiply 0.0625 txt:- | awk '{print $2}' |egrep "^[(]" | time sort -u |wc |awk '{print $1}'
265903

0.53s user 0.10s system 7.80s elapsed

I realize this isn't a universal workaround across IM platforms, but it's the best thing I've got for now. :-)
-Dabrosny [Using IM7.0.6 or higher, Q16 HDRI x64 native executable, command line, often invoked from cygwin bash/sh (as of Aug. 2017)]
Dabrosny
Posts: 111
Joined: 2013-10-02T10:49:39-07:00
Authentication code: 6789
Location: New York, US

Re: [We'll look to improve algo, time permitting, sez Magick] count unique colors (%k) takes 13 times as long on similar

Post by Dabrosny »

snibgo wrote: 2017-07-24T08:54:21-07:00 Meanwhile, I can't think of any pre-processing that would cover all circumstances. Your image has only positive values, with most values near zero and a very long thin tail of high values, so "multiply to make the mean 50%" is a sensible solution.
An hdri image is typically well-spread-out in logarithmic space across several orders of magnitude, and thus on the nonlogarithmic scale it's bunched up down low with a long tail up high.

So, using -evaluate Log seems to work better for me as a workaround to speed up the counting, almost always without affecting the result.
For example:

Code: Select all

magick myimage.hdr -auto-level -evaluate Log 2000 -format %k\n info:
The -auto-level prevents negative values; it may scale down the values by quite a bit for an hdri image, concentrating them even closer to zero, but the logarithm then spreads them out adequately well across a big enough portion of the 0-1 range.

As for the value of 2000, I've actually had success using just about any value between around 100 and 10,000.
-Dabrosny [Using IM7.0.6 or higher, Q16 HDRI x64 native executable, command line, often invoked from cygwin bash/sh (as of Aug. 2017)]
snibgo
Posts: 12159
Joined: 2010-01-23T23:01:33-07:00
Authentication code: 1151
Location: England, UK

Re: [We'll look to improve algo, time permitting -Magick] [workaround: -evaluate Log] count colors takes 13 times as lon

Post by snibgo »

Excellent. Thanks for the update.
snibgo's IM pages: im.snibgo.com
Post Reply