histogram analysis and image manipulation

Post by **seosamh** » 2014-07-08T08:13:20-07:00

Hi All,

I wonder if you can help confirm something for me. We have a website that uses ImageMagick to automatically level, brighten and colour-correct images to be used for print, but it's incredibly slow. The person who set this up has now left, so it's down to me to figure out how to get the site running faster (as far as the image processing is concerned).

Now I've been speaking to the developers to find out what exactly is happening, and I've been given the response below:

The process is as follows; it is implemented in Python unless otherwise stated.

1. The histogram is generated using the following imagemagick command:
convert -shave 3% -sample 100 -format %c rgb.jpg histogram:info:-

2. You'll see that this produces output in the format:
"3: (121,121,145, 96) #79799160 cmyk(121,121,145,96)"

The first 5 integers are parsed from this, i.e.
"3 121 121 145 96"

Where 3 is the pixel count of occurrences of CMYK (121,121,145,96).
These parsed integers are used to create histograms of each colour.

3. The histogram for each colour is sorted in ascending value of intensity.

4. A threshold value is generated for the image using the total number of pixels multiplied by 0.4.

4.1. We iterate over the sorted histogram of a colour adding the number of pixels of each intensity to a running total.
4.2. When the running total is greater than the threshold value calculated at (4) a cutoff value (c_v) for this colour is calculated using the current intensity divided by 256.

5. The cutoff value for each colour is used to generate the clipping formulae for ImageMagick, e.g. (1/c_v)*u+1-(1/c_v). These are the values you see in the commands I sent over.
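
Steps 2 to 5 above could be sketched in Python roughly as follows. This is only a minimal reading of the developers' description, not their actual code: the function names, the regular expression and the three-decimal formatting of c_v are assumptions made for illustration.

```python
import re
from collections import defaultdict

# Matches the leading "count: (c, m, y, k)" part of one histogram line.
LINE_RE = re.compile(
    r"^\s*(\d+):\s*\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)"
)
CHANNELS = ("C", "M", "Y", "K")


def channel_histograms(text):
    """Step 2: return {channel: {intensity: pixel_count}} from the histogram text."""
    hists = {ch: defaultdict(int) for ch in CHANNELS}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # skip blank or unrecognised lines
        count, *values = (int(g) for g in m.groups())
        for ch, value in zip(CHANNELS, values):
            hists[ch][value] += count
    return hists


def cutoff(hist, fraction=0.4):
    """Steps 3 to 4.2: walk intensities in ascending order and stop as soon
    as the running total exceeds fraction * total pixels."""
    threshold = fraction * sum(hist.values())
    running = 0
    for intensity in sorted(hist):
        running += hist[intensity]
        if running > threshold:
            return intensity / 256
    return 1.0  # only reached when the histogram is empty


def clip_formula(c_v):
    """Step 5: build the -fx expression for one channel."""
    if c_v <= 0:
        raise ValueError("cutoff must be positive, otherwise 1/c_v is undefined")
    return f"(1/{c_v:.3f})*u+1-(1/{c_v:.3f})"


sample = "3: (121,121,145, 96) #79799160 cmyk(121,121,145,96)"
hists = channel_histograms(sample)
print(clip_formula(cutoff(hists["C"])))
```

Note that a cutoff of 0 (the threshold being crossed at intensity 0) would make the formula divide by zero, which is why the sketch rejects it; the description above does not say how the real script handles that case.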

Now the values above get passed into this command, which does the image manipulation:

The command generated using the CMYK histograms:
convert -channel Y -fx "(1/0.965)*u+1-(1/0.965)" -channel C -fx "(1/0.598)*u+1-(1/0.598)" -channel M -fx "(1/0.793)*u+1-(1/0.793)" -channel K -fx "(1/0.988)*u+1-(1/0.988)" ./cmyk.jpg ./cmyk_adjusted.jpg
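
For illustration, a command like the one above could be assembled from the per-channel cutoff values along these lines. This is a hedged sketch: `build_adjust_command` is a hypothetical helper, not the site's code, and it places the input file directly after `convert`, which differs from the command as quoted.

```python
import shlex


def build_adjust_command(cutoffs, src, dst):
    """Return an argument list applying (1/c_v)*u+1-(1/c_v) to each channel.

    cutoffs maps a channel letter ("C", "M", "Y", "K") to its cutoff c_v.
    """
    args = ["convert", src]  # assumption: input file first, then the operators
    for channel, c_v in cutoffs.items():
        if not 0 < c_v <= 1:
            raise ValueError(f"cutoff for {channel} must be in (0, 1], got {c_v}")
        args += ["-channel", channel, "-fx", f"(1/{c_v})*u+1-(1/{c_v})"]
    args.append(dst)
    return args


cmd = build_adjust_command(
    {"Y": 0.965, "C": 0.598, "M": 0.793, "K": 0.988},
    "./cmyk.jpg",
    "./cmyk_adjusted.jpg",
)
print(shlex.join(cmd))  # shell-quoted form, suitable for logging
```

Passing the list to `subprocess.run(cmd, check=True)` avoids shell quoting problems with the parentheses and asterisks in the -fx expressions.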

Now I've tried both commands above on my desktop; the second seems reasonably fast.

But the first, this one:

convert -shave 3% -sample 100 -format %c rgb.jpg histogram:info:-

takes absolutely ages: it can vary between 6 and 28 seconds to generate all the values (it spits out thousands of CMYK values).

So my instinct is that this is where the bottleneck is happening. Does that sound about right?
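
One way to check that instinct is to time each command separately from the Python side. The sketch below is an assumption about how that might be done: `time_command` is a hypothetical helper, and `HISTOGRAM_CMD` simply repeats the command quoted above as an argument list.

```python
import subprocess
import time

# The histogram command from this post, split into arguments.
HISTOGRAM_CMD = [
    "convert", "-shave", "3%", "-sample", "100",
    "-format", "%c", "rgb.jpg", "histogram:info:-",
]


def time_command(args):
    """Run a command, discard its output and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(
        args,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=True,  # raise if the command fails, so a failure is not timed as success
    )
    return time.perf_counter() - start


# Example use (requires ImageMagick and rgb.jpg to be present):
#     print(f"histogram: {time_command(HISTOGRAM_CMD):.2f} s")
```

Timing the histogram command, the Python parsing step and the -fx command separately would show which of the three dominates.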

Now, I've a few questions. Is there any way to speed up the histogram analysis to be able to generate the numbers to pass into the second command?

Does the above process sound about the correct way to be doing things, or should I be looking at something else?

The above happens when a user selects an image: the process runs there and then. Unfortunately I think this will need to stay, as we handle loads of images but only a small percentage of them get used in generating print.

Sorry if my message above seems a bit confusing, but I'm a complete beginner when it comes to ImageMagick. I can run simple commands through a command prompt, but that's about it.

I guess what I'm really asking for above is:

a. Is there a faster way to do the histogram analysis?

b. If not, are there other processes I should be looking at that will do the same as the above, but run faster? The ultimate goal here is to speed up the end-user experience on the system. (I know the above isn't perfect from an image manipulation standpoint, but considering the crap that gets fed into the system, it does help stave off the worst of our problems.)

Sorry all for the long, vague post, but if you have any thoughts on the above I'd be delighted to hear them.

Thanks

Joe

Post by **fmw42** » 2014-07-08T09:29:59-07:00

What version of IM are you using? What platform?

How large is your input image? I am surprised that you say it takes too long to generate a histogram. Did you add -depth 8 to speed up processing of 16-bit images? It is likely your processing of the histogram that takes a long time.

Perhaps just use -contrast-stretch to stretch at specified cumulative counts.

This may be equivalent to what you appear to be doing, using 40% from each end of the histogram for each channel:

convert image.jpg -channel cmyk -contrast-stretch 40,40% result.png

-channel cmyk causes -contrast-stretch to process each channel separately.

see
http://www.imagemagick.org/script/comma ... st-stretch

Note with IM 6, the input image should properly come right after convert. see http://www.imagemagick.org/Usage/basics/#why

If on Linux, Mac OSX or Windows with Cygwin, you could look at my script cumhist at the link below, though likely limited to rgb images. I use AWK to generate the cumulative histogram rather quickly in this script. Also see my script, omnistretch, but here again it is likely limited to rgb images.

Post by **derHase** » 2015-02-19T04:57:03-07:00

seosamh wrote: ...
3. The histogram for each colour is sorted in ascending value of intensity.

4. A threshold value is generated for the image using the total number of pixels multiplied by 0.4.

4.1. We iterate over the sorted histogram of a colour adding the number of pixels of each intensity to a running total.
4.2. When the running total is greater than the threshold value calculated at (4) a cutoff value (c_v) for this colour is calculated using the current intensity divided by 256.

5. The cutoff value for each colour is used to generate the clipping formulae for imagemagick e.g. (1/c_v)*u+1-(1/c_v) These are the values you see in the commands I sent over.
...

Hi Joe

I've tried to understand how your script proceeds with the values in 4.1. Let's
start again in 3, that's clear to me: the histogram is sorted for each colour
with ascending intensity, so let's say for cyan there are 23 pixels with intensity 245
and another 56 pixels again with intensity 245 (since in the histogram the other
colour values differ). OK, 4 is telling me that you build a threshold value as
40 percent of your total number of pixels in your picture.
And now I'm lost (4.1/4.2): Your script then iterates over the histogram
of each colour, adding the number of pixels of each intensity to a running total. But
when I do that - adding the number of pixels all into one running total - I'll get the
total number of pixels of the picture, which is always above the threshold value.
Then I thought - well, maybe the running total should be weighted, so e.g. I'll
multiply the number of pixels of each intensity with the intensity (which is something
of 0..255) and afterwards I'll divide this sum by 256 to get an average value which
might be below the threshold or above it. But that's not what your script does,
right?

Please explain what your script does at point 4.1/4.2, I'll really try to
understand it. My current problem is that I have to get image information of a bigger
picture then use a cutting tool to separate parts of the picture (getting about 8 little
pieces) and to use the image information of the big picture to be able to handle all
little pieces in the same manner (e.g. using your calculated colour clipping values).

Have you tried different kinds of jpegs? E.g. is there a difference between the
"progressive" implementation of the jpeg and the common one?

Regards,
Martin
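
One possible reading of steps 4.1 and 4.2, which would address the question above, is that the iteration stops at the first intensity where the running total exceeds the threshold, so the total never reaches the full pixel count. The sketch below assumes that reading; `first_crossing` and the sample counts (apart from the two cyan entries at intensity 245) are made up for illustration.

```python
def first_crossing(entries, threshold):
    """Return (intensity, running_total) at the first intensity where the
    running pixel count exceeds threshold, or None if it never does.

    entries is an iterable of (intensity, pixel_count) pairs for one colour;
    the same intensity may appear more than once.
    """
    running = 0
    for intensity, count in sorted(entries):  # ascending intensity
        running += count
        if running > threshold:
            return intensity, running  # stop here; later entries are never added
    return None


# Cyan entries: the two at intensity 245 come from the post, the rest are invented.
cyan = [(245, 23), (10, 30), (245, 56), (120, 25)]
total = sum(count for _, count in cyan)
result = first_crossing(cyan, 0.4 * total)
if result is not None:
    intensity, running = result
    print(intensity, running, intensity / 256)
```

Under this reading the cutoff is the intensity at the 40% point of the cumulative histogram, and the entries at 245 are never reached in the example.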

Post by **fmw42** » 2015-02-19T14:20:20-07:00

Although this probably won't help in IM 6, the proper syntax is to have the input image right after convert, not near the end. In IM 7 it will likely fail.

Again, how big is your input image in width and height (not filesize), since IM will decompress any jpg to full 24bit color.

What version of IM are you using and what platform? If your IM version is old, you should upgrade.

You can use jpg hints to read the jpg faster. see http://www.imagemagick.org/Usage/formats/#jpg_read

If you provide your input jpg to dropbox.com and put the URL here, we can test the speed of your command line.

If you have multiple core processors, then you may get speedup by enabling OpenMP, though by default, I think it is enabled.

Some Linux systems, on certain functions, may run faster with only 1 thread, if you have OpenMP enabled. But I do not recall which ones. But it is easy to try, by using -limit thread 1 at the beginning of the command line right after convert.

Other ways to speed up your processing would be to use Q8 compiled IM rather than the default Q16. It uses less memory than Q16.
