
Sorting images by colors via file name

Posted: 2016-05-29T09:43:44-07:00
by abduct
Hi, I was wondering if anyone has built a script (bash, ruby, python) which can organize images based on their colors and numerically (or other) order them via renaming the files so I can visually compare them for duplicates within a file explorer.

All these images are 1280x600 or larger, and I would like to combine their colors into a score of some kind so that similar images end up grouped together when I view them sorted by file name after renaming.

Is there some kind of color hash which if used as a file name would order all the images with like colors one after another? Or another way to do this?

The last script I ran apparently took over 5 hours because it resized my images and compared a single image to all the rest looking for duplicates, then moved on to the next image and compared it to all the others one by one. I imagine just generating a hash and renaming the files would be much faster, so I can visually check for duplicates based on their colors.

Thanks.

Re: Sorting images by colors via file name

Posted: 2016-05-29T10:03:26-07:00
by snibgo
If you are looking for exact duplicates, IM's built-in hash should give that. See http://www.imagemagick.org/script/escape.php
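
As a rough sketch of the exact-duplicate check: the `%#` escape documented on that page prints a hash of the pixel values, so files with identical pixels print the same hash. The helper name `list_exact_duplicates` is made up for illustration, and the sketch assumes file names that contain no newlines.

```shell
#!/bin/bash
# Read "hash filename" lines on stdin and print only the lines whose hash
# occurs more than once, grouped together by hash.
list_exact_duplicates() {
  sort | awk '
    { hash = $1 }
    hash == prev { if (!shown) print prevline; print; shown = 1 }
    hash != prev { shown = 0 }
    { prev = hash; prevline = $0 }
  '
}

# Typical use (needs ImageMagick; %# is the signature, %f the file name):
#   identify -format '%# %f\n' *.jpg *.png | list_exact_duplicates
```

This only catches images whose pixels are identical; a cropped or recompressed copy gets a different hash.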

If you are looking for near-duplicates, the task is more complex. "-features" and "-moments" may both be useful. (Beware that "-features" eats memory. I suggest images are first reduced to no more than 256 colours.)

A solution really depends on what you regard as "similar". For example, one image might be a resized crop of another image, perhaps also with tonal or colour-balance adjustment. If you want to find these, I don't know an easy method.

Re: Sorting images by colors via file name

Posted: 2016-05-29T10:30:48-07:00
by abduct
Well these images have been cropped and altered in minor ways so exact duplicate finding will not be possible.

I am looking more for a way to summarize the colors used in an image, so that images with close color palettes are named similarly.

Is there a way to sum all the colors of an image into a single number or the like? I don't need IM to find the duplicates itself, just a way to organize each image in the directory from one end of the color spectrum to the other. For instance, all the reddish images would be grouped together, then the bluish, then the greenish, with the grouping happening because the file explorer orders the images by file name (alphabetically).

Edit: I found something like this, which might be more useful for what I am trying to do:

Code: Select all

convert image1.jpg -scale 1x1\! txt:-
# ImageMagick pixel enumeration: 1,1,255,srgb
0,0: (65,87,117)  #415775  srgb(65,87,117)

convert image2.jpg -scale 1x1\! txt:-
# ImageMagick pixel enumeration: 1,1,255,srgb
0,0: (73,96,121)  #496079  srgb(73,96,121)

Then maybe sum the RGB values of each image, resulting in 269 and 290. With these sums as file names, the two similarly colored images would be listed side by side in the file explorer because of the alphabetical listing.

Is there a way to do this more accurately, with fewer duplicate sums than reducing to a single pixel gives? (I can imagine that with lots of images of similar colors I would hit a collision more often than not.) Or should I just append a random string to the end of the calculated sum if a duplicate is found?

Re: Sorting images by colors via file name

Posted: 2016-05-29T11:06:58-07:00
by snibgo
Summing R+G+B seems a lousy hash for comparisons, as 255+0+0 = 0+0+255, so a pure red image would be "close to" a pure blue image.

You might simplify the image before calculating the hash. There are two main methods: "-colors N" which finds the N best colours, but they could be anything, and "-posterize N" which changes each pixel to one from a fixed palette of N^3 colours.

Then you can find the most common colour within the image.
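
A minimal sketch of that last step, assuming the usual `histogram:info:` line layout of `count: (r,g,b) #RRGGBB name`. The function name `most_common_colour` and the choice of `-posterize 4` in the usage comment are illustrative only.

```shell
#!/bin/bash
# Read ImageMagick histogram lines ("count: (r,g,b) #RRGGBB name") on stdin
# and print the hex value of the colour with the highest pixel count.
most_common_colour() {
  sort -rn | head -n 1 | grep -o '#[0-9A-Fa-f]*' | head -n 1
}

# Typical use (needs ImageMagick):
#   convert image1.jpg -posterize 4 -format '%c' histogram:info:- | most_common_colour
```

Because "-posterize" maps every image onto the same fixed palette, the resulting hex values are comparable between images, which is not guaranteed with "-colors".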

Perhaps you care only about the hue, and don't care about saturation or lightness. If so, convert to HCL or similar colorspace.
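
To illustrate why a hue-based key behaves better than R+G+B, here is a small sketch that turns an RGB triplet into a zero-padded hue angle. It uses the plain HSL hue formula computed in awk as a stand-in for a real HCL conversion, which is an assumption made to keep the example self-contained; the function name `rgb_to_hue_key` and the "gry" label for hueless greys are also invented here.

```shell
#!/bin/bash
# Print the HSL hue (0-359 degrees, zero-padded to three digits) of an RGB
# triplet given as three 0-255 integers. Greys have no hue; they get "gry".
rgb_to_hue_key() {
  awk -v r="$1" -v g="$2" -v b="$3" 'BEGIN {
    max = r; if (g > max) max = g; if (b > max) max = b
    min = r; if (g < min) min = g; if (b < min) min = b
    d = max - min
    if (d == 0) { print "gry"; exit }
    if (max == r)      h = 60 * ((g - b) / d)
    else if (max == g) h = 60 * ((b - r) / d + 2)
    else               h = 60 * ((r - g) / d + 4)
    if (h < 0) h += 360
    printf "%03d\n", int(h + 0.5) % 360
  }'
}

rgb_to_hue_key 255 0 0     # pure red  -> 000
rgb_to_hue_key 0 0 255     # pure blue -> 240
rgb_to_hue_key 65 87 117   # image1.jpg average from earlier in the thread -> 215
rgb_to_hue_key 73 96 121   # image2.jpg average from earlier in the thread -> 211
```

Pure red and pure blue now land 240 degrees apart, while the two similar images from the earlier post land within a few degrees of each other. Note that hue wraps around, so reds just below 360 sort at the opposite end of the listing from reds just above 0.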

For more focussed guidance, you could give examples of images you consider similar and different.

Re: Sorting images by colors via file name

Posted: 2016-05-29T11:15:54-07:00
by abduct
Here are some images:
http://i.imgur.com/0c7h0r6.jpg
http://i.imgur.com/Lha5QyD.jpg
http://i.imgur.com/21ws8NG.jpg
http://i.imgur.com/dBJy1T0.jpg
http://i.imgur.com/X3VPv9m.jpg
http://i.imgur.com/5vejG2S.jpg
http://i.imgur.com/2osMWn4.jpg

They should be ordered something like:
http://i.imgur.com/dBJy1T0.jpg
http://i.imgur.com/5vejG2S.jpg
http://i.imgur.com/Lha5QyD.jpg
http://i.imgur.com/X3VPv9m.jpg
http://i.imgur.com/21ws8NG.jpg
http://i.imgur.com/0c7h0r6.jpg
http://i.imgur.com/2osMWn4.jpg

The order within a color (by brightness or saturation) does not matter much, as long as the reddish, orangish, bluish, etc. images are grouped together.

I don't care about the brightness or saturation, so yes, hue might be what I am after. How would I convert an image to the HCL color space, then maybe posterize it and calculate some kind of color sum to be used as the file name?

Re: Sorting images by colors via file name

Posted: 2016-05-29T11:31:36-07:00
by fmw42
Please always provide your IM version and platform, since syntax may differ

Re: Sorting images by colors via file name

Posted: 2016-05-29T11:33:20-07:00
by abduct
Linux, Version: ImageMagick 6.9.0-3 Q16 x86_64 2015-04-28 http://www.imagemagick.org

Re: Sorting images by colors via file name

Posted: 2016-06-03T19:35:09-07:00
by abduct
I finally got around to making a quick script for this.

This doesn't organize the images by color very well (some segments are grouped fine, others not so much), but because of how I hashed the RGB values (concatenating them together), similar or duplicate images get the same or very similar color values, so they end up almost or exactly side by side when viewed in a file explorer.
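
One possible contributor to the uneven grouping, offered as a guess rather than a diagnosis: concatenating the channel values without padding makes different triplets collide and gives names of different lengths, which a name sort does not order by colour. The sketch below compares the two forms; the function names are hypothetical.

```shell
#!/bin/bash
# Build a sortable key from an RGB triplet by zero-padding each channel.
# Arguments are plain decimal integers without leading zeros.
padded_key()   { printf '%03d%03d%03d\n' "$1" "$2" "$3"; }

# The unpadded concatenation, shown for comparison.
unpadded_key() { printf '%d%d%d\n' "$1" "$2" "$3"; }

unpadded_key 1 23 4   # 1234
unpadded_key 12 3 4   # 1234 (a different colour, same name)
padded_key 1 23 4     # 001023004
padded_key 12 3 4     # 012003004
```

Padded keys still sort by red first, then green, then blue, so they group by colour only loosely.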

The script goes through the files in the directory it is run in, first the PNGs, then the JPGs. For each file it runs convert, concatenates the RGB values, and tests whether a file with that name already exists. If it does, the script keeps looping, appending a random number between 1 and 10 each time, until an unused file name is found (this range should be increased if you have lots of images). Once it has a free name, it simply moves the file to rename it.
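
As an alternative to the random suffix, a counter gives the same collision handling with predictable names. This is only a sketch; the function name `unique_name` and the `_1`, `_2` suffix scheme are assumptions, not part of the script described above.

```shell
#!/bin/bash
# Print the first unused name among base.ext, base_1.ext, base_2.ext, ...
# in the current directory. The extension is passed with its leading dot.
unique_name() {
  local base="$1" ext="$2" n=0
  local candidate="$base$ext"
  while [ -e "$candidate" ]
  do
    n=$((n + 1))
    candidate="${base}_${n}${ext}"
  done
  printf '%s\n' "$candidate"
}

# Typical use inside a rename loop:
#   mv -- "$file" "$(unique_name "$filename" "$extension")"
```

With a counter, images that share a colour value stay next to each other in the listing, in the order they were processed.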

Here is an example of what it looks like in a file explorer. As you can see, it found a duplicate image, and all the colors (for this segment at least) are more or less organized:

Image

Just be aware that this tool is very crude and doesn't organize all the images perfectly.

Code: Select all

#!/bin/bash
# Rename every PNG and JPG in the current directory to its average colour,
# written as the concatenated R, G and B values of the image scaled to 1x1.
shopt -s nullglob   # skip a pattern entirely when nothing matches it

for file in *.png *.jpg
do
  extension=".${file##*.}"

  # The last line of the txt: output looks like
  # "0,0: (65,87,117)  #415775  srgb(65,87,117)"; take the text between the
  # first pair of parentheses and drop the commas.
  filename=$(convert "$file" -scale '1x1!' txt:- | tail -n 1 | awk -F'[()]' '{print $2}' | tr -d ', ')

  # On a name collision, keep appending a random number from 1 to 10
  # until the name is free.
  while [ -f "$filename$extension" ]
  do
    filename=$filename$((RANDOM % 10 + 1))
  done

  mv -- "$file" "$filename$extension"
done

If anyone has any suggestions on how to do this better, please let me know!

Re: Sorting images by colors via file name

Posted: 2018-03-29T01:52:00-07:00
by blinkybagger
(tumbleweeds)
Was looking at your script, which I ran, but I'm a bit unsure of what the expected inputs were and what the default view of your explorer window might be. ls -ltr tends to be the unthinking way that I review the contents of a directory, though file globbing might be something else which may prompt file touching routines, or renaming of files using file counters.

I'd been thinking about using a file name prefix containing RGB triplets, so as to retain the original file name which often contains clues about how a given file has been processed and indeed the file subject, though perhaps this is naive and the directory should do that (but don't want to have to tell the Sys Admins how we need to structure anything). The filename prefix logic being akin to *nix file permissions, though with 255 values rather than 7 (um 8) values...and it might take some familiarity with AWK to utilize the outputs (such as iterating over a collection of images looking for the next image in sequence using a step size of whatever).

This is just thinking aloud, before I try scripting something.
Thanks for documenting and sharing what you did!

Re: Sorting images by colors via file name

Posted: 2018-03-29T04:17:47-07:00
by blinkybagger
So, having changed the script to write zero-padded RGB values and to keep the pre-existing file name in the output file name, I now have to determine whether this is of any use to me.

Code: Select all

#!/bin/bash
# Copy every PNG and JPG to a new name prefixed with its zero-padded average
# RGB colour, keeping the original name so its clues about the file survive.
shopt -s nullglob   # skip a pattern entirely when nothing matches it
und="_"
for file in *.png *.jpg
do
  # The histogram of the 1x1 image is a single line such as
  # "1: ( 65, 87,117) #415775 srgb(65,87,117)"; split it on "(", ")" and ",".
  filename_prefix=$(convert "$file" -scale 1x1 -format "%c" histogram:info: | tail -n 1 | awk -F'[(),]' '{printf "%03d,%03d,%03d\n", $2, $3, $4}')
  filename_suffix="${file%.*}"   # original name without its extension
  extension=".${file##*.}"
  cp -- "$file" "$filename_prefix$und$filename_suffix$extension"
done